September 15, 2010 at 11:34 am
Hi,
This question is from my friend. Every day, millions of row need to be inserted into a couple of tables and then process, and then delete/truncate tables, finally do a backup, like db.bak
But when he open backup file using Notepad, the data truncated still are inside backup files.
His question, what can he do to remove records from backup?
BTW, I tested and the records which I have truncated from table are really inside backup files.
Thank you!
September 15, 2010 at 12:42 pm
I'm... not sure. But, hazarding a guess, it's got to do with Truncate simply doing page deallocations instead of actually wiping data. Now, why backup is actually picking up and storing deallocated pages is beyond me.
You might try a shrink prior to the backup, that should actually clear the deallocated pages.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
September 15, 2010 at 12:44 pm
I am quite surprised. I've never heard of anyone who could open a backup file and read it in notepad. Where did you learn how to trace the page allocations? How did you determine that the rows were still present? It could be that the space on the page you're looking at contains data, but the space itself has been marked as reusable so the data isn't actually accessible through DML.
The follow-on question would be 'Why do you care, if you can't get to them through the DB engine'? Is it the space you're worried about?
September 15, 2010 at 12:47 pm
Only a restore of the .bak file will tell you whats contained in .bak file, opening it in notepad is just not relevant.
---------------------------------------------------------------------
September 15, 2010 at 12:49 pm
David Webb-200187 (9/15/2010)
I am quite surprised. I've never heard of anyone who could open a backup file and read it in notepad. Where did you learn how to trace the page allocations? How did you determine that the rows were still present? It could be that the space on the page you're looking at contains data, but the space itself has been marked as reusable so the data isn't actually accessible through DML.The follow-on question would be 'Why do you care, if you can't get to them through the DB engine'? Is it the space you're worried about?
Thank you for your reply.
I inserted 10000 records with the same special value "Gaighghghghgh" into a test table and then truncate, backup the database. open backup .bak, found "Gaighghghghgh".
I don't care this question, just want to ask why.
September 15, 2010 at 12:52 pm
Craig Farrell (9/15/2010)
You might try a shrink prior to the backup, that should actually clear the deallocated pages.
I am going to try to test shrink the file. let you know the result.
September 15, 2010 at 12:58 pm
It's very probable that what you are seeing is data left on a page which has been marked as re-usable, but is part of a larger allocation unit (extent) which cannot yet be deallocated.
Another reason to encrypt your backups.
September 15, 2010 at 1:03 pm
I would hazard an educated guess that the pages are merely marked for reuse, not actually overwritten with binary zeros, which would obviously really slow up any delete operations.
I think that data would only be removed if more than an extent was affected (64k). How big was this table with data in it, I would think 10000 X 'Gaighghghghgh' takes up less than 64k?
do some testing for us, get the table size, delete instead of truncate, drop the table.
all the above is conjecture on my part.
---------------------------------------------------------------------
September 15, 2010 at 1:05 pm
looks like me and david agree 🙂
---------------------------------------------------------------------
September 15, 2010 at 1:07 pm
I just tested that insert into 1000000 records to test table (test1) and insert 100 records into another table (Test2).
Then truncate both. backup database.
From Backup file,Test1 was less than 1000000 records left and there is no records in Test2
September 15, 2010 at 1:17 pm
Delete table, even though drop table, part of records are still in that page(extent).
From testing, does that mean "encrypt bak" is the only way to hide that part of records?
September 15, 2010 at 3:00 pm
I would say that is the only way to be 100% sure yes.
a backup file can be password protected but i don't know the effect of that on opening through notepad
---------------------------------------------------------------------
September 15, 2010 at 9:19 pm
The PASSWORD option in the BACKUP command simply requires that you provide a password during the restore process. It does nothing to obfuscate the contents of the backed up data.
SQL BAK Explorer - read SQL Server backup file details without SQL Server.
Supports backup files created with SQL Server 2005 up to SQL Server 2017.
September 16, 2010 at 10:07 am
Ray Mond (9/15/2010)
The PASSWORD option in the BACKUP command simply requires that you provide a password during the restore process. It does nothing to obfuscate the contents of the backed up data.
I know but what happens if you try and open such a backup in notepad, presumably you can but have not tried it
---------------------------------------------------------------------
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply