April 10, 2009 at 7:54 am
I am trying to put together an archive process for a poorly designed third-party app which has no provisions for pruning the DB.
The server is version 2039, and the DB is set to Simple recovery. The log starts out at about 37M and the dat afile is about 4G.
Out of concern for the log size, and my plan to delete over 2 million records, I put transactions in my script to form batches, per an idea I found on http://www.tek-tips.com/faqs.cfm?fid=3141.
My script, for one of the tables:
Set RowCount 1000
Go
Declare @rc Int
Declare @Total Int
--/*
Set @rc = 1000
Set @Total = 0
While @rc = 1000
Begin
Begin Tran
Insert Into att_evtARCHIVE (
id,
persid,
last_mod_dt,
obj_id,
event_tmpl,
start_time,
wait_time,
fire_time,
first_fire_time,
cancel_time,
num_fire,
status_flag,
user_smag,
group_name)
Select
id,
persid,
last_mod_dt,
obj_id,
event_tmpl,
start_time,
wait_time,
fire_time,
first_fire_time,
cancel_time,
num_fire,
status_flag,
user_smag,
group_name
From AHD.att_evt
Where DateAdd(ss, last_mod_dt, '12/30/1969 20:00:00') < '3-1-2009'
Delete A From AHD.att_evt A Inner Join att_evtARCHIVE AR On A.ID = AR.ID
Commit Tran
Set @rc = @@RowCount
Set @Total = @Total + @@RowCount
Print @Total
End
The intended goal is to continuously delete 1000 records at a time (this may change to 10,000 or 100,000 if it ever works as it should) and make sure the change is committed so that the log can truncate like it should, or be backed up, to control log growth, until all of the intended records have been deleted.
When this script is run I find that while it does do the deletes in batches of 1000, the log continues to grow and does not truncate on checkpoint as it should. In my efforts, I stop the script from running and then issue a "Commit Tran" by itself to make sure I have not left any transaction open. Even then, I can't get the log to truncate even using "Backup Log DBName With Truncate_Only".
So far, I have been unable to get the log to truncate either automatically or manually, I have had the SPID running the script run away, in that even though I stopped it, supposedly successfully, in query analyzer, I still saw it show up in the Activity Monitor, and I was not able to kill it there, so I even restarted SQL Server. When I restarted SQL Server the DB went into recovery mode and stayed there for far too long, especially since all of the transactions were supposed to be committed already.
Fortunately I am working on this process in Dev, so I just restored another copy of Prod onto the Dev server to start again.
Note:, at this point I have found another way to accomplish the goal, copying the record I want to keep to another table, truncating the original table, then copying the records I want to keep back to the original table. The Full backup will be kept in case we need the deleted records. The ongoing archive process will never be deleting or copying this volume of records so I can write it a bit differently.
So now I am trying to learn what went wrong with my original idea, the idea I plan to use for the ongoing archival process.
Am I misunderstanding something about how the transactions work?
Why doesn't the tran log truncate, either manually or automatically?
Thanks,
Chris
Learning something new on every visit to SSC. Hoping to pass it on to someone else.
April 10, 2009 at 12:20 pm
When the database is in simple recovery model, the transaction log won't be truncated until a checkpoint occurs. So, just issuing the commit transaction will not truncate the log.
Add a checkpoint after the commit and it should work the way you are expecting.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
April 10, 2009 at 1:27 pm
I did the checkpoint thing also, several times. It was so strange. The Tlog got over 4G and I abandoned the attempt, proceeding with a restore, since it would not recover.
Today I have been running it again, to figure out what's going on, armed with DBCC LogInfo and DBCC OpenTran. Today the tlog has gotten up to 461M, but has not grown any, even after running this script for over an hour. The Datafile has grown to 5.6G, but has stopped growing also.
Today it seems like it's working as it should. I am using the same DB backup I have been using, just restoring it again for the next attempt, as I work out the intricacies of this process.
The darn thing is taking forever, with deleting 10K rows at a time, but the log seems stable now.
I may try 100K on the next run to see how bad it affects the log.
Thanks,
Chris
Learning something new on every visit to SSC. Hoping to pass it on to someone else.
April 15, 2009 at 10:30 am
So I'm continuing to play with this archive process, copying a whole lot of records from the source table to a "ARCHIVE" suffixed table of the same name. All indexes are identical between the two tables. In the case of my script I am moving records from two tables now, to the tune of 2.3M and 800K+ records, and trying to do it with a single run of the script.
Last night, with the DB set to Simple recovery, and the batch size set to 20,000 per transaction (see method I am using in first post), I started the script and let it run to the end. This took 5:25:59, and reported no errors. When I got in this morning I found that even though the recovery was set to simple, the log was 12.1G and is 11.7G full. Isn't this space supposed to free up when a checkpoint happens? I did run a manual Checkpoint, and have checked that there are no open transactions on this DB. The DB is now in some state I do not understand. While it shows its status as online, "normal", queries such as Select Count or DBCC CheckDB just hang up and run forever without returning any results.
In an effort to try to see exactly what happened to the DB, I backed it up, which happened successfully, but very slowly, and the backupfile ended up being over 12G, surprising since the original data file states it is 4.4G of space.
I restored the DB to a SQL 2005 DB server and while it has reached 100%, the restore query still has not finished, almost 2 hours later at 1:53:07 now). SQL 2005 states the DB is "in recovery".
Deductions I am making now are: 1. the tran log did not get smaller because the transactions in it were NOT actually committed to the DB, 2. the backup also contains the uncommitted transactions from the log, and 3. The restored DB is working on either committing or rolling back the transactions in the log right now.
My questions for the world, if the above assumptions are correct: Why?
1. Why did the transactions NOT get committed to the DB?
2. Why wouldn't SQL Server reconcile the transactions and truncate the log, since it is set to Simple recovery?
3. Why won't even DBCC CheckDB work on that database now?
If my deductions above are NOT correct, what is going on?
Thanks,
Chris
Learning something new on every visit to SSC. Hoping to pass it on to someone else.
April 15, 2009 at 1:44 pm
After 2 hours, 53 minutes, the restore finally finished and now the transaction log is almost empty. Time for a shrinkfile.
The shrinkfile worked, on both data and log files. So now I'm looking at the post-mortem of this process. It appears that while the archive tables are out there, neither of them have any records in them, which suggests that SQL did a complete rollback of all of my transactions, and, the worst part, the data is corrupt. Now the larger of the 2 tables I was attempting to archive has even more records in it. It started out with 2,394,141, but now has 4,664,712 records in it. It's like the rollback actually added records to the DB.
Keep in mind this is currently a test DB and has no application talking to it.
Thanks,
Chris
Learning something new on every visit to SSC. Hoping to pass it on to someone else.
April 15, 2009 at 2:12 pm
Stamey (4/15/2009)
After 2 hours, 53 minutes, the restore finally finished and now the transaction log is almost empty. Time for a shrinkfile.The shrinkfile worked, on both data and log files. So now I'm looking at the post-mortem of this process. It appears that while the archive tables are out there, neither of them have any records in them, which suggests that SQL did a complete rollback of all of my transactions, and, the worst part, the data is corrupt. Now the larger of the 2 tables I was attempting to archive has even more records in it. It started out with 2,394,141, but now has 4,664,712 records in it. It's like the rollback actually added records to the DB.
Keep in mind this is currently a test DB and has no application talking to it.
Thanks,
Chris
If you have more rows in the source table - that tells me there was a process that inserted data into the source.
It does sound like you ran into a rollback scenario - that would explain the no rows in the destination after the restore, but that would mean the process was not completed when you took the backup or it had failed and was rolling back at that time.
As to your other questions - I really wouldn't know where to start without having access to the system to see for myself. Sorry.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
April 16, 2009 at 5:35 am
This is what's so mysifying about the situation. It rolled back, even though the script completed with no error messages, and the script had specific "Commit Tran" statements in it, for every 20,000 records, to keep the log from getting too big.
The DB is a restore of Prod on a different server with a different name from Prod, so I can't imagine any other app having access to it to add records.
Thanks,
Chris
Learning something new on every visit to SSC. Hoping to pass it on to someone else.
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply