April 8, 2010 at 4:12 am
We've had a problem on a production system yesterday where it looks like a problem in a job running at 8:18am issued a BEGIN TRANSACTION that may not have had a corresponding COMMIT issued.
The job appears to have completed successfully, as do the subsequent 4 hours of batch running (all using SPID 68). We can see the batch finish at 12:31 and relinquish SPID 68. This SPID was then used by 3 other users in the next 5 minutes, the last of whom issued a ROLLBACK command, and this seems to have then rolled back all changes since 8:18 on this SPID.
We had a basic trace running which is how I can see this rollback, but I don't understand how this will have caused the issue, as the SPID had been handed off to a seperate user at this stage. Is it feasible for this rollback scenario to happen, or did SQL simply "tidy up" the open transaction when the batch relinquished SPID 68? If it did "tidy up" why is there no evidence of this in Server Logs?
I've been through the Server Logs and Windows event logs for the hours in question and can't find any trace of a rollback other than the one issued at 12:34 by a different user.
I'm now at the stage where to try and prove what happened one way or another I'm looking at restoring a backup from early yesterday morning and rolling the transaction logs in to try and pin down what happened and when. If it was an issue with the application I need to prove this so I can hand it back to the developers.
Are either of the scenarios I've outlined above feasible? If not, does anyone have any other ideas?
We're using SQL Server 2000 Enterprise Build 80.0.2055 running on a Windows 2003 64 bit Server, and the batch is controlled from a VB.NET job manager process running on a seperate web server.
Any help would be appreciated.
Andy
April 26, 2010 at 8:44 am
After 2 weeks of mining TLog backups and getting them back in from tape, I've finally managed to prove that the application did indeed leave an open transaction (as we all know it had), even down to the very second that it opened if.
Now I've just got to prove to all the developers that it doesn't matter how many time (or how loudly) they tell me that their code can't behave like that, that it did behave like that. Apparently just showing them the evidence in all the TLog files isn't enough...
Ho-hum.
Viewing 2 posts - 1 through 1 (of 1 total)
You must be logged in to reply to this topic. Login to reply