December 2, 2013 at 1:38 pm
I have a DB that grows like crazy. This was a design decision I had nothing to do with, but now I have to manage the space this thing takes up. I need to archive the records from that DB, and have business rules to do so, anything over 90 days old can be archived.
I created a script that goes like this:
Simple description, without all bells and whistles:
While
Try
Select records from Audit into Temp table
Insert records from Temp table into Archive table
Delete records from Audit table
Drop temp table
End Try
Catch
@Error = ltrim(str(error_number()))
If error = 0
Log to a table how many records we moved.
Commit Tran
Continue
Else
Rollback Tran
Break from loop.
End Catch
Outside of loop, Send email if there is a problem, @error <> 0.
In development, running interactively it runs beautifully, and logs what it's doing just fine. When I run it in a job it works differently. In a job it does not log what it's doing to the table as it moves records, and it bunches all of the batched into on transaction, or multiple nested transactions. I suspect the nested transactions because I see the rollback slowly putting the records back into the prod table at the same batch size as they were taken out. This script should only roll back the batch it is in the middle of when the script errored or was killed.
Anyway, I'm trying to figure out what I am doing wrong here while. My goal was performance and data integrity without imposing on the app that's inserting all of the records I have to archive, such as by using With(NoLock).
Here is the full code:
Set NOCOUNT ON
declare @enddate datetime, @NumRecordsToGet int, @FirstDayOfQuarter datetime
Declare @NumRecords Int, @NumAffected Int
Declare @Err Int, @ErrMsg VarChar(255), @Body1 VarChar(500), @Email VarChar(2000)
select @enddate = convert(datetime,convert(varchar,dateadd(month,-3,getdate()),101)), @FirstDayOfQuarter = DATEADD(qq, DATEDIFF(qq,0,GetDate()), 0)
If datediff(day,@FirstDayOfQuarter,GetDate()) > 5 --don't archive for the first 5 days of each quarter; in case something goes wrong with the backup
Begin
While (select logtimestamp from ACSLog.dbo.auditlog where auditlogid = (select min(auditlogid) from ACSLog.dbo.auditlog)) < @EndDate
Begin
Begin Try
If Object_ID('TempDB..#archive') > 0
Drop Table #archive
Select Top 1 @NumRecordsToGet = NumAuditRecords, @Email = ErrorEmail from ACSLog_Archive.dbo.PurgeLimits
Begin Tran
Select Top (@NumRecordsToGet) * into #Archive from ACSLog.dbo.auditlog With (NoLock) Where logtimestamp < @enddate --Populate Temp table.
Insert Into ACSLog_Archive.dbo.AuditLog --Put records into archive table where they do not exist in archive table.
Select A.* from #Archive A
Left Join ACSLog_Archive.dbo.AuditLog B On A.AuditLogID = B.AuditLogID
Where B.AuditLogID Is Null
Delete A
From ACSLog.dbo.AuditLog A With (Rowlock)
Inner Join ACSLog_Archive.dbo.AuditLog B On A.AuditLogID = B.AuditLogID
Drop table #Archive;
End Try
BEGIN CATCH
Set @Err = ltrim(str(error_number()))
Set @ErrMsg = error_message()
--PRINT 'ErrNo: ' + @Err
--PRINT 'ErrMsg: ' + @ErrMsg
If @ERR = 0 --Check for errors and if none, Commit the transaction.
Begin
Insert Into ACSLog_Archive.dbo.PURGE_StatusLog (ArchivedCount, TableName)
Values (@NumAffected, 'AuditLog');
Commit Tran
Continue
End
Else
Begin
RollBack Tran
Break --out of while loop.
End
END CATCH
End
If @Err > 0
Begin
Set @Body1 = Convert(NVarChar(10), @Err) + ': ' + @ErrMsg
EXEC msdb.dbo.sp_send_dbmail
@recipients = @Email,
@body = @Body1,
@subject = 'An error occurred Archiving the AuditLog table.' ;
End
End
Thanks,
Chris
Learning something new on every visit to SSC. Hoping to pass it on to someone else.
December 2, 2013 at 6:54 pm
why don't you try the SQL Server Partitioning instead of trying to manually implement such a thing?
If I am understanding correctly, you only are doing this because the table is growing and you need a better performance?
December 2, 2013 at 9:48 pm
Table partitioning would be better solution for this, but be aware that Portioning is enterprise only feature.
December 3, 2013 at 5:47 am
Where I need the better performance is in my archival process. I have to work around the inserts from Biztalk, to log to three different tables, at a rate of about 200 transactions per second. The tables have 75M to 200M records and I am trying to keep the DB from overflowing the disk space on this server. I created the script you see with the ability to adjust the batch sizes so I could tune it easily, on the fly, as I have to not overload the log as well.
The goal is to have each batch of records moved inside a transaction, so that no records are lost if there is any kind of issue when archiving that batch.
During testing I did not encounter more than on open transaction left over when I stopped the script before it was completed. It seemed to either commit or rollback all others and only be in the middle on one tran when I would stop the script at a random moment.
The other thing this is supposed to allow, for performance purposes, is the batch and transaction completes and commits or rolls back, and releases all locks so the other traffic, the inserts, can continue without hindrance, until the next delete from the archive process comes along. The reason I stopped this script yesterday was because of all of the locks it was holding, and stepping on the other processes doing inserts. After stopping the Agent job I noticed it rolling back for a long period of time, and that the rollback was putting records back into the Prod table at a rate of 1000 records at a time, the same size as the current batch setting. This is what prompted me to create this thread.
Thanks,
Chris
Learning something new on every visit to SSC. Hoping to pass it on to someone else.
December 3, 2013 at 6:17 am
As a minor suggestion, you could try wrapping entire script in a single named transaction and then continue using another named transaction for the "work". You could commit the "work" transaction or rollback as necessary. But the "work" transaction must be named. If not the whole batch gets rolled back.
I believe you are a victim of the "I didn't know it was wrapping the statement in an implied transaction" issue. I have been a victim several times in the past. The above suggestion has helped me out several times.
EDIT: From 2008R2 BOL: If a ROLLBACK WORK or ROLLBACK TRANSACTION statement without a transaction_name parameter is executed at any level of a set of nested transaction, it rolls back all of the nested transactions, including the outermost transaction.
December 3, 2013 at 10:02 am
The quick and very-dirty-in-this-case "solution" is to put a COMMIT TRANSACTION immediately before the END TRY.
But the whole process overall could be a lot cleaner and faster, particularly if:
auditlogid is the clustering key for the table, and continually ascending (such as an identity); and
logtimestamp is from the original insert, and so will also always ascend in conjunction with the auditlogid.
Are both of those true?
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
December 4, 2013 at 7:36 am
ScottPletcher, yes, AuditLogID is the clustered Index and an Identity column. I am always interested in better, faster ways to make things happen, if I can maintain data integrity.
I have figured it out, thanks to Google and an example from guys from 'rola. Turns out I was confused about how the Catch works, in that that code inside the Catch never gets executed unless there is a problem, so my Commit being inside the catch meant it would never execute unless there was an error, and even then, if there was an error and the Catch code was executed, my Error value would be greater than 0 and STILL not execute the Commit.
I changed it around so that the commit is in the Try and the rollback is in the catch, like this
:
While
Try
Select records from Audit into Temp table
Insert records from Temp table into Archive table
Delete records from Audit table
Drop temp table
Log to a table how many records we moved.
Commit Tran
End Try
Catch
@Error = ltrim(str(error_number()))
Rollback Tran
Break from loop.
End Catch
Outside of loop, Send email if there is a problem, @error <> 0.
I must have tested on such a small number of records that this problem slipped by me. :blush:
Thanks for the replies.
Chris
Learning something new on every visit to SSC. Hoping to pass it on to someone else.
December 4, 2013 at 7:57 am
Stamey (12/4/2013)
I have figured it out, thanks to Google and an example from guys from 'rola. ... I changed it around so that the commit is in the Try and the rollback is in the catch
Chris
Sorry, I had very limited time for my post, but I still believe I pointed that out too, as the first line in my previous post :-).
The only reason I said it is "dirty" is that the entire code could be cleaned up to be much more efficient. But at least you have a working solution until that can be done.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
December 4, 2013 at 9:09 am
Yes, your post did say that. In my case though, I found the other yesterday and didn't get back to this post until this morning. Either way, thanks.
If you would like to expand on your explanation regarding better performance, I'm all ears.
Here is the latest revision.
Set NOCOUNT ON
declare @enddate datetime, @NumRecordsToGet int, @FirstDayOfQuarter datetime
Declare @NumRecords Int, @NumAffected Int
Declare @Err Int, @ErrMsg VarChar(255), @Body1 VarChar(500), @Email VarChar(2000)
select @enddate = convert(datetime,convert(varchar,dateadd(month,-3,getdate()),101)), @FirstDayOfQuarter = DATEADD(qq, DATEDIFF(qq,0,GetDate()), 0)
If datediff(day,@FirstDayOfQuarter,GetDate()) > 5 --don't archive for the first 5 days of each quarter; in case something goes wrong with the backup
Begin
While (select logtimestamp from ACSLog.dbo.auditlog where auditlogid = (select min(auditlogid) from ACSLog.dbo.auditlog)) < @EndDate
Begin
Begin Try
If Object_ID('TempDB..#archive') > 0
Drop Table #archive
Select Top 1 @NumRecordsToGet = NumAuditRecords, @Email = ErrorEmail from ACSLog_Archive.dbo.PurgeLimits
Begin Tran
Select Top (@NumRecordsToGet) * into #Archive from ACSLog.dbo.auditlog With (NoLock) Where logtimestamp < @enddate --Populate Temp table.
Insert Into ACSLog_Archive.dbo.AuditLog --Put records into archive table where they do not exist in archive table.
Select A.* from #Archive A
Left Join ACSLog_Archive.dbo.AuditLog B On A.AuditLogID = B.AuditLogID
Where B.AuditLogID Is Null
Select @NumAffected = @@RowCount
Delete A
From ACSLog.dbo.AuditLog A With (Rowlock)
Inner Join ACSLog_Archive.dbo.AuditLog B On A.AuditLogID = B.AuditLogID
Drop table #Archive;
Insert Into ACSLog_Archive.dbo.PURGE_StatusLog (ArchivedCount, TableName)
Values (@NumAffected, 'AuditLog');
Set @NumAffected = 0 --Reset for next loop.
Commit Tran
End Try
BEGIN CATCH
IF @@TRANCOUNT > 0
RollBack Tran
Set @Err = ltrim(str(error_number()))
Set @ErrMsg = error_message()
PRINT 'ErrNo: ' + @Err
PRINT 'ErrMsg: ' + @ErrMsg
Break --out of while loop.
END CATCH
--Print 'TranCount: ' + Convert(VarChar(5), @@TranCount)
End
If @Err > 0
Begin
Set @Body1 = Convert(NVarChar(10), @Err) + ': ' + @ErrMsg
EXEC msdb.dbo.sp_send_dbmail
@recipients = @Email,
@body = @Body1,
@subject = 'An error occurred Archiving the AuditLog table.' ;
End
End
Thanks,
Chris
Learning something new on every visit to SSC. Hoping to pass it on to someone else.
December 4, 2013 at 9:58 am
Need to clarify whether the other part of my initial impressions was true or not:
Is logtimestamp from the original insert, and so will also always ascend in conjunction with the auditlogid? I'm hoping logtimestamp defaults to GETDATE() and so will always ascend with the auditlogid.
If it doesn't, you might need to build a separate index on ( logtimestamp, auditlogid ) for best performance, and the code might need written slightly differently for best performance.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
December 4, 2013 at 2:46 pm
Sorry, extremely busy, but here's the re-write. I couldn't test it, of course, so you'll have to do that ;-):
Set NOCOUNT ON
If Object_ID('tempdb..#auditlogidsToArchive') > 0
Drop Table #auditlogidsToArchive
Create table #auditlogidsToArchive ( auditlogid int primary key )
Declare @enddate datetime, @FirstDayOfQuarter datetime
Declare @NumRecordsToGet int, @NumRowsArchived Int
Declare @Err Int, @ErrMsg VarChar(255), @Body1 VarChar(500), @Email VarChar(2000)
select @enddate = convert(datetime,convert(varchar,dateadd(month,-3,getdate()),101)), @FirstDayOfQuarter = DATEADD(qq, DATEDIFF(qq,0,GetDate()), 0)
If datediff(day,@FirstDayOfQuarter,GetDate()) > 5 --don't archive for the first 5 days of each quarter; in case something goes wrong with the backup
Begin
Select Top (1) @NumRecordsToGet = NumAuditRecords, @Email = ErrorEmail from ACSLog_Archive.dbo.PurgeLimits
-- get list of auditlogids that need archived.
-- NOTE: this query will perform *vastly better* with an index on logtimestamp.
insert into #auditlogidsToArchive ( auditlogid )
select A.auditlogid
from ACSLog.dbo.auditlog A with (nolock)
where
A.logtimestamp < @EndDate and
not exists(select 1 from ACSLog_Archive.dbo.AuditLog arch where arch.auditlogid = A.auditlogid)
order by A.auditlogid
While exists(select top (1) * from #auditlogidsToArchive)
Begin
Begin Try
Begin Tran
Insert Into ACSLog_Archive.dbo.AuditLog
Select A.*
From (
select top (@NumRecordsToGet) *
from #auditlogidsToArchive
order by auditlogid
) AS arch_needed
Inner join ACSLog.dbo.auditlog as A with (nolock) on
a.auditlogid = arch_needed.auditlogid
Set @NumRowsArchived = @@ROWCOUNT
Delete A
From ACSLog.dbo.AuditLog A
Inner join (
select top (@NumRecordsToGet) auditlogid
from #auditlogidsToArchive
order by auditlogid
) AS archived on
A.auditlogid = archived.auditlogid
Delete arch
from #auditlogidsToArchive arch
Inner join (
select top (@NumRecordsToGet) auditlogid
from #auditlogidsToArchive
order by auditlogid
) AS archived on
arch.auditlogid = archived.auditlogid
Insert Into ACSLog_Archive.dbo.PURGE_StatusLog (ArchivedCount, TableName)
Values (@NumRowsArchived, 'AuditLog');
Commit Tran
End Try
BEGIN CATCH
IF @@TRANCOUNT > 0
RollBack Tran
Set @Err = ltrim(str(error_number()))
Set @ErrMsg = error_message()
PRINT 'ErrNo: ' + @Err
PRINT 'ErrMsg: ' + @ErrMsg
Break --out of while loop.
END CATCH
--Print 'TranCount: ' + Convert(VarChar(5), @@TranCount)
End
If @Err > 0
Begin
Set @Body1 = Convert(NVarChar(10), @Err) + ': ' + @ErrMsg
EXEC msdb.dbo.sp_send_dbmail
@recipients = @Email,
@body = @Body1,
@subject = 'An error occurred Archiving the AuditLog table.' ;
End
End
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply