September 10, 2015 at 3:22 pm
Hi,
*** There is a table like this:
CREATE TABLE dbo.Table1
(
ID int Primary Key Clustered,
...
LastDT datetime Not Null
)
CREATE NONCLUSTERED INDEX IX_Table1_LastDT ON dbo.Table1(LastDT)
*** This table has about 1,000,000 records
*** No FK relations whatsoever, both ways, no triggers either
*** I am running a delete statement like below:
DECLARE @30DaysAgo DATETIME = GETDATE() - 30
select NULL -- sets @@ROWCOUNT to 1 so the loop body runs at least once
while @@rowcount > 0 begin
--get start time here
DELETE TOP(10000)
FROM dbo.Table1
WHERE LastDT < @30DaysAgo
--get end time and write to logging table here
end
This runs and completes fine (deleting about 150,000 records),
but logging shows that some iterations take, say, 1 or 2 sec, some take 30 to 40 sec,
and some take up to 200 sec to delete the same batch of 10,000 records.
Any advice/suggestions/wisdom?
Thanks!
September 10, 2015 at 3:37 pm
Most likely you are running this while users are active on the system, so you are probably encountering blocking during some of the passes through the loop. I would actually expect that to occur, as that is one reason to do deletes in chunks: to allow users access to the table while old data is archived/deleted.
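If you want to confirm that it is blocking, a quick check (a sketch; run it from a separate session while the delete loop is executing) is to query the sys.dm_exec_requests DMV for requests that are currently waiting on another session:

```sql
-- Run in a separate session while the delete loop is executing.
-- Shows requests that are currently blocked and which session is blocking them.
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time,   -- ms spent on the current wait
       r.command
FROM sys.dm_exec_requests AS r
WHERE r.blocking_session_id <> 0
```

If the delete session shows up here with LCK_M_* wait types during the slow iterations, blocking is your answer.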
September 10, 2015 at 3:56 pm
Hi Lynn,
That's probably the case... so, basically, there is no way to avoid it, except for common-sense things like keeping transactions short, using read uncommitted, and updating statistics, right? No magic tricks?
Thanks!
September 11, 2015 at 8:25 am
btio_3000 (9/10/2015)
Hi Lynn, That's probably the case... so, basically there is no way to avoid it, except for such common sense things like keeping transactions short, read uncommitted, update statistics, right? No magic tricks?
Thanks!
I would stay away from read uncommitted. It can cause a lot of other issues.
September 11, 2015 at 8:41 am
btio_3000 (9/10/2015)
No magic tricks?
Dunno if it is a magic trick, but I find that "WHERE MyDateColumn < @CutoffDate" doesn't perform well in a delete loop. The time to find the next batch of records to be deleted is longer than I am comfortable with (maybe it has something to do with the index pages changing as the rows are deleted?).
So we do
CREATE TABLE #TempTable
(
ID int IDENTITY(1,1) PRIMARY KEY,
ClusteredKey1 ..., ClusteredKey2 ... -- same types as in MyTable
)
INSERT INTO #TempTable (ClusteredKey1, ClusteredKey2, ...)
SELECT ClusteredKey1, ClusteredKey2, ...
FROM MyTable
WHERE MyDateColumn < @CutoffDate
ORDER BY ClusteredKey1, ClusteredKey2, ...
#TempTable is created with an IDENTITY to make it easier to manage the block size within the delete loop
-- Declarations assumed; the starting values are just illustrative
DECLARE @intRowCount int, @intID int, @StartTime datetime
DECLARE @BatchSize int = 10000, @BatchSizeIncreaseAmount int = 1000, @MaxLoopTime int = 10 -- seconds
SELECT @intRowCount = 1, @intID = 1 -- Force first loop iteration
, @StartTime = GetDate()
WHILE @intRowCount > 0
BEGIN
DELETE D
FROM #TempTable AS T
JOIN MyTable AS D
ON D.ClusteredKey1 = T.ClusteredKey1
AND D.ClusteredKey2 = T.ClusteredKey2
...
WHERE T.ID BETWEEN @intID AND @intID + @BatchSize
SELECT @intRowCount = @@ROWCOUNT, @intID = @intID + @BatchSize
IF DATEDIFF(second, @StartTime, GetDate()) > @MaxLoopTime
SELECT @BatchSize = @BatchSize / 2
ELSE
SELECT @BatchSize = @BatchSize + @BatchSizeIncreaseAmount
WAITFOR ... for a short delay to allow other processes access ...
END
Even so we find that delete-loop elapsed times are lumpy - even when we think no one else is using the server ...
(The @BatchSize handling in this example is off-by-one, our actual code is a bit more hairy than this!)
September 14, 2015 at 11:22 am
Thank you Kristen,
Not sure if deleting by FROM-TO is faster than by using TOP in my case...
Thanks All once again
September 14, 2015 at 1:08 pm
You would have to test it 🙂 but TOP has to re-perform the query to get the next batch, whereas the temp-table FROM-TO range should use the clustered index to locate the rows to be deleted. Assuming the temp table is sorted on the clustered key [of the table being deleted from, not the temp table's ID range!], all the keys being deleted should be "close" together within the physical file, which I hope makes the delete process slicker. You've still got to dash around all over the disk deleting any associated non-clustered index entries, though ...
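A simple way to run that test (a sketch; the table and column names come from the earlier posts, and the #Keys temp table, @From/@To range variables, and join column are illustrative assumptions) is to execute one batch of each variant with I/O and timing statistics turned on, then compare the logical reads and elapsed times reported in the Messages tab:

```sql
SET STATISTICS IO ON
SET STATISTICS TIME ON

-- Variant 1: TOP-based delete (re-evaluates the date predicate for every batch)
DELETE TOP (10000)
FROM dbo.Table1
WHERE LastDT < @30DaysAgo

-- Variant 2: range delete driven by keys pre-collected into a temp table
-- (#Keys: RowNo identity column plus the captured clustered key values)
DELETE D
FROM #Keys AS K
JOIN dbo.Table1 AS D
  ON D.ID = K.TableID   -- captured clustered key of dbo.Table1
WHERE K.RowNo BETWEEN @From AND @To

SET STATISTICS IO OFF
SET STATISTICS TIME OFF
```

Run each variant against a comparable data set (or restore the same backup twice) so the numbers are actually comparable.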