September 10, 2015 at 3:22 pm
Hi,
*** There is a table like this:
CREATE TABLE dbo.Table1
(
ID int Primary Key Clustered,
...
LastDT datetime Not Null
)
CREATE NONCLUSTERED INDEX IX_Table1_LastDT ON dbo.Table1(LastDT)
*** This table has about 1,000,000 records
*** No FK relations whatsoever, both ways, no triggers either
*** I am running a delete statement like below:
DECLARE @30DaysAgo DATETIME = GETDATE() - 30
select NULL -- sets @@ROWCOUNT to 1 so the loop body runs at least once
while @@rowcount > 0 begin
--get start time here
DELETE TOP(10000)
FROM dbo.Table1
WHERE LastDT < @30DaysAgo
--get end time and write to logging table here
end
This runs and completes fine (deleting about 150,000 records),
but logging shows that some iterations take, say, 1 or 2 sec, some take 30 to 40 sec,
and some take up to 200 sec to delete the same batch of 10,000 records.
Any advice/suggestions/wisdom?
Thanks!
September 10, 2015 at 3:37 pm
Most likely you are running this while users are active on the system, so you are probably encountering blocking during some of the passes through the loop. I would actually expect that to occur, as that is one reason to do deletes in chunks: to allow users access to the table while old data is archived/deleted.
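If you want to confirm that it is blocking, a quick check (a sketch; run it from a separate session while the delete loop is executing) is to query the sys.dm_exec_requests DMV for requests that are currently waiting on another session:

```sql
-- Run in a separate session while the delete loop is executing.
-- Shows requests that are currently blocked and which session is blocking them.
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time,   -- ms spent on the current wait
       r.command
FROM sys.dm_exec_requests AS r
WHERE r.blocking_session_id <> 0
```

If the delete session shows up here with LCK_M_* wait types during the slow iterations, blocking is your answer.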
September 10, 2015 at 3:56 pm
Hi Lynn,
That's probably the case... so, basically, there is no way to avoid it, except for common-sense things like keeping transactions short, using read uncommitted, and updating statistics, right? No magic tricks?
Thanks!
September 11, 2015 at 8:25 am
btio_3000 (9/10/2015)
Hi Lynn, That's probably the case... so, basically there is no way to avoid it, except for such common sense things like keeping transactions short, read uncommitted, update statistics, right? No magic tricks?
Thanks!
I would stay away from read uncommitted. It can cause a lot of other issues.
September 11, 2015 at 8:41 am
btio_3000 (9/10/2015)
No magic tricks?
Dunno if it is a magic trick, but I find that "WHERE MyDateColumn < @CutoffDate" doesn't perform well in a delete loop. The time to find the next batch of records to be deleted is longer than I am comfortable with (maybe it has something to do with the index pages changing as the rows are deleted?).
So we do
CREATE TABLE #TempTable
(
ID int IDENTITY(1,1) PRIMARY KEY,
ClusteredKey1 ..., ClusteredKey2 ... -- same types as in MyTable
)
INSERT INTO #TempTable (ClusteredKey1, ClusteredKey2, ...)
SELECT ClusteredKey1, ClusteredKey2, ...
FROM MyTable
WHERE MyDateColumn < @CutoffDate
ORDER BY ClusteredKey1, ClusteredKey2, ...
#TempTable is created with an IDENTITY to make it easier to manage the block size within the delete loop
-- Declarations assumed; the starting values are just illustrative
DECLARE @intRowCount int, @intID int, @StartTime datetime
DECLARE @BatchSize int = 10000, @BatchSizeIncreaseAmount int = 1000, @MaxLoopTime int = 10 -- seconds
SELECT @intRowCount = 1, @intID = 1 -- Force first loop iteration
, @StartTime = GetDate()
WHILE @intRowCount > 0
BEGIN
DELETE D
FROM #TempTable AS T
JOIN MyTable AS D
ON D.ClusteredKey1 = T.ClusteredKey1
AND D.ClusteredKey2 = T.ClusteredKey2
...
WHERE T.ID BETWEEN @intID AND @intID + @BatchSize
SELECT @intRowCount = @@ROWCOUNT, @intID = @intID + @BatchSize
IF DATEDIFF(second, @StartTime, GetDate()) > @MaxLoopTime
SELECT @BatchSize = @BatchSize / 2
ELSE
SELECT @BatchSize = @BatchSize + @BatchSizeIncreaseAmount
WAITFOR ... for a short delay to allow other processes access ...
END
Even so we find that delete-loop elapsed times are lumpy - even when we think no one else is using the server ...
(The @BatchSize handling in this example is off-by-one, our actual code is a bit more hairy than this!)
September 14, 2015 at 11:22 am
Thank you Kristen,
Not sure if deleting by FROM-TO is faster than by using TOP in my case...
Thanks All once again
September 14, 2015 at 1:08 pm
You would have to test it 🙂 but TOP has to re-perform the query to get the next batch, whereas the temp-table FROM-TO range should use the clustered index to locate the rows to be deleted. Assuming the temp table is sorted on the clustered key [of the table being deleted from, not the temp table's ID range!], all the keys being deleted should be "close" together within the physical file, which I hope makes the delete process slicker. You've still got to dash around all over the disk deleting any associated non-clustered index entries, though ...
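A simple way to run that test (a sketch; the table and column names come from the earlier posts, and the #Keys temp table, @From/@To range variables, and join column are illustrative assumptions) is to execute one batch of each variant with I/O and timing statistics turned on, then compare the logical reads and elapsed times reported in the Messages tab:

```sql
SET STATISTICS IO ON
SET STATISTICS TIME ON

-- Variant 1: TOP-based delete (re-evaluates the date predicate for every batch)
DELETE TOP (10000)
FROM dbo.Table1
WHERE LastDT < @30DaysAgo

-- Variant 2: range delete driven by keys pre-collected into a temp table
-- (#Keys: RowNo identity column plus the captured clustered key values)
DELETE D
FROM #Keys AS K
JOIN dbo.Table1 AS D
  ON D.ID = K.TableID   -- captured clustered key of dbo.Table1
WHERE K.RowNo BETWEEN @From AND @To

SET STATISTICS IO OFF
SET STATISTICS TIME OFF
```

Run each variant against a comparable data set (or restore the same backup twice) so the numbers are actually comparable.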