Record Delete is slowing server

Question

Record Delete is slowing server

JokerUK

SSC Veteran

Points: 288
More actions
April 23, 2012 at 8:15 am

#392087

Hi folks,
I have a daily process that deletes all but the last 28 days of data
The table holds on average 387,758,627 records at any one time and (obviously adds records daily) So to maintain performance we delete anything older than the 28th day.
This process runs for about 12 minutes at 2am each morning, but during those 12 minutes we often get time-outs from other processes that are trying to use other tables in the database.
We are running on 32 cores of CPU with 128GB RAM and many TB of disk but I wonder if my SQL delete is not written in the best way:
###########################################################
--update liv_css_dp_counts
-- in order to keep 28 days of data we need to delete the data for one 28 days + 1 day
-- This must run and complete before 12:00 every day
declare @today as int
declare @tomorrow as int
select @today = DATEPART(day,getdate())
select @tomorrow = DATEPART(day,dateadd(day,1,GETDATE()))
declare @loopvar as int
select @loopvar= (@today+1) % 32
while @loopvar % 32 <> (@tomorrow+1) %32
begin
delete from liv_css_dp_counts where liv_css_dp_counts.day=@loopvar %32
select @loopvar=@loopvar+1
end
####################################################################
The table liv_css_dp_counts is partitioned on the day column so in theory is spread over 28 tables
There is only one key on the table which is *not* the day field.
Any ideas anyone ?
Kind Regards
Steve

Viewing 15 posts - 1 through 15 (of 22 total)

You must be logged in to reply to this topic. Login to reply

Sean Lange SSC Guru Points: 286590 More actions · Answer 1

Yes that type of delete is going to perform horribly, especially with that much data. There is no need for a loop here at all. This should be a single delete statement. Without ddl it is hard to say what it should look like. You have a column named day? Do you have a column with a datetime datatype?

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

JokerUK SSC Veteran Points: 288 More actions · Answer 2

Thanks for replying.

The table is:

[id] [bigint] IDENTITY(1,1) NOT NULL,

[PostCode] [varchar](8) NULL,

[DeliveryPoint] [varchar](2) NULL,

[Quantity] [int] NULL,

[css_session_id] [int] NOT NULL,

[day] [int] NULL,

[month] [int] NULL

The only key is the css_session_id which is used to get the data from this table and link it to the parent records.

The date is on the parent record - 1 parent record has around 5-8k child records in this table.

But if I use the parent record to control the delete it takes 10 times the amount of time to complete due to the overhead of managing the relationship (I assume)

JokerUK SSC Veteran Points: 288 More actions · Answer 3

sorry should have added that the loop is to catch those months where there are more than 28 days

Sean Lange SSC Guru Points: 286590 More actions · Answer 4

Why have day and month as columns with a datatype of int? Can you replace those with a datetime column, which could be the same value as the parent table? Then your delete is trivial. I think you are suffering the wrath of not using proper datatypes.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Sean Lange SSC Guru Points: 286590 More actions · Answer 5

Another idea would be to use cascading deletes. You said that is slower? My guess is due to indexes.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

JokerUK SSC Veteran Points: 288 More actions · Answer 6

OK, I can add a date into the row, and I will see how that goes.

I figured that looking at an INT when deleting would be more efficient than working with a function on a date.

But I really am quite new to SQL so bow to your greater expertise.

It will probably take a couple of days to get the change through but I'll post back the results.

Steve

Sean Lange SSC Guru Points: 286590 More actions · Answer 7

JokerUK (4/23/2012)
OK, I can add a date into the row, and I will see how that goes.
I figured that looking at an INT when deleting would be more efficient than working with a function on a date.
But I really am quite new to SQL so bow to your greater expertise.
It will probably take a couple of days to get the change through but I'll post back the results.
Steve

LOL no bowing around here...and certainly not to me.

If you can add the same datetime as the parent then your delete becomes a single delete.

delete from liv_css_dp_counts

where NewDateTimeCol < DATEADD(d, -28, getdate())

No need for loops at all. Give it a shot and see if that doesn't make things a lot easier to code and quite a bit faster than looping.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Johan Bijnens SSC Guru Points: 135376 More actions · Answer 8

just my 2ct.

If your single transaction delete takes to long, a loop delete may be appropriate for your case.

declare @tranrows int

declare @rwcount int

set @tranrows = 50000

while 0=0

begin

begin tran

delete top ( @tranrows )

from liv_css_dp_counts

where NewDateTimeCol < DATEADD(d, -28, getdate())

set @rwcount = @@rowcount

commit tran

if @rwcount < @tranrows BREAK

end

This way, you'll have way more smaller transactions, but less blocking to ongoing operation.

Test it, Test IT

Make sure your filtering column is indexed !

Johan

Learn to play, play to learn !

Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:

- How to post Performance Problems
- How to post data/code to get the best help[/url]

- How to prevent a sore throat after hours of presenting ppt

press F1 for solution, press shift+F1 for urgent solution 😀

Need a bit of Powershell? How about this

Who am I ? Sometimes this is me but most of the time this is me

JokerUK SSC Veteran Points: 288 More actions · Answer 9

OK,

I added a datefield and retro-populated exisiting records with the date from the parent record.

I also modified the data-loading app so that new records are populated with the date.

The good news is that this mornings delete did not cause any timeouts - I will need to monitor for a few days as the timeouts did not happen every day.

Interestingly through the delete process has taken slightly longer - it went from 12 mins yesterday to 14 mins today.

I will look to add the reduced transaction /smaller batch delete over the next day if the problem does not go away.

Cheers all

Steve

Grant Fritchey SSC Guru Points: 398909 More actions · Answer 10

Now that you have a batched process, the right way to go, you need to worry about how the optimizer is handling it. Do you have an index on the column that is filtering for the delete statement? What kind of foreign key constraints are on this table? All those foreign key constraints will be checked as part of the delete, so you might need indexes there as well. If you want advise in that area, post the execution plan.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

JokerUK SSC Veteran Points: 288 More actions · Answer 11

Thanks for that response,

I think that I've attached the execution plan

Kind Regards

Steve

Grant Fritchey SSC Guru Points: 398909 More actions · Answer 12

Based on that plan you're still doing a loop.

But, based on the plan, you're getting a table scan for each delete. The predicate is here: [MIS].[dbo].[LIV_CSS_DP_Counts].[day]=[@todelete]

That suggests that an index on the [day] column might be helpful. Also, unless I'm misreading the plan, this is a heap table? No clustered index? Unless you have a good reason, I'd suggest adding a clustered index to the table. Most tables, most of the time, benefit from a clustered index. It's just an improved storage mechanism within SQL Server.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 13

Grant Fritchey (4/24/2012)
...Also, unless I'm misreading the plan, this is a heap table? No clustered index? Unless you have a good reason, I'd suggest adding a clustered index to the table. Most tables, most of the time, benefit from a clustered index. It's just an improved storage mechanism within SQL Server.

Delete from heap table doesn't reclaim any space. You better to add a clustered index to this table as suggested by Grant.

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

JokerUK SSC Veteran Points: 288 More actions · Answer 14

Thanks folks,

Just to be clear, the delete is not about reclaiming space, it's about removing data to keep a manageable amount online for reporting The table has about 330M rows for 28 days data and the deletes average about 6-10 million rows per day.

The Table is partitioned on the 'day' field which is an integer rather than date.

I can add a clustered index to the table - that is no problem, the net aim is to get rid of the time-outs for other transactions on the server when the delete process is executing.

I really do appreciate your help and time on this.

Steve