March 5, 2012 at 9:35 pm
Comments posted to this topic are about the item Large table cleanup with minimal locks
March 5, 2012 at 11:20 pm
Is an IN clause in the WHERE predicate faster than using a WHERE EXISTS clause? I've heard that large IN clauses tend to be slow performers. I work mostly with DB2 on iSeries, though, and am just starting to use SQL Server for some projects.
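For reference, the two forms being compared would look something like this (just a sketch; the dbo.SalesZIP, #InnerTemp, and SalesID names are borrowed from the article's example and may differ in your schema):

-- IN with a subquery
DELETE FROM dbo.SalesZIP
WHERE SalesID IN (SELECT SalesID FROM #InnerTemp);

-- EXISTS with a correlated subquery
DELETE FROM dbo.SalesZIP
WHERE EXISTS (SELECT 1
              FROM #InnerTemp AS t
              WHERE t.SalesID = dbo.SalesZIP.SalesID);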
March 6, 2012 at 1:49 am
Would a table variable not be better than a #temp table?
March 6, 2012 at 2:12 am
Hi
I think there is a much better solution (very simple to write, and it should work faster): use a CTE. You can select the TOP X rows in the CTE and then delete them, running this in a loop.
If you want, I can give an example.
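For instance, something along these lines (only a sketch; the dbo.SalesZIP table, the SaleDate filter column, and the batch size of 10,000 are assumptions, not the article's exact code):

DECLARE @BatchSize int = 10000;

WHILE 1 = 1
BEGIN
    ;WITH ToDelete AS
    (
        SELECT TOP (@BatchSize) SalesID
        FROM dbo.SalesZIP
        WHERE SaleDate < '20110101'    -- assumed cleanup predicate
        ORDER BY SalesID
    )
    DELETE FROM ToDelete;

    IF @@ROWCOUNT < @BatchSize BREAK;  -- last (partial) batch has been deleted
END;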
Sincerely
Kalman
March 6, 2012 at 2:26 am
@Kalman --- that won't get round the contention. If you use the method shown in the article, you can throttle the deletes to avoid locking for too long at a time, while only performing one select per large batch to get the IDs you want.
I do something similar on a 24/7 db, 365 days a year, with all files on a single RAID 5 partition!?! I use the same approach but with a table variable, and I process over 1.5 million deletes per day on a rolling basis.
The CTE will keep repeating the select for the delete and is therefore likely to cause locking for the duration of the select/delete.
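Roughly, the pattern looks like this (a sketch only; the object names, the SaleDate predicate, the batch size, and the delay are illustrative assumptions, not my production code):

DECLARE @ToDelete TABLE (SalesID int NOT NULL PRIMARY KEY);

-- One select per large batch to capture the IDs to remove
INSERT INTO @ToDelete (SalesID)
SELECT SalesID
FROM dbo.SalesZIP
WHERE SaleDate < '20110101';           -- assumed cleanup predicate

DECLARE @BatchSize int = 5000;

WHILE 1 = 1
BEGIN
    -- Delete a small, throttled slice so locks are held only briefly
    DELETE TOP (@BatchSize) s
    FROM dbo.SalesZIP AS s
    INNER JOIN @ToDelete AS d ON d.SalesID = s.SalesID;

    IF @@ROWCOUNT = 0 BREAK;

    WAITFOR DELAY '00:00:01';          -- breathing room between batches
END;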
March 6, 2012 at 2:32 am
A few thoughts: why are you concerned about locking on the temp tables? They're only accessible from your own session.
Furthermore, OPTION (MAXDOP 8) does allow for parallelism, yes, but it is generally not recommended to span NUMA nodes, which MAXDOP 8 may do on dual quad-core servers. I think MAXDOP 8 can potentially cause more performance issues than MAXDOP 4 in some cases, and this SHOULD be configured correctly as a server option by the DBA.
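For reference, both places that setting can live (the value 4 is only an example; table names are borrowed from the article):

-- Per-query hint on the delete itself:
DELETE FROM dbo.SalesZIP
WHERE SalesID IN (SELECT SalesID FROM #InnerTemp)
OPTION (MAXDOP 4);

-- Instance-wide default, configured once by the DBA:
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 4;
RECONFIGURE;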
March 6, 2012 at 3:55 am
Well, several issues with this, but the foremost issue is with using
where ... in (select...)
Please don't. You already have your IDs in a table and a Primary Key (hopefully clustered) on SalesID in your application table. So just join the tables!
DELETE orig
FROM YourDB.dbo.SalesZIP AS orig
INNER JOIN #InnerTemp AS dead
    ON orig.SalesID = dead.SalesID;
March 6, 2012 at 4:24 am
hi,
If you want to use a join in this case, be careful and define a primary key (or unique) constraint on your temp table, otherwise it is a heap. Joins on heaps are potentially inefficient operations.
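For example (a sketch using the article's #InnerTemp/SalesID names):

-- Create the temp table with a clustered primary key up front
CREATE TABLE #InnerTemp
(
    SalesID int NOT NULL PRIMARY KEY CLUSTERED   -- gives the temp table a clustered index
);

-- If the temp table already exists as a heap, the same thing can be added later with:
-- ALTER TABLE #InnerTemp ADD PRIMARY KEY CLUSTERED (SalesID);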
ciao,
jack
March 6, 2012 at 4:42 am
Still nailing down the best fit in T-SQL, but if you don't need any additional info from the second table, why join, especially if it's not one-to-one? Surely just checking for a match will be quicker than actioning a join? A hash join versus a merge join in the query plan, I would have thought?
March 6, 2012 at 5:43 am
jay.dunk (3/6/2012)
I do something similar on a 24/7 db, 365 days a year with all files on a single raid 5 partition!?! I use the same approach but with a table variable and I process over 1.5 million deletes per day on a rolling basis.
Be careful with using a table variable with a large number of rows: SQL Server optimizes a table variable for one row, i.e. it assumes 1 row will be returned, which can therefore produce a sub-optimal execution plan.
I agree with the comments regarding using the inner join as opposed to WHERE...IN (SELECT...)
_________________________________________________________________________________
SQLGeordie
Web: Jarrin Consultancy
Blog: www.chrisjarrintaylor.co.uk
Twitter: @SQLGeordie
March 6, 2012 at 6:19 am
How would this apply to a heap?
Cheers,
Paul
March 6, 2012 at 7:48 am
ChrisTaylor (3/6/2012)
jay.dunk (3/6/2012)
I do something similar on a 24/7 db, 365 days a year with all files on a single raid 5 partition!?! I use the same approach but with a table variable and I process over 1.5 million deletes per day on a rolling basis.
Be careful with using a table variable with a large amount of rows because a table variable is optimized for one row, by SQL Server i.e. it assumes 1 row will be returned therefore possibly producing a sub-optimal execution plan.
I agree with the comments regarding using the inner join as opposed to WHERE...IN (SELECT...)
Your statement is incorrect - if you have a reference that says otherwise please provide it.
In this particular case a table variable will likely be best, because you WANT an index seek delete on the actual table(s). You also want to minimize the recompiles that will come from adding/truncating a temp table.
Another suboptimal thing in this article is the non-clustered indexes. They serve no purpose in the queries and impose the significant overhead of building them every loop. Just have a heap table and go about the deletes.
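As a sketch of that simpler shape (names are assumed from the article's example and the SaleDate predicate is illustrative):

-- Plain heap temp table, no extra non-clustered indexes to build on each loop
CREATE TABLE #InnerTemp (SalesID int NOT NULL);

INSERT INTO #InnerTemp (SalesID)
SELECT SalesID
FROM dbo.SalesZIP
WHERE SaleDate < '20110101';   -- assumed cleanup predicate

-- The delete can still seek on the base table's clustered PK (SalesID)
DELETE s
FROM dbo.SalesZIP AS s
INNER JOIN #InnerTemp AS d ON d.SalesID = s.SalesID;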
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
March 6, 2012 at 8:05 am
TheSQLGuru (3/6/2012)
Another suboptimal thing in this article is the non-clustered indexes. They serve no purpose in the queries and impose the significant overhead of building them every loop. Just have a heap table and go about the deletes.
Or, if the indexes are helpful, cluster the temp table so you're not doubling up.
March 6, 2012 at 8:26 am
Your statement is incorrect - if you have a reference that says otherwise please provide it.
In this particular case a table variable will likely be best, because you WANT an index seek delete on the actual table(s). You also want to minimize the recompiles that will come from adding/truncating a temp table.
Not quite sure about references, but if you do a quick test of inserting 1,000 records into a table variable and then select from it, the estimated number of rows is always 1...
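For example, a quick version of that test (a sketch; run it with the actual execution plan turned on and check the estimate on the table scan):

DECLARE @t TABLE (id int NOT NULL);

-- Load 1,000 rows into the table variable
INSERT INTO @t (id)
SELECT TOP (1000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects;

-- The plan for this select shows an estimated row count of 1
SELECT id FROM @t;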
_________________________________________________________________________________
SQLGeordie
Web: Jarrin Consultancy
Blog: www.chrisjarrintaylor.co.uk
Twitter: @SQLGeordie
March 6, 2012 at 8:40 am
can't index a table variable