March 5, 2012 at 9:35 pm
Comments posted to this topic are about the item Large table cleanup with minimal locks
March 5, 2012 at 11:20 pm
Is an IN clause in the WHERE predicate faster than using a WHERE EXISTS clause? I've heard that large IN clauses tend to be slow performers. I work mostly with DB2 on iSeries, though, and am just starting to use SQL Server for some projects.
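For reference, the two forms being compared would look something like this (just a sketch; the dbo.SalesZIP, #InnerTemp, and SalesID names are borrowed from the article's example and may differ in your schema):

-- IN with a subquery
DELETE FROM dbo.SalesZIP
WHERE SalesID IN (SELECT SalesID FROM #InnerTemp);

-- EXISTS with a correlated subquery
DELETE FROM dbo.SalesZIP
WHERE EXISTS (SELECT 1
              FROM #InnerTemp AS t
              WHERE t.SalesID = dbo.SalesZIP.SalesID);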
March 6, 2012 at 1:49 am
Would a table variable not be better than a #temp table?
March 6, 2012 at 2:12 am
Hi
I think there is a much better solution (very simple to write, and it should work faster): use a CTE. You can select the TOP X rows in the CTE and then delete them, running this in a loop.
If you want, I can give an example.
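For instance, something along these lines (only a sketch; the dbo.SalesZIP table, the SaleDate filter column, and the batch size of 10,000 are assumptions, not the article's exact code):

DECLARE @BatchSize int = 10000;

WHILE 1 = 1
BEGIN
    ;WITH ToDelete AS
    (
        SELECT TOP (@BatchSize) SalesID
        FROM dbo.SalesZIP
        WHERE SaleDate < '20110101'    -- assumed cleanup predicate
        ORDER BY SalesID
    )
    DELETE FROM ToDelete;

    IF @@ROWCOUNT < @BatchSize BREAK;  -- last (partial) batch has been deleted
END;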
Sincerely
Kalman
March 6, 2012 at 2:26 am
@Kalman --- that won't get round the contention. If you use the method shown in the article, you can throttle the deletes to avoid locking for too long at a time, while only performing one select per large batch to get the IDs you want.
I do something similar on a 24/7 db, 365 days a year, with all files on a single RAID 5 partition!?! I use the same approach but with a table variable, and I process over 1.5 million deletes per day on a rolling basis.
The CTE will keep repeating the select for the delete and is therefore likely to cause locking for the duration of the select/delete.
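Roughly, the pattern looks like this (a sketch only; the object names, the SaleDate predicate, the batch size, and the delay are illustrative assumptions, not my production code):

DECLARE @ToDelete TABLE (SalesID int NOT NULL PRIMARY KEY);

-- One select per large batch to capture the IDs to remove
INSERT INTO @ToDelete (SalesID)
SELECT SalesID
FROM dbo.SalesZIP
WHERE SaleDate < '20110101';           -- assumed cleanup predicate

DECLARE @BatchSize int = 5000;

WHILE 1 = 1
BEGIN
    -- Delete a small, throttled slice so locks are held only briefly
    DELETE TOP (@BatchSize) s
    FROM dbo.SalesZIP AS s
    INNER JOIN @ToDelete AS d ON d.SalesID = s.SalesID;

    IF @@ROWCOUNT = 0 BREAK;

    WAITFOR DELAY '00:00:01';          -- breathing room between batches
END;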
March 6, 2012 at 2:32 am
A few thoughts: why are you concerned about locking on the temp tables? They're only accessible from your own session.
Furthermore, OPTION (MAXDOP 8) does allow for parallelism, yes, but it is generally not recommended to span NUMA nodes, which MAXDOP 8 may do on dual quad-core servers. I think MAXDOP 8 can potentially cause more performance issues than MAXDOP 4 in some cases, and this SHOULD be configured correctly as a server option by the DBA.
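For reference, both places that setting can live (the value 4 is only an example; table names are borrowed from the article):

-- Per-query hint on the delete itself:
DELETE FROM dbo.SalesZIP
WHERE SalesID IN (SELECT SalesID FROM #InnerTemp)
OPTION (MAXDOP 4);

-- Instance-wide default, configured once by the DBA:
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 4;
RECONFIGURE;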
March 6, 2012 at 3:55 am
Well, several issues with this, but the foremost issue is with using
where ... in (select...)
Please don't. You already have your IDs in a table and a Primary Key (hopefully clustered) on SalesID in your application table. So just join the tables!
DELETE orig
FROM YourDB.dbo.SalesZIP AS orig
INNER JOIN #InnerTemp AS dead
    ON orig.SalesID = dead.SalesID;
March 6, 2012 at 4:24 am
hi,
If you want to use a join in this case, be careful and define a primary key (or unique) constraint on your temp table, otherwise it is a heap. Joins on heaps are potentially inefficient operations.
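For example (a sketch using the article's #InnerTemp/SalesID names):

-- Create the temp table with a clustered primary key up front
CREATE TABLE #InnerTemp
(
    SalesID int NOT NULL PRIMARY KEY CLUSTERED   -- gives the temp table a clustered index
);

-- If the temp table already exists as a heap, the same thing can be added later with:
-- ALTER TABLE #InnerTemp ADD PRIMARY KEY CLUSTERED (SalesID);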
ciao,
jack
March 6, 2012 at 4:42 am
Still nailing down the best fit in T-SQL, but if you don't need any additional info from the second table, why join, especially if it's not one-to-one? Surely just checking for a match will be quicker than actioning a join? A hash join versus a merge join in the query plan, I would have thought?
March 6, 2012 at 5:43 am
jay.dunk (3/6/2012)
I do something similar on a 24/7 db, 365 days a year with all files on a single raid 5 partition!?! I use the same approach but with a table variable and I process over 1.5 million deletes per day on a rolling basis.
Be careful with using a table variable with a large number of rows: SQL Server optimizes a table variable for one row, i.e. it assumes 1 row will be returned, which can therefore produce a sub-optimal execution plan.
I agree with the comments regarding using the inner join as opposed to WHERE...IN (SELECT...)
_________________________________________________________________________________
SQLGeordie
Web: Jarrin Consultancy
Blog: www.chrisjarrintaylor.co.uk
Twitter: @SQLGeordie
March 6, 2012 at 6:19 am
How would this apply to a heap?
Cheers,
Paul
March 6, 2012 at 7:48 am
ChrisTaylor (3/6/2012)
jay.dunk (3/6/2012)
I do something similar on a 24/7 db, 365 days a year with all files on a single raid 5 partition!?! I use the same approach but with a table variable and I process over 1.5 million deletes per day on a rolling basis.
Be careful with using a table variable with a large amount of rows because a table variable is optimized for one row, by SQL Server i.e. it assumes 1 row will be returned therefore possibly producing a sub-optimal execution plan.
I agree with the comments regarding using the inner join as opposed to WHERE...IN (SELECT...)
Your statement is incorrect - if you have a reference that says otherwise please provide it.
In this particular case a table variable will likely be best, because you WANT an index seek delete on the actual table(s). You also want to minimize the recompiles that will come from adding/truncating a temp table.
Another suboptimal thing in this article is the non-clustered indexes. They serve no purpose in the queries and impose the significant overhead of building them every loop. Just have a heap table and go about the deletes.
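As a sketch of that simpler shape (names are assumed from the article's example and the SaleDate predicate is illustrative):

-- Plain heap temp table, no extra non-clustered indexes to build on each loop
CREATE TABLE #InnerTemp (SalesID int NOT NULL);

INSERT INTO #InnerTemp (SalesID)
SELECT SalesID
FROM dbo.SalesZIP
WHERE SaleDate < '20110101';   -- assumed cleanup predicate

-- The delete can still seek on the base table's clustered PK (SalesID)
DELETE s
FROM dbo.SalesZIP AS s
INNER JOIN #InnerTemp AS d ON d.SalesID = s.SalesID;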
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
March 6, 2012 at 8:05 am
TheSQLGuru (3/6/2012)
Another suboptimal thing in this article is the non-clustered indexes. They serve no purpose in the queries and impose the significant overhead of building them every loop. Just have a heap table and go about the deletes.
Or, if the indexes are helpful, cluster the temp table so you're not doubling up.
March 6, 2012 at 8:26 am
Your statement is incorrect - if you have a reference that says otherwise please provide it.
In this particular case a table variable will likely be best, because you WANT an index seek delete on the actual table(s). You also want to minimize the recompiles that will come from adding/truncating a temp table.
Not quite sure about references, but if you do a quick test of inserting 1,000 records into a table variable and then select from it, the estimated number of rows is always 1...
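For example, a quick version of that test (a sketch; run it with the actual execution plan turned on and check the estimate on the table scan):

DECLARE @t TABLE (id int NOT NULL);

-- Load 1,000 rows into the table variable
INSERT INTO @t (id)
SELECT TOP (1000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects;

-- The plan for this select shows an estimated row count of 1
SELECT id FROM @t;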
_________________________________________________________________________________
SQLGeordie
Web: Jarrin Consultancy
Blog: www.chrisjarrintaylor.co.uk
Twitter: @SQLGeordie
March 6, 2012 at 8:40 am
can't index a table variable