October 11, 2004 at 10:46 am
I have a large table from which I need to delete a small set of rows every day. I cannot use TRUNCATE. Since the delete operation is logged, performance is poor, and it looks like I cannot disable logging. Is there a faster way to delete the rows?
Thanks.
October 11, 2004 at 11:48 am
Take a look at the indexes on the table. It is possible that SQL Server is scanning the entire table to find your small set of rows to delete, and that is what is causing the poor performance. You can use Query Analyzer to display the execution plan of your query to see what is really going on.
Aunt Kathi Data Platform MVP
Author of Expert T-SQL Window Functions
Simple-Talk Editor
October 11, 2004 at 1:31 pm
Assuming that the indexes are defined properly, is there a better way to delete the rows?
October 11, 2004 at 2:26 pm
How are you deleting them now? I think you are probably doing something like:
delete from myTable where myColumn = 'somevalue'
If there is an index on myColumn and SQL is using it, then I'm not sure you can get any better than that.
Aunt Kathi Data Platform MVP
Author of Expert T-SQL Window Functions
Simple-Talk Editor
October 11, 2004 at 4:14 pm
The only other thing I can come up with right now is to turn what might be a sequential process into a set-based process. If you are generating a few hundred individual DELETE statements, it may perform better to store the criteria somewhere, such as a temporary table, and then run a single DELETE where the values are IN the temp table. That, of course, is a lot easier if you have a single-column WHERE clause. If you have to match several key columns, you can use a join, although DELETEs with joins can get tricky. In any case, if you can make it a single set-based delete, it should require less index scanning.
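The staging-table idea above can be sketched as follows. This is a minimal illustration using Python's sqlite3 module rather than T-SQL (the thread's dialect), and the table and column names (`myTable`, `myColumn`, `toDelete`) are invented for the example:

```python
import sqlite3

# Hypothetical data: 1000 rows, myColumn cycling through ten values.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE myTable (id INTEGER PRIMARY KEY, myColumn TEXT)")
cur.executemany("INSERT INTO myTable (id, myColumn) VALUES (?, ?)",
                [(i, f"value{i % 10}") for i in range(1000)])

# Stage the delete criteria in a temp table...
cur.execute("CREATE TEMP TABLE toDelete (myColumn TEXT)")
cur.executemany("INSERT INTO toDelete VALUES (?)", [("value3",), ("value7",)])

# ...then run ONE set-based DELETE instead of hundreds of single-row statements.
cur.execute("DELETE FROM myTable WHERE myColumn IN "
            "(SELECT myColumn FROM toDelete)")
conn.commit()

remaining = cur.execute("SELECT COUNT(*) FROM myTable").fetchone()[0]
print(remaining)  # 800: the two staged values matched 200 of the 1000 rows
```

In T-SQL the shape is the same: `DELETE FROM myTable WHERE myColumn IN (SELECT myColumn FROM #toDelete)`, giving the optimizer one statement to plan instead of many.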
Something else to consider is that every delete has to perform index maintenance: the more indexes, the more work.
Another consideration is fragmentation. If the table and/or indexes are highly fragmented (check with DBCC SHOWCONTIG), the delete will run less efficiently.
October 11, 2004 at 5:15 pm
Thanks Kathi and Aaron.
October 12, 2004 at 2:45 am
When it comes to WHERE clauses, I have found that the different comparison operators perform differently:
If you can avoid "not equal" operators, do so.
"MyDate BETWEEN @startDate AND @endDate" is more efficient than "WHERE MyDate >= @startDate AND MyDate <= @endDate".
LIKE is faster than WHERE clauses using LEFT, RIGHT, SUBSTRING, etc.
October 12, 2004 at 3:27 am
Some agreements and some disagreements
"Not equal" operators by their nature never use any indexes; they always force a table scan. That's why they should be avoided if possible.
There is no difference between "MyDate BETWEEN @startDate AND @endDate" and "WHERE MyDate >= @startDate AND MyDate <= @endDate".
BETWEEN is just shorthand for ">= AND <=", and the optimizer will convert WHERE col1 BETWEEN 'x' AND 'y' into WHERE col1 >= 'x' AND col1 <= 'y'.
LIKE is an operator; LEFT, RIGHT, SUBSTRING, etc. are functions. When you use a function on a column that has an index on it, the index can't be used - thus a function causes table scans instead of index seeks.
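The operator-versus-function distinction is easy to see in an execution plan. The sketch below uses SQLite's EXPLAIN QUERY PLAN as a stand-in for Query Analyzer (the thread is about SQL Server), with an invented table `t` and index `ix_name`; the same principle applies in T-SQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (name TEXT)")
cur.execute("CREATE INDEX ix_name ON t (name)")
# SQLite-specific: lets the engine rewrite LIKE 'x%' as an index range seek.
cur.execute("PRAGMA case_sensitive_like = ON")

# LIKE with a literal prefix is sargable: the index on name can be used.
like_plan = cur.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM t WHERE name LIKE 'abc%'").fetchall()
# Wrapping the column in a function hides it from the optimizer: full scan.
fn_plan = cur.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM t "
    "WHERE substr(name, 1, 3) = 'abc'").fetchall()

print(like_plan[0][-1])  # e.g. a SEARCH ... USING ... INDEX step
print(fn_plan[0][-1])    # e.g. a SCAN step over the whole table
```

The T-SQL analogue is `WHERE name LIKE 'abc%'` (index seek) versus `WHERE LEFT(name, 3) = 'abc'` (scan).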
/Kenneth
October 12, 2004 at 3:35 am
Kenneth, try using BETWEEN on a column with a clustered index and then comparing the > and < method.
October 12, 2004 at 5:43 am
That's what I did. Both methods produce identical query plans, and both also produce the same search argument. Even though you write WHERE col1 BETWEEN 'x' AND 'y', the argument actually executed is WHERE col1 >= 'x' AND col1 <= 'y'.
AFAIK, there is no 'real' difference between the two but syntax - BETWEEN requires fewer keystrokes. Having said that, I'll admit that I've only done some very quick testing on this - there may indeed be 'odd' circumstances where the two produce different plans, but I don't know of any.
/Kenneth
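Kenneth's quick test is easy to reproduce. Here is an illustrative version using SQLite's EXPLAIN QUERY PLAN instead of Query Analyzer (the table `orders` and index `ix_date` are invented for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE orders (order_date TEXT)")
cur.execute("CREATE INDEX ix_date ON orders (order_date)")

# Same predicate written two ways, on an indexed column.
between_plan = cur.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders "
    "WHERE order_date BETWEEN '2004-01-01' AND '2004-01-31'").fetchall()
range_plan = cur.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders "
    "WHERE order_date >= '2004-01-01' AND order_date <= '2004-01-31'").fetchall()

print(between_plan == range_plan)  # True: the optimizer treats them identically
```

The optimizer normalizes BETWEEN into the >= / <= pair before planning, so the two forms cannot diverge here; whether some edge case exists in a given SQL Server version is, as Kenneth says, an open question.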
October 12, 2004 at 10:14 am
If you can fashion a delete that makes use of your clustered index, and run your deletes against small sets of data, perhaps on a single page, the impact to the rest of the table will be minimized. Running deletes against larger sets of dispersed records will have a significant impact on table access and performance.
I had an opportunity to implement a very tight clustered index (including CODE, day, month, and year fields). This reduced the affected record sets to a single page, and we were able to delete records with almost no impact on the table.
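The batching idea above can be sketched as a loop that deletes small, key-ordered slices until nothing matches. This is an illustration in Python/sqlite3 rather than T-SQL, and the table name, flag column, and batch size are all made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE history (id INTEGER PRIMARY KEY, expired INTEGER)")
# Hypothetical data: the first 250 of 1000 rows are flagged for deletion.
cur.executemany("INSERT INTO history VALUES (?, ?)",
                [(i, 1 if i < 250 else 0) for i in range(1000)])

batch = 100  # keep each delete small and contiguous on the clustering key
deleted = 0
while True:
    cur.execute("DELETE FROM history WHERE id IN "
                "(SELECT id FROM history WHERE expired = 1 "
                " ORDER BY id LIMIT ?)", (batch,))
    if cur.rowcount == 0:
        break
    deleted += cur.rowcount
    conn.commit()  # commit between batches to release locks and cap log growth

print(deleted)  # 250: removed in batches of 100, 100, and 50
```

On SQL Server 2000 the equivalent loop is usually written with SET ROWCOUNT n before the DELETE and a WHILE @@ROWCOUNT > 0 test; the effect is the same, each pass touches only a narrow slice of the table.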
Good luck
dmb
October 12, 2004 at 1:22 pm
Thanks to all.