December 23, 2010 at 10:56 am
K - thanks for the response. I think you a rebuild of that index will make a substantial improvement.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
December 23, 2010 at 5:53 pm
Roger Sabin (12/23/2010)
There are no triggers on the table.I did not time the actual deletes but when 10 single deletes finish quickly and a single delete for 10 rows does not, I think it was obvious which one was faster.
I have not had time to rebuild the clustered index and try deleting again. I went with the cursor deletes because I needed to get something done now and it was accomplishing the deletes in a reasonable amount of time.
I think the problem might be a little "Halloweening". You have the following code example...
delete from tblDocument where DocumentID in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
DELETE, like UPDATE, actually has 2 FROM clauses in SQL Server and, if the indexes are just right and you have parallelism and a couple of other conditions happen just right (they WILL be repeatable on the given table on the given machine but may not on another machine), it could be doing a really nasty behind-the-scenes "cursor" that traverses the entire table for each row identified in the query.
Just for grins, try this instead...
DELETE tgt
FROM tblDocument tgt
WHERE tgt.DocumentID IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
--Jeff Moden
Change is inevitable... Change for the better is not.
December 24, 2010 at 8:39 am
1) I would like to see the query plans for the 10 row IN delete and also one of the single documentid deletes.
2) are there any foreign keys present? you didn't post all indexes so I wonder if you left out FKs as well.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
December 25, 2010 at 7:32 pm
I might have missed it, but I didn't see a definition for the cdm_data table yet. I've used this for my test:
create table dbo.cdm_data (
ApplicationOrg varchar(3) not null,
ApplicationType varchar(3) not null,
ApplicationNumber varchar(13) null
);
Having the proper table definition would make the discussion easier, but I can do some guesses without it already.
From the looks of your queries, your tblDocument Number column is always filled by 19 character codes: 1st three characters represent the application, 2rd 3 represent the document type and remaining 13 characters represent some document identification. Your queries will probably better benefit from the indexes if you construct the document number to search for from the cdm_data instead of trying to shred the document number into cdm_data components. Then, a second thing that struck me as odd in your queries is why is cdm.ApplicationNumber required to be NULL and at the same time part of the non-null document Number, i.e. non-null? Third point of improvement is the date manipulation: to strip off the time component of a datetime value you'd better use dateadd(day, datediff(day, 0, getdate()), 0) instead of the string conversion. Again this can help the query optimizer in selecting the proper indexes.
delete tblDocument
from tblDocument as d
left outer join cdm_data as cdm
on cdm.ApplicationOrg = left(d.Number, 3)
and cdm.ApplicationType = substring(d.Number, 4, 3)
and cdm.ApplicationNumber = right(d.Number, 13) -- <== ApplicationNumber is non-NULL
where d.Category = 'APPLICATION'
and d.CreatedDate < convert(varchar(10), getdate(), 101)
and cdm.ApplicationNumber is NULL -- <== ApplicationNumber is NULL
Assuming my 19 character document Number assumption is correct I would suggest to do something like below query to make the query on tbldocument.Number sarg-able. I've left off the where ApplicationNumber is NULL clause, I will come back on that later:
delete tblDocument
from tblDocument as d
left outer join cdm_data as cdm on (convert(char(3),cdm.ApplicationOrg) + convert(char(3),cdm.ApplicationType) + convert(char(13),cdm.ApplicationNumber) = d.Number)
where d.Category = 'APPLICATION'
and d.CreatedDate < dateadd(day, datediff(day, 0, getdate()), 0)
--and cdm.ApplicationNumber is NULL
This should perform faster, plus it is also more robust: if a malformed document number is encountered it will not be improperly matched. For example: Number = "abc1234567890123" would match "abc", "123", "1234567890123" in the original query, even though it is actually 3 characters short.
As you've probably seen, the left outer join to cdm_data could just as well be left of at all in the above query. It may even be that the optimizer already eliminates the table from the query. So why was the cdm_data table in the query? My guess is that it was added to check for non-existence in this table. In that case the "and cdm.ApplicationNumber is NULL" must have been added to make the left outer join act as a not-exists construction. We really need the definition for the cdm_data table to determine why it performs badly; my guess is that column ApplicationNumber is not in an index or not the first column in an index. For readability it is better to rewrite the query using "not exists". Left outer join with an is null where clause is only in very specific cases faster, but not exists is always better readable:
delete d
from tblDocument as d
where d.Category = 'APPLICATION'
and d.CreatedDate < dateadd(day, datediff(day, 0, getdate()), 0)
and not exists (
select *
from cdm_data as cdm
where convert(char(3),cdm.ApplicationOrg) + convert(char(3),cdm.ApplicationType) + convert(char(13),cdm.ApplicationNumber) = d.Number
)
Viewing 4 posts - 16 through 18 (of 18 total)
You must be logged in to reply to this topic. Login to reply