October 17, 2011 at 8:05 am
Hi,
I have a scenario which generates a question for me and if it gets answered, it can really help me and many others too. I will write a detailed scenario for a simple thing to explain how do I interpret things at this level.
There is a normal delete query:
Delete from ScheduledTasks_TaskLog
where DateGenerated <= DateAdd(day, -30, GetDate())
I am posting the DDL for the table, its indexes(2 total,1 clustered by default and 1 non clustered) and the current execution plan. I know it can be enhanced but for now, please give me your valuable time to answer the following questions:
1) The first rightmost operator is the clustered index scan. Because the optimiser did not find a suitable index on 'dategenerated' it scanned the whole of the clustred index to find the data required.
Q- Then while moving slowly to the left, I see a 'Top' operator. What function or role does it play here?
2) now, I move to the left side again and find a 'clustered index delete' . My understanding is that in a table with clustered index, data actually lives in the leaf levels so it performs the delete operation here. when I hover my mouse over this operator, in the output list I see 4 columns- TaskID, TaskName,TaskLastRuntime and MachineName.
Q- What does it mean actually? And is it not showing other columns in the table?
3) I moved left and encountered a 'Sort Operator' whose output list contains 4 columns talked about above.
Q- Why does this operator comes into picture and why is it only showing 4 columns. What happens to rest of the table columns.
4) The last operator on the left side is Non-clustered index delete
Q- What function does it perform now since we already deleted data from the clustered index where the actual data resides.
Apologize for simple questions but this will help a lot of people like me who get confused with execution plans.
Regards
Chandan
October 17, 2011 at 11:20 am
Any volunteers to help please!!
October 17, 2011 at 11:46 am
I will answer your last question. After deleting the data from the table, SQL Server also needs to update any non-clustered indexes that reference the data deleted.
October 17, 2011 at 12:39 pm
Thanks Lynn.
Do you know where Gail(Gilamonster), Grant F, Ninja, Jeff Modem, Celco and other great folks are hiding. I need an expert eye because I do not have anyone in my local circle to answer such questions for me.
Regards
Chandan
October 17, 2011 at 12:39 pm
A few of your cases here are quite standard. What we can see, is that SQL Server query optimizer is missing greatly on the estimated number of rows. This can normally be caused by one of two factors: Statistics being out of date, or use of functions in the condition (where clause). My guess is, if you create a variable, assign dateadd(day,-30,getdate()) to it, and use this variable in the where clause, the query optimizer will have a scalar to check against statistics, and do a much greater job. Furhtermore, there is no (nonclustered) index on the DateGenerated column, which could speed up the search for rows to delete, but slow down inserts, updates and deletes.
As for the top operator, I would not be all that worried when I see the cost.
The clustered index delete is the actual deletion of the data. Since you have a nonclustered index referencing this index, the corresponding non clustered index entries must be deleted as well. SQL Server estimates that it will be most efficient to sort the rows to delete by the index key, and look up each entry in the nonclustered index, as opposed to performing a non clustered index scan. Hence the sort of the data.
October 17, 2011 at 12:45 pm
Thank you. I know I need a non-clustered index here to speed up the performance. I am basically not worried for performance as of now. If you read my questions, you will find that my doubts are valid.
For the discussion, I will put the plan once again but for now, just by looking at the simple execution plan, I am waiting for some volunteer to answer the questions if possible.
Regards
Chandan
October 17, 2011 at 12:55 pm
Trying to explain a bit more regarding the index deletes: For the clustered index delete you see the columns used for looking up the keys to delete, and the columns used for looking up in non clusterd indexes afterwards (non clustered key columns). For non clustered indexes, you only see the key columns. The reason for this is fairly simple: Use as little memory and CPU resources as possible. SQL Server does only use the columns it needs for looking up the entries it has to delete.
Please let me know if something (else than the top operator) is unclear. I guess the rest is covered by now.
October 17, 2011 at 1:00 pm
If you look at the top operator, you can see that "Is Row Count" is true. This top operator is nothign more than a rowcount. Thanks to SQL Server 2008 Internals for giving me a hint about this.
Update:
A bit of further information: http://blogs.msdn.com/b/craigfr/archive/2007/07/25/rowcount-top.aspx
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply