November 9, 2011 at 1:27 am
Friends,
I am working on a reporting query and it has 5 CTEs build inside it. When we talk about performance as a whole, should I work on individual CTEs to find what they do?
Does SQL Server calculates all CTEs first then use them further or do they look at the overall procedure.
For example. Lets say a CTE named as cost1_CTE has a where clause at 'startdate'
then later in the procedure, another CTE uses called cost2_CTE make a join with cost1_CTE on another column. Now the question is, does SQL Server calculates both CTEs individually first or is it smart enough to find and calculate results on all the filters at once.
Kindly provide me with some directions.
Thanks
Chandan Jha
November 9, 2011 at 2:43 am
A normal (non-recursive) CTE is just a logical definition, like an (unindexed) view. The results from a CTE are not materialized; the optimizer considers the query as a whole (with the CTEs expanded like views are).
November 9, 2011 at 3:31 am
SQL Kiwi (11/9/2011)
A normal (non-recursive) CTE is just a logical definition, like an (unindexed) view. The results from a CTE are not materialized; the optimizer considers the query as a whole (with the CTEs expanded like views are).
Thanks Paul. I am noticing that the operation on this procedure is very slow even when the number of io operations are not too high after proper indexing. How to know where is the bottleneck here. I apologise because this is a question where several books are written upon, but in this case, I am not seeing any 'worktables' so need to find where exactly is this happening.
Regards
chandan
November 9, 2011 at 11:52 am
You can do a few things ...
set statistics io on;
This option will allow you to see where the IO's are happening. It can help in many cases to troubleshoot the query, the indexes or both.
But the more important thing would be to get comfortable looking at estimated and actual execution plans. That is where you will get tons of information about where the query is bogging down. Many times cte's are used incorrectly in such a way as to cause the underlying query within the cte to be executed many times. This is the kind of thing you would be able to spot in an execution plan.
November 11, 2011 at 7:03 am
Thanks BT and Paul. Just an update for anyone who is interested.
My original CTE query was taking 5 minutes to run.
then i added an index for a table which was causing 77 million+ reads, it brought down the time to 50 seconds.
Then I got rid of all the CTEs and simply created temp tables on the fly and joined them with same business logics as CTEs and now the execution time is 15 seconds.
I am not sure why this happened? Anyone has any advice for me?
Regards
chandan
November 11, 2011 at 7:50 am
chandan_jha18 (11/11/2011)
Then I got rid of all the CTEs and simply created temp tables on the fly and joined them with same business logics as CTEs and now the execution time is 15 seconds.
Could be one of a number of reasons. Materializing intermediate results in a temporary table won't always be faster than writing the query with a CTE or subquery, but it can be.
It's a trade-off: the downside is the cost of creating the table, writing the rows, and creating/maintaining any indexes. The upside is the temporary table can have indexes and statistics, and can sometimes be optimized in ways CTEs and subqueries currently cannot be. One example of that is where your query references the CTE more than once.
A detailed explanation would require a look at the slow and fast actual execution plans, but I hope the general information above is useful. The important lesson is not to regard one technique as automatically superior to the other; they both have strengths and weaknesses.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply