March 18, 2008 at 11:43 am
I think the biggest benefit for many people is that CTEs allow them to write complex SQL in an easier way. Moving to derived tables just seems harder for many people writing queries.
A CTE can be written first: write the part of the SQL you need on its own, then, once you enclose it in the CTE syntax, reference it like any other table or view, without having to create those objects.
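As a minimal sketch of that workflow (the table and column names here are hypothetical, just for illustration): a standalone SELECT is written and tested first, then wrapped in the CTE syntax and joined like any table or view.

```sql
-- Step 1: write and test the query on its own
-- (dbo.Orders is a hypothetical table)
--   SELECT CustomerID, SUM(TotalDue) AS TotalSpent
--   FROM dbo.Orders
--   GROUP BY CustomerID

-- Step 2: wrap it in a CTE and use it like any other table or view
;WITH CustomerTotals AS
(
    SELECT CustomerID, SUM(TotalDue) AS TotalSpent
    FROM dbo.Orders
    GROUP BY CustomerID
)
SELECT c.CustomerName, ct.TotalSpent
FROM dbo.Customers c
INNER JOIN CustomerTotals ct
    ON ct.CustomerID = c.CustomerID;
```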
However, as someone who's been writing T-SQL for over a dozen years, I'm not sure it's that much easier for me. It is slightly clearer in some cases, but for the most part I find relatively few queries where I'd use it. I think that's an experience thing.
March 18, 2008 at 12:11 pm
This was a simple, clear, and well written article. The subject matter is a bit basic for someone who is familiar with the new features in 2005, but it is an excellent introduction to CTEs.
---
Timothy A Wiseman
SQL Blog: http://timothyawiseman.wordpress.com/
March 18, 2008 at 12:18 pm
69.5% of the total batch cost for the derived table method according to the execution plan, but only 31.8% of the runtime (according to the getdate() queries I inserted at various steps)... I'm confused. I'd assumed that cost and runtime were directly correlated unless there were competing processes, but it seems to occur consistently, regardless of which query is first in the batch.
Edit: There is a competing load going on, which is probably what threw off how the cost compares to the execution time due to the competition for CPU, RAM and I/O.
March 18, 2008 at 12:44 pm
Does anyone have a good example of using a CTE to rollup assembly costs in a nested Bill of Materials? It seems to me that this should be done from the bottom of the assembly upwards. Most examples I've found deal with rolling up salaries for employees in an organization. The difference with a BOM is that an assembly can be part of multiple assemblies so there isn't just one path in the hierarchy like there is with an employee in an organization.
March 18, 2008 at 1:42 pm
jdoconsulting (3/18/2008)
Does anyone have a good example of using a CTE to rollup assembly costs in a nested Bill of Materials? It seems to me that this should be done from the bottom of the assembly upwards. Most examples I've found deal with rolling up salaries for employees in an organization. The difference with a BOM is that an assembly can be part of multiple assemblies so there isn't just one path in the hierarchy like there is with an employee in an organization.
Not sure how you would do that. If you have 2-inch #10 bolts as one of your assembly parts, how do you roll that up? It might be used in hundreds of different end products.
On BoM hierarchies, I usually roll them down. Start at the final product, or the sub-assembly that you want data for, and go down from there.
If you really want to go from the bottom up, you'd just reverse the join from the Books Online example. Instead of joining the Parent ID in the table to the ID in the CTE, join the ID in the table to the Parent ID in the CTE. That gives you a bottom-up hierarchy.
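A hedged sketch of that reversed join (the BillOfMaterials table here, with AssemblyID/ComponentID/Quantity columns, is a hypothetical schema, not the actual Books Online example): start from one part and walk up to every assembly that uses it.

```sql
-- Hypothetical BOM table: each row says ComponentID is used in AssemblyID.
-- Walk upward from one part to every assembly that directly or
-- indirectly contains it.
DECLARE @PartID INT;
SET @PartID = 42; -- e.g. the 2-inch #10 bolt

;WITH BomUp AS
(
    -- Anchor: rows where the part itself is the component
    SELECT bom.AssemblyID, bom.ComponentID, bom.Quantity, 1 AS BomLevel
    FROM dbo.BillOfMaterials bom
    WHERE bom.ComponentID = @PartID

    UNION ALL

    -- Recursive step: join the table's ComponentID to the CTE's
    -- AssemblyID -- the reverse of the usual top-down join
    SELECT bom.AssemblyID, bom.ComponentID, bom.Quantity, bu.BomLevel + 1
    FROM dbo.BillOfMaterials bom
    INNER JOIN BomUp bu
        ON bom.ComponentID = bu.AssemblyID
)
SELECT AssemblyID, ComponentID, Quantity, BomLevel
FROM BomUp;
```

Note that a part used in multiple assemblies simply produces multiple rows, one per path up the hierarchy, which is exactly the multi-parent behavior that distinguishes a BOM from an employee org chart.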
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
March 18, 2008 at 1:48 pm
srienstr (3/18/2008)
69.5% of the total batch cost for the derived table method according to the execution plan, but only 31.8% of the runtime (according to the getdate() queries I inserted at various steps)... I'm confused. I'd assumed that cost and runtime were directly correlated unless there were competing processes, but it seems to occur consistently, regardless of which query is first in the batch.
Edit: There is a competing load going on, which is probably what threw off how the cost compares to the execution time due to the competition for CPU, RAM and I/O.
Estimated Cost does not necessarily equal Percentage of Runtime
It sometimes does, and sometimes is close, but not always. Cost isn't just an estimate of how long something will take. It's also an estimate of how much CPU time it will take, how many I/O cycles, how many mathematical computations, how many string functions, etc. It's meant to be a numeric representation of how much effort the server is going through to accomplish something. If it were meant to correlate directly to time, it would be an estimated execution time field, not an estimated cost.
As an aside, you're better off using "set statistics time on" than adding "select getdate()" clauses to your query. It's more accurate, since the select getdate() query can itself add time to the runtime of the query.
March 18, 2008 at 2:46 pm
And then you better make sure you run the code more than once, even on a quiet system...
DECLARE @Year INT
SET @Year = 2008

SET STATISTICS IO ON
SET STATISTICS TIME ON

;WITH cteDates AS
(
    SELECT DATEADD(yy, @Year-1900, 0) + Number AS TheDate
    FROM Master.dbo.spt_Values
    WHERE Type = 'P'
    AND Number <= 366
)
SELECT cd.TheDate, DATEPART(qq, cd.TheDate) AS TheQuarter
FROM cteDates cd
WHERE YEAR(cd.TheDate) = @Year

SET STATISTICS TIME OFF
PRINT REPLICATE('=', 100)
SET STATISTICS TIME ON

SELECT DATEADD(yy, @Year-1900, 0) + Number AS TheDate,
       DATEPART(qq, DATEADD(yy, @Year-1900, 0) + Number) AS TheQuarter
FROM Master.dbo.spt_Values
WHERE Type = 'P'
AND Number <= 366
AND YEAR(DATEADD(yy, @Year-1900, 0) + Number) = @Year

SET STATISTICS TIME OFF
SET STATISTICS IO OFF
--Jeff Moden
Change is inevitable... Change for the better is not.
March 18, 2008 at 3:03 pm
Jeff Moden (3/18/2008)
And then you better make sure you run the code more than once, even on a quiet system...
I did three runs in one order, and two runs with the order switched; the % cost matched, though the % time did vary slightly. The issue that threw me off was that an I/O-intensive load was running at the same time, so the I/O portion of the costs was actually taking much more time relative to the CPU and RAM portions.
Then again, it may just be that the cost formula is not properly balanced for our server for some reason that isn't apparent to me.
March 18, 2008 at 3:15 pm
Did you look at the "Message" window? Sometimes one or the other runs in half the time of the other. Some folks look at something like that on just one run and try to take it to the bank, thinking (say, when writing company SQL standards) that one or the other is better... a bad mistake a lot of times.
March 19, 2008 at 12:46 pm
A script I posted on another thread in this forum illustrates why I like the layout and readability of CTEs.
The thread is at http://www.sqlservercentral.com/Forums/Topic470379-338-1.aspx
The script is:
;with
CTE1 (ID, Position, Val) as
(
    select id, numbers.number, substring(reverse(rtrim(clean)), numbers.number, 1)
    from dbo.numbercleanb
    inner join dbo.numbers
        on numbers.number between 1 and len(clean)
),
CTE2 (ID, Pos, Val) as
(
    select id, row_number() over (partition by id order by position), val
    from cte1
    where val like '[0-9]'
),
CTE3 (ID, Number) as
(
    select id, sum(power(cast(10 as bigint), cast(pos-1 as bigint)) * cast(val as bigint))
    from cte2
    group by id
)
select count(*)
from cte3
inner join dbo.numbercleanb
    on cte3.id = numbercleanb.id
    and cte3.number != numbercleanb.number
A "derived tables" version might look like:
select count(*)
from
(
    select id,
           sum(power(cast(10 as bigint), cast(pos-1 as bigint)) * cast(val as bigint)) as number
    from
    (
        select id, row_number() over (partition by id order by position) as pos, val
        from
        (
            select id, numbers.number as position,
                   substring(reverse(rtrim(clean)), numbers.number, 1) as val
            from dbo.numbercleanb
            inner join dbo.numbers
                on numbers.number between 1 and len(clean)
        ) Sub1
        where val like '[0-9]'
    ) Sub2
    group by id
) Sub3
inner join dbo.numbercleanb
    on sub3.id = numbercleanb.id
    and sub3.number != numbercleanb.number
The reasons I find the CTEs more readable are:
A) I don't have to keep increasing the indentation as I increase the number of levels of subquery. This means I'm less likely to have to scroll left and right on the screen.
B) The column names in the derived tables are less visible than those in the CTEs.
C) If I need to debug or modify part of the query, the CTEs are self-contained and can be modified more modularly.
D) Each query, including the final/outer query, is all in one place. In the derived-tables version, part of each subquery sits above the inner subqueries and part below. That's forced because each subquery needs its own Select, From, and Where, and From comes before Where. With CTEs this isn't necessary, so each query stays in one place.
March 19, 2008 at 1:01 pm
GSquared,
Good example of how CTEs can help make some code easier to read and/or maintain.
😎
March 26, 2008 at 1:45 pm
Thanks, Kirk for a well-written explanation of CTEs.
Great discussion, too. Thanks!