Linking to the Previous Row

Question

Linking to the Previous Row

Viewing 15 posts - 46 through 60 (of 147 total)

You must be logged in to reply to this topic. Login to reply

David McKinney SSChampion Points: 10472 More actions · Answer 1

Aaron Gonzalez (1/23/2009)
Déjà vu...I did add this article to my briefcase last year...
Anyway...nice article, David.

yeah....friday is repeats day!

mahk_josephson Grasshopper Points: 14 More actions · Answer 2

This is a well written, concise article with a very practical application. I've read a bit about CTEs but this is the first time I've seen an example that I have already had to solve the old fashioned way.

peter-757102 SSCertifiable Points: 6877 More actions · Answer 3

A CTE is a temporary result set, which is valid for the scope of a single Select Update Insert or Delete statement.

BEWARE

This claim is false, and proof is included. A CTE is instead incorporated into the SQL statement BEFORE the statement is evaluated, planned for and finally executed. Look at it as an inline view instead of an intermediate result set from an execution point of view.

When you execute:

set statistics io on;

select * from dbo.PriceCompare;

you get the following statistics:

Table 'PriceHistory'. Scan count 23, logical reads 46, physical reads 0, read-ahead reads 0,

lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Items'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0,

lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Now lets create two views of our own just to demonstrate you get identical results:

create view PriceCompareFakeCTE as

select

i.Item

, ph.ItemId

, ph.PriceStartDate

, ph.Price

, row_number() over ( partition by ph.ItemId order by ph.PriceStartDate ) as rownum

from

Items i

inner join PriceHistory as ph on ph.ItemId = i.ItemId

;

go

create view dbo.PriceCompareUsingView as

select

currow.Item as Item

, prevrow.Price as OldPrice

, currow.Price as RangePrice

, currow.PriceStartDate as StartDate

, nextrow.PriceStartDate as EndDate

from

PriceCompareFakeCTE as currow

left join PriceCompareFakeCTE as prevrow on

prevrow.ItemId = currow.ItemId and

prevrow.rownum = currow.rownum - 1

left join PriceCompareFakeCTE as nextrow on

nextrow.ItemId = currow.ItemId and

nextrow.rownum = currow.rownum + 1

;

go

and verify using:

set statistics io on;

select * from dbo.PriceCompareUsingView;

you get the following identical statistics:

Table 'PriceHistory'. Scan count 23, logical reads 46, physical reads 0, read-ahead reads 0,

lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Items'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0,

lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

The query plans as shown in management studio also display the same structure. Neither view or CTE guarantees the same data is not just read a 2nd time if the optimizer deems that to be more efficient. The read data is already in the cache and storing the intermediate result would take up writes to tempdb.

It can hurt performance and break code when you assume a CTE guarantees the result is build only once.

Now the proof

Here the view with build in CTE from the original article, but now with a newid() function in it that is displayed for the previous, current and next table aliases used in the query.

alter view dbo.PriceCompare as

with

PriceCompare as

(

select

i.Item

, ph.ItemId

, ph.PriceStartDate

, ph.Price

, row_number() over (partition by ph.ItemId order by ph.PriceStartDate) as rownum

, newid() as random

from

Items as i

inner join PriceHistory as ph on ph.ItemId = i.ItemId)

select

currow.Item as Item

, prevrow.Price as OldPrice

, currow.Price as RangePrice

, currow.PriceStartDate as StartDate

, nextrow.PriceStartDate as EndDate

, currow.random as curRandom

, prevrow.random as prevRandom

, nextrow.random as nextRandom

from

PriceCompare as currow

left join PriceCompare as prevrow on

prevrow.ItemId = currow.ItemId and

prevrow.rownum = currow.rownum - 1

left join PriceCompare as nextrow on

nextrow.ItemId = currow.ItemId and

nextrow.rownum = currow.rownum + 1

;

go

with the above modified view in place, execute:

select * from dbo.PriceCompare;

Now compare the prevRandom and nextRandom columns in your result to that of the curRandom from the actual previous and next record.

They are different each and every time :), which would not have been the case if the CTE was an intermediate result that is queried multiple times by the joins in the query!

Request to author

Please correct the article as people will assume your claim to be true and code accordingly in the future.

peter-757102 SSCertifiable Points: 6877 More actions · Answer 4

Mazharuddin Ehsan (3/16/2008)
I hope to show in this article that the CTEs usefulness is not purely cosmetic.
You are right david. It would be immensely useful in more complicated situations. I would have to do as follows without the use of CTE in the present example.
CREATE VIEW [dbo].[PriceCompare] AS
SELECT
currow.Item,
prevrow.Price AS OldPrice,
currow.Price AS RangePrice,
currow.PriceStartDate AS StartDate,
nextrow.PriceStartDate AS EndDate
FROM
(SELECT i.Item, ph.ItemId, ph.PriceStartDate, ph.Price,
ROW_NUMBER() OVER (Partition BY ph.ItemId ORDER BY PriceStartDate) AS rownum
FROM Items i INNER JOIN PriceHistory ph
ON i.ItemId = ph.ItemId) currow
LEFT JOIN
(SELECT i.Item, ph.ItemId, ph.PriceStartDate, ph.Price,
ROW_NUMBER() OVER (Partition BY ph.ItemId ORDER BY PriceStartDate) AS rownum
FROM Items i INNER JOIN PriceHistory ph
ON i.ItemId = ph.ItemId) nextrow
ON currow.rownum = nextrow.rownum - 1 AND currow.ItemId = nextrow.ItemId
LEFT JOIN
(SELECT i.Item, ph.ItemId, ph.PriceStartDate, ph.Price,
ROW_NUMBER() OVER (Partition BY ph.ItemId ORDER BY PriceStartDate) AS rownum
FROM Items i INNER JOIN PriceHistory ph
ON i.ItemId = ph.ItemId) prevrow
ON currow.rownum = prevrow.rownum + 1 AND currow.ItemId = prevrow.ItemId
Too clumsy and reiterative as you see. I am looking forward to your input for 'using CTEs with recursive queries'
Best regards,
Maz

My previous post is proof that what you written is exactly as SQL Server sees it after it fuses the CTE with the main query. Yes it is difficult to spot at a glance the inlined queries are identical, and that is the only reason a CTE is better.

David McKinney SSChampion Points: 10472 More actions · Answer 5

A CTE is a temporary result set, which is valid for the scope of a single Select Update Insert or Delete statement.

A CTE is instead incorporated into the SQL statement BEFORE the statement is evaluated, planned for and finally executed. Look at it as an inline view instead of an intermediate result set from a execution point of view.

I get your distinction, which in some cases may be an important one. I haven't looked in detail at your example (as it's already Friday afternoon here and I'm winding down!), but I suspect the resultset is regenerated precisely because of the newid() function. Otherwise it would reuse the data.

But I'd say it's beyond the scope of this article....but not the subsequent discussion! What I mean by that is that the primary goal of the article was to introduce the reader into practical applications for CTEs, by means of a real life example. This can mean simplifying slightly...like when your English teacher says "Never start a sentence with a conjunction." or a colleague might say "Never use cursors". It's good advice until the day when it isn't.

Perhaps I should have phrased it like Books online, who cleverly uses "thought of" as a veracity disclaimer.

"A common table expression (CTE) can be thought of as a temporary result set that is defined within the execution scope of a single SELECT, INSERT, UPDATE, DELETE, or CREATE VIEW statement. " (From BOL)

So I'll leave the article as it is, but thank you for pointing out that this phrase isn't strictly true!

Regards,

David

Bill Coale-478581 SSC Enthusiast Points: 154 More actions · Answer 6

slange (1/23/2009)
Bill Coale (3/13/2008)
Also truncate to midnight the start/end date and any date comparisons to the range
I handle the end date solution a little differently....

Yes, and I'm looking forward to migrating to SQLServer 2008, with the new data type for just the date with no time.

peter-757102 SSCertifiable Points: 6877 More actions · Answer 7

I understand your point of view as well as the theory you put forward as to why it might be different in case of my proof. Therefore I did the query once more on my modified view to see if the statistics and query plan where any different from the original. They were not and that is 100% proof that the original also generates no intermediate results.

The reasons I feel it is important to change:

1. Half the discussion goes about using CTEs and you confirmed it is one of the key ingredients that you build your case upon and is in fact a focus.

2. The article is very easy to correct without negative effects.

If I am right and I think I am, people new to CTE get put on the wrong track from the start. When they think a CTE is a means to get an intermediate result they will code with that in mind. Also, this forum is about learning and sharing knowledge, so it is important the information is as correct as possible. Therefore I hope you will reconsidder after taking a look at the code I posted. There is a case to be made to keep the article as is and add a correction at the bottom or top. This way the following discussion keeps making sense for new readers.

Other then that, enjoy your weekend....I sure will 🙂

bteague SSC Eights! Points: 895 More actions · Answer 8

Nice article. Do you have any recommendation regarding CTEs vs. temporary tables in 2005 or 2008?

onipif SSC Enthusiast Points: 103 More actions · Answer 9

Hello everybody,

I have wrote about a problem that I have and I thinks who will be on the context of this article.

I appreciate your help, thanks 😉 :

http://www.sqlservercentral.com/Forums/Topic642999-362-1.aspx

David McKinney SSChampion Points: 10472 More actions · Answer 10

bteague (1/23/2009)
Nice article. Do you have any recommendation regarding CTEs vs. temporary tables in 2005 or 2008?

2008 I haven't used yet...so can't say.

I think you've got to situate CTEs somewhere between temporary tables and subqueries (a select in a select).

A few points below in no particular order.

1) The most obvious advantage over both a subquery and a temp table is that CTEs are more readable.

2) You can put a CTE in a view (like a subquery but unlike a temp table)

3) You can reference a CTE multiple times in your statement (like a temp table but unlike a subquery.)

4) CTEs are in scope only for the current statement (like a subquery but unlike a temp table.)

One really handy thing about CTEs (which I don't think I mention in the article) is the ability to use multiple CTEs for the same select, and also for one CTE to reference another. Thus you can achieve something really quite complex - where doing the same thing with temp tables or subquery would be an absolute nightmare.

Another point is that with CTEs you don't explicitly define data types, as you usually would with a temp table (unless you "select into".)

As Peter pointed out a few posts ago, a CTE is more akin to a subquery than a temp table, in that it is constructed when it is referenced and not upfront.

As for performance specifically of CTEs vs temp tables, I don't have sufficient knowledge to comment - so I won't! (I know that the rownumber() function isn't the worlds fastest - but that's in no way specific to a CTE.)

Hope this answers at least partially your question.

I'm currently working on a SQL2000 project and I really, really miss my CTEs, so my final word would be one of caution - CTEs are addictive, withdrawal isn't pretty, and there's no decent rehab!

Regards,

David.

Misha_SQL SSCertifiable Points: 5407 More actions · Answer 11

Absolutely loved the article. I am sure everyone had to do something like this at one time or another. Thank you for sharing this technique and nice explanations.

Jeff Moden SSC Guru Points: 1004407 More actions · Answer 12

David McKinney (1/25/2009)
As Peter pointed out a few posts ago, a CTE is more akin to a subquery than a temp table, in that it is constructed when it is referenced and not upfront.
As for performance specifically of CTEs vs temp tables, I don't have sufficient knowledge to comment - so I won't! (I know that the rownumber() function isn't the worlds fastest - but that's in no way specific to a CTE.)

I'll say, be careful... CTE's are (currently) re-executed everytime you call them. Temp Tables only suffer the single execution to build them. Further, large Temp Tables can be indexed the way that you need them... CTE's cannot.

Last, but not least, a CTE is good for a single query. A Temp Table's life span is for the whole proc that made it.

No, I'm not saying to not use CTE's or to use Temp Tables instead. I'm saying, "It Depends". 😉

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Dave R-429118 Newbie Points: 1 More actions · Answer 13

Good article. I tried to use it to get results of calculations for previous year and quarter but not working.

Here is the scenario: I want to calculate CPrice, and PPrice

and display as for Yr = 2002, Qtr = 3, CPrice = (Price/Prod), PPrice = (CPrice of Yr-2002, Qtr-2)

Yr, Qtr, Price, Prod, CPrice, PPrice

2002, 2, 120, 100, 1.2, 0

2002, 3, 150, 50, 3, 1.2

and so on.....

How can I use your example "Linking to the Previous Row"?

David McKinney SSChampion Points: 10472 More actions · Answer 14

Dave R (1/30/2009)
Good article. I tried to use it to get results of calculations for previous year and quarter but not working.
How can I use your example "Linking to the Previous Row"?

The key is that, after the numbering is applied, the rows you want to compare / link are sequential i.e. you want to link row n to row n-1 or n+1. So if you have got one row per quarter, then you should order by year, quarter. If you've got multiple rows per quarter then you either have to filter with a where clause or partition. I can't tell you which - it depends on your situation.

I suggest you adopt a step by step approach as in the example, and try to get all relevant data appearing on the same row.

(It is possible that you don't even need to number - that you already have a numbering system with your year and quarter - but maybe it's easiest to apply a row_number anyway.)

Hope this helps,

Let me know how you get on,

David.

Paul Paiva SSCrazy Points: 2237 More actions · Answer 15

Excellent and clearly written!

I'm always reading CTE articles and this is one of the best, because it explains a situation which is very commonly encountered.

- Paul

http://paulpaivasql.blogspot.com/