Recursive CTE: Is it possible to count records without an aggregate function?

Question

Recursive CTE: Is it possible to count records without an aggregate function?

Viewing 8 posts - 31 through 38 (of 38 total)

You must be logged in to reply to this topic. Login to reply

ChrisM@home SSC-Insane Points: 24260 More actions · Answer 1

sgmunson (2/3/2010)
Why? That's not the path this is going to take... The level is only going to be relative to the root level, not to any other value. The hierarchy shown only has 3 levels - the root, the 1st level and the 2nd level. It traversed the hierarchy quite correctly, from my perspective. Also, the root level was designated as level 0, so you have a 0-based setup here. The 3rd level is thus level 2 instead of 3. If I'm somehow confused here, please point out where and why... I'm always eager to learn...
Steve
(aka sgmunson)
:-):-):-)
Chris Morris-439714 (2/3/2010)
The lvl value should be 3 where the parentid value is 3. The windowing function worked fine, but the lvl increment in the recursive part of the CTE didn't.

Heh you know what? You're exactly right! My apologies Steve. I was so bent on finding something wrong that I invented it.

The query works. Rows with parentID 2 & 3 are hierarchy level 2, just as you say, and just as the query outputs.

I was running this against a db set to compat level 100, SQL Server 2008. Will post up service pack info tomorrow.

[font="Arial"]^{Low-hanging fruit picker and defender of the moggies}[/font]

For better assistance in answering your questions, please read this[/url].

Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

Paul White SSC Guru Points: 150468 More actions · Answer 2

Mark-101232 (2/3/2010)
Might be of interest
https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=318084

Also https://connect.microsoft.com/SQLServer/feedback/details/496271/row-number-does-not-number-properly-intermideate-cte-result-set

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

sgmunson SSC Guru Points: 110651 More actions · Answer 3

Interesting... I tried the "simple repro" query that Steve Kass submitted and sure enough, row_number() does not get properly evaluated - it's as if it gets a fresh evaluation from scratch for each row in the iteration. I also agree with Vladimir that "no allowing ROW_NUMBER or other ranking functions in CTE is like going back in time ..."

Here's my most recent attempt at solving the balls/boxes problem recursively, but I know it shouldn't work correctly because the query following EXCEPT should ideally be one that takes every box id from BOX_MATCHES and pairs it with all possible ball ids, and then every ball id from BOX_MATCHES, pairing with all possible box ids. If I could create such a query without more than one reference to BOX_MATCHES, the problem would be solved.

;WITH BOX_MATCHES AS (

SELECT BoxId, MIN(BallId) AS BallId --, MIN(PreferenceId) AS PreferenceId

FROM TC22_Preferences

GROUP BY BoxId

HAVING COUNT(DISTINCT BallId) = 1

UNION ALL

SELECT BoxId, BallId --, PreferenceId

FROM (

SELECT BoxId, BallId, /*PreferenceId,*/ COUNT(TESTER) OVER (PARTITION BY BallId) AS COUNTER

FROM (

SELECT BoxId, BallId, 1 AS TESTER

FROM TC22_Preferences

EXCEPT (

SELECT BoxId, BallId, 1 AS TESTER

FROM BOX_MATCHES

)

) X

) BM

WHERE BM.COUNTER = 1

)

SELECT BOX.BoxName AS Box, BALL.BallName As Ball

FROM BOX_MATCHES BM INNER JOIN TC22_Boxes BOX ON BM.BoxId = BOX.BoxId

INNER JOIN TC22_Balls BALL ON BM.BallId = BALL.BallId

--ORDER BY BM.BoxId

OPTION (MAXRECURSION 0)

See the attachment for the xml execution plan.

Steve

(aka sgmunson)

:-):-):-)

Paul White (2/3/2010)
Mark-101232 (2/3/2010)
Might be of interest
https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=318084
Also https://connect.microsoft.com/SQLServer/feedback/details/496271/row-number-does-not-number-properly-intermideate-cte-result-set

Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 4

Paul White (2/3/2010)
Mark-101232 (2/3/2010)
Might be of interest
https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=318084
Also https://connect.microsoft.com/SQLServer/feedback/details/496271/row-number-does-not-number-properly-intermideate-cte-result-set

Thanks for posting this, Paul, it's made for a very interesting lunchtime.

If you add a third anchor row, like this...

with T(i, rn, lvl) as (

select 0, cast(0 as bigint), 0 -- anchor 1

union all

select 1, 0, 0 -- anchor 2

union all

select 2, 0, 0 -- anchor 3

union all

select i+2, row_number() over (order by NEWID()), lvl+1

from T

where i<10

)

select i, rn, lvl from T

order by i;

... then it makes sense.

Here's the return:

Rowirnlvl

1000

2100

3200

4211

5311

6411

7412

8512

9612

10613

11713

12813

13814

14914

151014

161015

171115

The rows are explained as follows:

Row 1 Anchor row 1

Row 2 Anchor row 2

Row 3 Anchor row 3

Row 4 Anchor row 1, first recursion: single-row source (Row 1) and product

Row 5 Anchor row 2, first recursion: single-row source (Row 2) and product

Row 6 Anchor row 3, first recursion: single-row source (Row 3) and product

Row 7 Anchor row 1 second recursion: single-row source (first recursion, Row 4) and product

Row 8 Anchor row 2 second recursion: single-row source (first recursion, Row 5) and product

Row 9 Anchor row 3 second recursion: single-row source (first recursion, Row 6) and product

Row 10 Anchor row 1 third recursion: single-row source (second recursion, Row 7)

Row 11 Anchor row 2 third recursion: single-row source (second recursion, Row 8)

Row 12 Anchor row 3 third recursion: single-row source (second recursion, Row 9)

and so on.

T as feed to each recursion (I think of it as the "recursion buffer", someone must know what Microsoft call it) only ever contains one row in this query, so ROW_NUMBER() will always return 1. The recursion buffer only contains the results of the last recursion for that anchor, not an accumulation of all recursions.

Cheers

ChrisM

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 5

sgmunson (2/4/2010)
row_number() does not get properly evaluated - it's as if it gets a fresh evaluation from scratch for each row in the iteration

But what if it's correctly "evaluating" only one row?

Steve, I've PM'd you my solution to the puzzle rather than posting it here, don't want to get into trouble again :blush:

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

sgmunson SSC Guru Points: 110651 More actions · Answer 6

Perhaps I misunderstand how this is supposed to work... I would have thought that the objective of allowing recursion is precisely to gather a series of records, and that each iteration through the recursion, by virtue of it's recursive reference, sees the entire set of data so far. If that's not the case, then I can see why there's a problem. However, what if a given iteration adds two rows to the set? Does that mean the next iteration only sees one of those rows, or is it just that each iteration only sees all the rows from previous one?

Steve

(aka sgmunson)

:-):-):-)

Chris Morris-439714 (2/4/2010)
sgmunson (2/4/2010)
row_number() does not get properly evaluated - it's as if it gets a fresh evaluation from scratch for each row in the iteration
But what if it's correctly "evaluating" only one row?
Steve, I've PM'd you my solution to the puzzle rather than posting it here, don't want to get into trouble again :blush:

Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)

sgmunson SSC Guru Points: 110651 More actions · Answer 7

Just to let everyone know, Chris' solution works quite nicely. The key is in realizing that recursive CTE's recursive references only retrieve the data from the last iteration, and NOT the data from any previous iteration, so the net result of most of my previous attempts was inherently doomed. Let's hope Microsoft sees fit to provide a means to get past that limitation. I also think they need to figure out how to allow the aggregate functions as well as GROUP BY, and stop putting so many limitations on recursive CTEs.

Steve

(aka sgmunson)

:-):-):-)

Chris Morris-439714 (2/4/2010)
sgmunson (2/4/2010)
row_number() does not get properly evaluated - it's as if it gets a fresh evaluation from scratch for each row in the iteration
But what if it's correctly "evaluating" only one row?
Steve, I've PM'd you my solution to the puzzle rather than posting it here, don't want to get into trouble again :blush:

Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 8

Paul White (2/3/2010)
Mark-101232 (2/3/2010)
Might be of interest
https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=318084
Also https://connect.microsoft.com/SQLServer/feedback/details/496271/row-number-does-not-number-properly-intermideate-cte-result-set

Hi Paul

This has turned out to be a real can of worms. I couldn't see anything wrong using the example code and data (earlier post in same thread) but figured it was worth a closer look. The attached code shows a simple non-hierarchical recursive CTE and the functional equivalent using a series of CTE's. The outputs of each are entirely consistent across all columns except where windowing functions have been used to attempt to measure the intermediate results. The recursive CTE correctly measures the first result set as having a rowcount of 2, but subsequent result sets, which should all have a rowcount of 2, appear to be incorrectly measured, returning a rowcount of 1.

Here's the code, results are posted underneath.

--------------------------------------------

-- Make some data to play with

--------------------------------------------

DROP TABLE #Sample

CREATE TABLE #Sample (RowID CHAR(1), Number INT)

INSERT INTO #Sample (RowID, Number)

VALUES ('A',1), ('B',2), ('C',3), ('D',4), ('E',5), ('F',5), ('G',6), ('H',7), ('I',8), ('J',9)

--------------------------------------------

-- A simple CTE recursion, no hierarchy

-- Similar to "Join to the next row"

--------------------------------------------

;WITH MyCTE AS (

SELECT

RowID AS sRowID, CAST('' AS CHAR(1)) AS CTERowID,

Number AS sNumber, 0 AS CTEnumber,

0 AS Recurse,

1 AS BufferRowCount,

CAST(1 AS INT) AS BufferRowID

FROM #Sample WHERE Number = 1

UNION ALL

SELECT s.RowID, cte.sRowID,

s.Number, cte.sNumber,

cte.Recurse+1,

COUNT(cte.sNumber) OVER(PARTITION BY 'X'),

CAST(ROW_NUMBER() OVER (ORDER BY s.RowID, cte.sRowID) AS INT)

FROM #Sample s

INNER JOIN MyCTE cte ON cte.sNumber + 1 = s.Number

) SELECT * FROM MyCTE

ORDER BY sRowID, CTERowID

--------------------------------------------

-- Now model this process using regular CTEs

--------------------------------------------

;WITH

CTE1 as (

SELECT RowID AS sRowID, CAST('' AS CHAR(1)) AS CTERowID,

Number AS sNumber, 0 AS CTEnumber,

0 AS Recurse,

1 AS BufferRowCount,

CAST(1 AS INT) AS BufferRowID

FROM #Sample WHERE Number = 1),

CTE2 as (

SELECT s.RowID AS sRowID, c.sRowID AS CTERowID,

s.Number AS sNumber, c.sNumber AS CTEnumber,

Recurse = c.Recurse+1,

BufferRowCount = COUNT(c.sNumber) OVER(PARTITION BY 'X'),

BufferRowID = CAST(ROW_NUMBER() OVER (ORDER BY s.RowID, c.sRowID) AS INT)

FROM #Sample s

INNER JOIN CTE1 c ON c.sNumber + 1 = s.Number),

CTE3 AS (

SELECT s.RowID AS sRowID, c.sRowID AS CTERowID,

s.Number AS sNumber, c.sNumber AS CTEnumber,

Recurse = c.Recurse+1,

BufferRowCount = COUNT(c.sNumber) OVER(PARTITION BY 'X'),

BufferRowID = CAST(ROW_NUMBER() OVER (ORDER BY s.RowID, c.sRowID) AS INT)

FROM #Sample s

INNER JOIN CTE2 c ON c.sNumber + 1 = s.Number),

CTE4 AS (

SELECT s.RowID AS sRowID, c.sRowID AS CTERowID,

s.Number AS sNumber, c.sNumber AS CTEnumber,

Recurse = c.Recurse+1,

BufferRowCount = COUNT(c.sNumber) OVER(PARTITION BY 'X'),

BufferRowID = CAST(ROW_NUMBER() OVER (ORDER BY s.RowID, c.sRowID) AS INT)

FROM #Sample s

INNER JOIN CTE3 c ON c.sNumber + 1 = s.Number),

CTE5 AS (

SELECT s.RowID AS sRowID, c.sRowID AS CTERowID,

s.Number AS sNumber, c.sNumber AS CTEnumber,

Recurse = c.Recurse+1,

BufferRowCount = COUNT(c.sNumber) OVER(PARTITION BY 'X'),

BufferRowID = CAST(ROW_NUMBER() OVER (ORDER BY s.RowID, c.sRowID) AS INT)

FROM #Sample s

INNER JOIN CTE4 c ON c.sNumber + 1 = s.Number),

CTE6 AS (

SELECT s.RowID AS sRowID, c.sRowID AS CTERowID,

s.Number AS sNumber, c.sNumber AS CTEnumber,

Recurse = c.Recurse+1,

BufferRowCount = COUNT(c.sNumber) OVER(PARTITION BY 'X'),

BufferRowID = CAST(ROW_NUMBER() OVER (ORDER BY s.RowID, c.sRowID) AS INT)

FROM #Sample s

INNER JOIN CTE5 c ON c.sNumber + 1 = s.Number),

CTE7 AS (

SELECT s.RowID AS sRowID, c.sRowID AS CTERowID,

s.Number AS sNumber, c.sNumber AS CTEnumber,

Recurse = c.Recurse+1,

BufferRowCount = COUNT(c.sNumber) OVER(PARTITION BY 'X'),

BufferRowID = CAST(ROW_NUMBER() OVER (ORDER BY s.RowID, c.sRowID) AS INT)

FROM #Sample s

INNER JOIN CTE6 c ON c.sNumber + 1 = s.Number),

CTE8 AS (

SELECT s.RowID AS sRowID, c.sRowID AS CTERowID,

s.Number AS sNumber, c.sNumber AS CTEnumber,

Recurse = c.Recurse+1,

BufferRowCount = COUNT(c.sNumber) OVER(PARTITION BY 'X'),

BufferRowID = CAST(ROW_NUMBER() OVER (ORDER BY s.RowID, c.sRowID) AS INT)

FROM #Sample s

INNER JOIN CTE7 c ON c.sNumber + 1 = s.Number),

CTE9 AS (

SELECT s.RowID AS sRowID, c.sRowID AS CTERowID,

s.Number AS sNumber, c.sNumber AS CTEnumber,

Recurse = c.Recurse+1,

BufferRowCount = COUNT(c.sNumber) OVER(PARTITION BY 'X'),

BufferRowID = CAST(ROW_NUMBER() OVER (ORDER BY s.RowID, c.sRowID) AS INT)

FROM #Sample s

INNER JOIN CTE8 c ON c.sNumber + 1 = s.Number)

SELECT * FROM CTE1

UNION ALL

SELECT * FROM CTE2

UNION ALL

SELECT * FROM CTE3

UNION ALL

SELECT * FROM CTE4

UNION ALL

SELECT * FROM CTE5

UNION ALL

SELECT * FROM CTE6

UNION ALL

SELECT * FROM CTE7

UNION ALL

SELECT * FROM CTE8

UNION ALL

SELECT * FROM CTE9

Results for recursive CTE:

s_RowIDCTE_RowIDs_NumberCTE_numberRecurseBuffer_RowCountBuffer_RowID

A 10011

BA21111

CB32211

DC43311

ED54421

FD54422

GE65511

GF65511

HG76611

IH87711

JI98811

Results from a series of CTE's:

sRowIDCTERowIDsNumberCTEnumberRecurseBufferRowCountBufferRowID

A 10011

BA21111

CB32211

DC43311

ED54421

FD54422

GE65521

GF65522

HG76621

HG76622

IH87721

IH87722

JI98821

JI98822

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden