Could I eliminate this cursor?

Question

Post reply

Could I eliminate this cursor?

bluepaperbags

SSCommitted

Points: 1738
More actions
March 28, 2008 at 6:13 pm

#120612

I have this stored procedure and I like to eliminate the cursor if possible. Down at the bottom is what I have so far. Not sure if it's right.
ALTER PROCEDURE dbo.sp_UPDATE_ALPHA_INDEX_PARTY
@start_date datetime,
@end_date datetime
AS
set nocount on
DECLARE @instrument_id decimal(28, 0)
DECLARE @party_id decimal(28, 0)
DECLARE @party_data varchar(600)
DECLARE @last_inst_id decimal(28, 0)
DECLARE @sort_order int
TRUNCATE TABLE ALPHA_INDEX_PARTY_DATA
SET @sort_order = 0
SET @last_inst_id = 0
DECLARE main_Cursor CURSOR STATIC FOR
SELECT I.INSTRUMENT_ID, P.PARTY_ID, '('
+ CASE P.PARTY_TYPE WHEN 1 THEN 'D' WHEN 4 THEN 'I' ELSE 'N' END + ') '
+ P.LAST_NAME + CASE WHEN P.FIRST_NAME IS NOT NULL AND P.FIRST_NAME <> '' THEN ', ' + P.FIRST_NAME ELSE '' END
+ CASE WHEN P.MIDDLE_NAME IS NOT NULL AND P.MIDDLE_NAME <> '' THEN ' ' + P.MIDDLE_NAME ELSE '' END
+ CASE WHEN P.NAME_SUFFIX IS NOT NULL AND P.NAME_SUFFIX <> '' THEN ' ' + P.NAME_SUFFIX ELSE '' END AS PARTY_DATA
FROM INSTRUMENT I JOIN PARTY P ON I.INSTRUMENT_ID = P.INSTRUMENT_ID
WHERE I.RECORDING_DATE BETWEEN @start_date AND @end_date AND I.DELETED_FLAG <> 1
OPEN main_Cursor
FETCH NEXT FROM main_Cursor
INTO @instrument_id, @party_id, @party_data
WHILE @@FETCH_STATUS = 0
BEGIN
IF @instrument_id != @last_inst_id
BEGIN
SET @sort_order = 0
END
WHILE LEN(@party_data) > 0
BEGIN
SET @sort_order = @sort_order + 1
INSERT INTO ALPHA_INDEX_PARTY_DATA (INSTRUMENT_ID, PARTY_ID, SORT_ORDER, PARTY_DATA) VALUES (@instrument_id, @party_id, @sort_order, LEFT(@party_data, 36))
IF LEN(@party_data) > 36
BEGIN
SET @party_data = ' ' + LTRIM(RIGHT(@party_data, LEN(@party_data) - 36))
END
ELSE
BEGIN
SET @party_data = ''
END
END
SET @last_inst_id = @instrument_id
FETCH NEXT FROM main_Cursor
INTO @instrument_id, @party_id, @party_data
END
CLOSE main_Cursor
DEALLOCATE main_Cursor
---------------------------------------------------------------------------------
---------------------------------------------------------------------------------
---------------------------------------------------------------------------------
-- DROP TABLE TEMP1
SET NOCOUNT ON
TRUNCATE TABLE ALPHA_INDEX_PARTY_DATA
SELECT instrument_id
INTO INST_TEMP
FROM INSTRUMENT
WHERE RECORDING_DATE BETWEEN '2007-01-01' and '2007-12-31'
AND DELETED_FLAG <> 1
-- select count(*) from inst_temp
-- select top 1 * from inst_temp
SELECT I.INSTRUMENT_ID,
P.PARTY_ID, '(' + CASE P.PARTY_TYPE WHEN 1 THEN 'D' WHEN 4 THEN 'I' ELSE 'N' END + ') '
+ P.LAST_NAME + CASE WHEN P.FIRST_NAME IS NOT NULL AND P.FIRST_NAME <> '' THEN ', ' + P.FIRST_NAME ELSE '' END
+ CASE WHEN P.MIDDLE_NAME IS NOT NULL AND P.MIDDLE_NAME <> '' THEN ' ' + P.MIDDLE_NAME ELSE '' END
+ CASE WHEN P.NAME_SUFFIX IS NOT NULL AND P.NAME_SUFFIX <> '' THEN ' ' + P.NAME_SUFFIX ELSE '' END AS PARTY_DATA
INTO TEMP1
FROM INST_TEMP I
JOIN PARTY P
ON I.INSTRUMENT_ID = P.INSTRUMENT_ID
-- select top 10 * from temp1
/*
select instrument_id, party_id,
sort_order, len(party_data) as Length, party_data
from GG_ATRECORDING_QAT..ALPHA_INDEX_PARTY_DATA
order by length
*/
INSERT ALPHA_INDEX_PARTY_DATA
SELECT instrument_id, party_id, 1, LEFT(party_data, 36)
FROM TEMP1
WHERE LEN(PARTY_DATA) > 0
INSERT ALPHA_INDEX_PARTY_DATA
SELECT instrument_id, party_id, 2, ' ' + LTRIM(RIGHT(party_data, LEN(party_data) - 36)),
RIGHT(party_data, LEN(party_data) - 36), LEN(party_data) - 36, PARTY_DATA, LEN(PARTY_DATA)
FROM TEMP1
WHERE LEN(PARTY_DATA) > 36

Viewing 15 posts - 1 through 15 (of 27 total)

You must be logged in to reply to this topic. Login to reply

Cory E. SSCertifiable Points: 7076 More actions · Answer 1

I could VERY well be wrong on this, as I am just starting to work with CTE's, but something like this may work...

WITH TEST_CTE_Insert ( INSTRUMENT_ID, PARTY_ID, PARTY_DATA )

AS ( SELECT I.INSTRUMENT_ID, P.PARTY_ID, '('

+ CASE P.PARTY_TYPE WHEN 1 THEN 'D' WHEN 4 THEN 'I' ELSE 'N' END + ') '

+ P.LAST_NAME + CASE WHEN P.FIRST_NAME IS NOT NULL AND P.FIRST_NAME <> '' THEN ', ' + P.FIRST_NAME ELSE '' END

+ CASE WHEN P.MIDDLE_NAME IS NOT NULL AND P.MIDDLE_NAME <> '' THEN ' ' + P.MIDDLE_NAME ELSE '' END

+ CASE WHEN P.NAME_SUFFIX IS NOT NULL AND P.NAME_SUFFIX <> '' THEN ' ' + P.NAME_SUFFIX ELSE '' END AS PARTY_DATA

FROM INSTRUMENT I JOIN PARTY P ON I.INSTRUMENT_ID = P.INSTRUMENT_ID

WHERE I.RECORDING_DATE BETWEEN @start_date AND @end_date AND I.DELETED_FLAG <> 1

)

INSERT INTO ALPHA_INDEX_PARTY_DATA

(INSTRUMENT_ID, PARTY_ID, SORT_ORDER, PARTY_DATA)

VALUES (@instrument_id, @party_id, @sort_order, LEFT(@party_data, 36))

-- Cory

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 2

Cory Ellingson (3/28/2008)
I could VERY well be wrong on this, as I am just starting to work with CTE's, but something like this may work...
WITH TEST_CTE_Insert ( INSTRUMENT_ID, PARTY_ID, PARTY_DATA )
AS ( SELECT I.INSTRUMENT_ID, P.PARTY_ID, '('
+ CASE P.PARTY_TYPE WHEN 1 THEN 'D' WHEN 4 THEN 'I' ELSE 'N' END + ') '
+ P.LAST_NAME + CASE WHEN P.FIRST_NAME IS NOT NULL AND P.FIRST_NAME <> '' THEN ', ' + P.FIRST_NAME ELSE '' END
+ CASE WHEN P.MIDDLE_NAME IS NOT NULL AND P.MIDDLE_NAME <> '' THEN ' ' + P.MIDDLE_NAME ELSE '' END
+ CASE WHEN P.NAME_SUFFIX IS NOT NULL AND P.NAME_SUFFIX <> '' THEN ' ' + P.NAME_SUFFIX ELSE '' END AS PARTY_DATA
FROM INSTRUMENT I JOIN PARTY P ON I.INSTRUMENT_ID = P.INSTRUMENT_ID
WHERE I.RECORDING_DATE BETWEEN @start_date AND @end_date AND I.DELETED_FLAG <> 1
)
INSERT INTO ALPHA_INDEX_PARTY_DATA
(INSTRUMENT_ID, PARTY_ID, SORT_ORDER, PARTY_DATA)
VALUES (@instrument_id, @party_id, @sort_order, LEFT(@party_data, 36))

Mighty close, Cory... the hard part isn't the cursor... the hard part is in that inner While loop where he splits the rows he just assembled... If it's the first split, he uses 36 characters. If the Party_Name is more than 36 characters, the first row uses 36 characters and all the following rows use 5 spaces followed by 31 characters AND creating a sequence number. Easy enough to do with the cursor... set-based needs to beat it for speed... I'm thinking this is a job for my favorite tool, the Tally table... 😉

It would be nice if the OP would post some data like I recommend in the URL in my signature line, though (hint, hint) :w00t:

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Cory E. SSCertifiable Points: 7076 More actions · Answer 3

WOOHOO!! I got a "Mighty Close" by Mr Jeff M himself! Thanks!

As far as the post - When I first read the post, I admit, I did not read it all, and missed that part about 36 char's, split it. When I did, I sort of gave up, and posted what I had.

I agree - Follow the link in Jeff's signature, you will likely get a better result!

Have a good weekend all!

-- Cory

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 4

Cory Ellingson (3/28/2008)
WOOHOO!! I got a "Mighty Close" by Mr Jeff M himself! Thanks!

Gosh, I'm really sorry Cory... I didn't mean to offend...

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

RBarryYoung SSC Guru Points: 143329 More actions · Answer 5

Jeff Moden (3/28/2008)
... If it's the first split, he uses 36 characters. If the Party_Name is more than 36 characters, the first row uses 36 characters and all the following rows use 5 spaces followed by 31 characters...

Actually, Jeff, because there is an LTRIM nested in the center of that AND there are some spaces in there already, sometimes it will skip an extra space. Duplicating that is the really hard part.

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 6

rbarryyoung (3/28/2008)
Jeff Moden (3/28/2008)
... If it's the first split, he uses 36 characters. If the Party_Name is more than 36 characters, the first row uses 36 characters and all the following rows use 5 spaces followed by 31 characters...
Actually, Jeff, because there is an LTRIM nested in the center of that AND there are some spaces in there already, sometimes it will skip an extra space. Duplicating that is the really hard part.

Heh... you know what they say, though...

"The difficult can be done immediately... the impossible takes a little longer" --Army Core of Engineers

In the absence of any real data to test a solution with, here's some code that uses some data from AdventureWorks to solve what I think the problem is. Obviously, the OP will need to do some tweeking to the first cte to make it draw from the real tables and to the final SELECT to change it into an INSERT/SELECT to put things into the real table... but it does solve the problem without declared RBAR and it does have a good amount of hustle to it.

Cory definitely had a good start on this with the idea of using a CTE... and, without ROW_NUMBER in SQL Server 2005, I'd have had to use a temp table to get the SortOrder... as usual, the details are in the comments in the code...

--===== Identify the database to work in (where there's some canned test data

-- that's close to what the OP has

USE AdventureWorks

--===== This uses two cte's and a "Magic Tally Table" query to solve the problem

;WITH

cteNotACursor AS

(--==== Assemble the data kinda like what the OP had but without so many CASE statements

SELECT ROW_NUMBER() OVER (ORDER BY pc.ContactID) AS RowNum,

pc.ContactID,

'(' + CASE pc.EMailPromotion WHEN 1 THEN 'D' WHEN 2 THEN 'I' ELSE 'N' END + ') '

+ pc.LastName + CASE WHEN NULLIF(pc.FirstName,'') IS NOT NULL THEN ', ' ELSE '' END

+ ISNULL(NULLIF(pc.FirstName+' ',''),'')

+ ISNULL(NULLIF(pc.MiddleName+' ',''),'')

+ ISNULL(NULLIF(pc.Suffix+' ',''),'')

+ 'Just some stuff to make the PartyData longer than 36 characters so we can see the split works as in the original code.'

AS PartyData

FROM Person.Contact pc

)

,

cteTally AS

(--==== Replacement for a real Tally table.

SELECT Number AS N

FROM Master.dbo.spt_Values

WHERE Type = 'P'

AND Number BETWEEN 1 AND 600

)

--===== Using the 2 cte's and a bit of Tally table magic to do the splits, solve the problem

SELECT nac.ContactID,

ROW_NUMBER() OVER (PARTITION BY nac.RowNum ORDER BY nac.RowNum,t.N) AS SortOrder,

CASE

WHEN t.N>=37

THEN ' ' + LTRIM(SUBSTRING(PartyData,t.N,31))

ELSE LTRIM(SUBSTRING(PartyData,t.N,36))

END AS Party_Data

FROM cteNotACursor nac,

cteTally t

WHERE t.N <= LEN(nac.PartyData) --This just limits the range of "N" for a bit of additional speed

AND (t.N IN (1,37) --This finds the starting position of the first two splits

OR (t.N > 37 AND (t.N-37)%31 = 0)) --This finds the starting position for all the rest

Now, THAT was fun! 😛

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

RBarryYoung SSC Guru Points: 143329 More actions · Answer 7

OK, here is my entry. I went the pure CTE route, using some of the stuff that Cory started:

ALTER PROCEDURE dbo.sp_UPDATE_ALPHA_INDEX_PARTY

@start_date datetime,

@end_date datetime

AS

set nocount on

TRUNCATE TABLE ALPHA_INDEX_PARTY_DATA

;WITH SORT_CTE_Insert ( InstrumentID, PartyID, SortOrder, PartyData, PartyData_Remainder ) AS

( SELECT I.INSTRUMENT_ID

, P.PARTY_ID

, 0

, ''

, '('

+ CASE P.PARTY_TYPE WHEN 1 THEN 'D' WHEN 4 THEN 'I' ELSE 'N' END + ') '

+ P.LAST_NAME + CASE WHEN P.FIRST_NAME IS NOT NULL AND P.FIRST_NAME <> '' THEN ', ' + P.FIRST_NAME ELSE '' END

+ CASE WHEN P.MIDDLE_NAME IS NOT NULL AND P.MIDDLE_NAME <> '' THEN ' ' + P.MIDDLE_NAME ELSE '' END

+ CASE WHEN P.NAME_SUFFIX IS NOT NULL AND P.NAME_SUFFIX <> '' THEN ' ' + P.NAME_SUFFIX ELSE '' END AS PARTY_DATA

FROM INSTRUMENT I JOIN PARTY P ON I.INSTRUMENT_ID = P.INSTRUMENT_ID

WHERE I.RECORDING_DATE BETWEEN @start_date AND @end_date AND I.DELETED_FLAG <> 1

--this should fix the (IF @instrument_id != @last_inst_id) business

And P.Party_ID = (Select MIN(P2.Party_ID) From Party P2 Where I.INSTRUMENT_ID = P2.INSTRUMENT_ID)

UNION ALL

Select S.InstrumentID

, S.PartyID

, S.SortOrder + 1

, Left(S.PartyData_Remainder, 36)

, Case When Len(S.partyData_Remainder) > 36

THEN ' ' + LTRIM(RIGHT(S.partyData_Remainder, LEN(S.partyData_Remainder) - 36))

ELSE '' END

From SORT_CTE_Insert

Where S.PartyData_Remainder > ''

)

INSERT INTO ALPHA_INDEX_PARTY_DATA (INSTRUMENT_ID, PARTY_ID, SORT_ORDER, PARTY_DATA)

Select InstrumentID, PartyID, SortOrder, PartyData From SORT_CTE_Insert

Where SortOrder > 0

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

RBarryYoung SSC Guru Points: 143329 More actions · Answer 8

Jeff Moden (3/28/2008)
Now, THAT was fun! 😛

I have to agree!

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

RBarryYoung SSC Guru Points: 143329 More actions · Answer 9

Jeff:

Here is my Adventureworks version. It seems to be faster than the Tally Table approach, but I am not sure if they are returning equivalent datasets:

;WITH SORT_CTE_Insert ( PartyID, SortOrder, PartyData, PartyData_Remainder ) AS

( SELECT --I.INSTRUMENT_ID ,

P.ContactID -- P.PARTY_ID

, 0

, Cast('' as varchar(600))

, Cast('('

+ CASE P.EMailPromotion WHEN 1 THEN 'D' WHEN 4 THEN 'I' ELSE 'N' END + ') '

+ P.LastName + CASE WHEN P.FirstName IS NOT NULL AND P.FirstName <> '' THEN ', ' + P.FirstName ELSE '' END

+ CASE WHEN P.MiddleName IS NOT NULL AND P.MiddleName <> '' THEN ' ' + P.MiddleName ELSE '' END

+ CASE WHEN P.SUFFIX IS NOT NULL AND P.SUFFIX <> '' THEN ' ' + P.SUFFIX ELSE '' END as varchar(600))

From Person.Contact P

/* FROM INSTRUMENT I JOIN PARTY P ON I.INSTRUMENT_ID = P.INSTRUMENT_ID

WHERE I.RECORDING_DATE BETWEEN @start_date AND @end_date AND I.DELETED_FLAG <> 1

--this should fix the (IF @instrument_id != @last_inst_id) business

And P.Party_ID = (Select MIN(P2.Party_ID) From Party P2 Where I.INSTRUMENT_ID = P2.INSTRUMENT_ID)

*/

UNION ALL

Select --S.InstrumentID ,

S.PartyID

, S.SortOrder + 1

, Cast(Left(S.PartyData_Remainder, 36) as varchar(600))

, Cast(Case When Len(S.partyData_Remainder) > 36

THEN ' ' + LTRIM(RIGHT(S.partyData_Remainder, LEN(S.partyData_Remainder) - 36))

ELSE '' END as varchar(600) )

From SORT_CTE_Insert S

Where S.PartyData_Remainder > ''

)

--INSERT INTO ALPHA_INDEX_PARTY_DATA (INSTRUMENT_ID, PARTY_ID, SORT_ORDER, PARTY_DATA)

--INSERT INTO ALPHA_INDEX_PARTY_DATA (PARTY_ID, SORT_ORDER, PARTY_DATA)

Select PartyID, SortOrder, PartyData From SORT_CTE_Insert

Where SortOrder > 0

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

Cory E. SSCertifiable Points: 7076 More actions · Answer 10

Jeff Moden (3/28/2008)
Gosh, I'm really sorry Cory... I didn't mean to offend...

Jeff,

I am sorry, I re-read that this morning - I did not mean to sound offended, I am truely honored that I got a kudos from you. I have respected the work you have done here for some time, and it was your RBAR word that got me re-thinking my ways.

Cory

-- Cory

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 11

Cory Ellingson (3/29/2008)
I am sorry, I re-read that this morning - I did not mean to sound offended, I am truely honored that I got a kudos from you. I have respected the work you have done here for some time, and it was your RBAR word that got me re-thinking my ways.

Dang... I hate email/forum posts... I can't always get the "voice inflection" down... I've never been very good at that. Thank you so much for the feedback... I was really worried about offending a good person such as yourself. And, thanks for the awesome compliments Cory.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 12

rbarryyoung (3/29/2008)
Here is my Adventureworks version. It seems to be faster than the Tally Table approach, but I am not sure if they are returning equivalent datasets:

Thanks Barry... I was wondering what the performance of the recursion would be compared to the Tally table... I'll add "mind reader" to your list of talents. 😀

As you said, there is, in fact, a difference but not in the "return"... it's in the inputs. I had added an extra 118 characters to test the over 36 character code and it looks like the following...

+ 'Just some stuff to make the PartyData longer than 36 characters so we can see the split works as in the original code.'

... which made it look pretty bad for the Tally table.

But, if we even up the data between the two runs, the Tally table wins by a fair margin every time. I did 6 runs of each and averaged the run results both with the extra 118 characters and without. Here's the results...

Run Type CPU ms Elapsed Scans per Table Logical Reads Per Table

---------------- ------ ------- --------------- -----------------------

Recursion w/118 6473 11067 2/1 1,016,805 / 558

Tally w/118 1946 6105 1/1/1 68,314 / 558 / 7

Recursion wo/118 1838 2914 2/1 202,527 / 558

Tally w0/118 932 1793 1/1/1 44,766 / 558 / 7

I also think that individual SELECTs are easier to troubleshoot than recursion pairs of SELECTs, but that's just a personal preference. I haven't liked recursion since I saw my first recursive UDF. 😀

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

RBarryYoung SSC Guru Points: 143329 More actions · Answer 13

Jeff:

Yeah, I do not usually like the recursive CTE's either (they make my brain hurt!) but I was afraid that a Numbers/Tally table approach would be invoking a triangular scan on the party_data strings to do their parsing. Of course I know about calculating the offsets in multiples, but that doesn't work if you want to do that extra little space-skip that it does sometime: you have to serialize it somewhere. I haven't gone all the way through yours yet, so I am not yet sure how you did it (plus, I never learned the Sql2005 "OVER" functions too well, so I have to look them up every time).

By the way, what do you get for the Query Plans? Mine is really strange, it shows the recursive versions with a subtree of 0.4837..., and the Tally version with a subtree of 564.48!

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 14

Heh... yeah... I don't like recursion for the same reason you do... I end up sucking my thumb and twiddling my hair when I'm trying to troubleshoot one that doesn't work properly 😛

So far as the execution plan goes... recursion has the exact same problem there that correlated subqueries, triangular joins, Cursors, and While loops have... they only show an analysis of the first execution of the "loop". The Tally table method uses a join, so it shows all the expense. I've had that battle with many, many "DBA's" and "Expert" developers who put all their faith in the comparison of execution plans... I even quit one job because the CTO, the Lead DBA, and the Lead Developer couldn't give up evaluating code performance by comparison of the "% of batch" readings... the just didn't get it.

One of my favorite code snippets to try to teach people that the execution plan means almost nothing insofar as performance predictions go is the following...

SET NOCOUNT ON

--=============================================================================

-- Recursive CTE does the count

--=============================================================================

PRINT '========== Recursive CTE =========='

SET STATISTICS IO ON

SET STATISTICS TIME ON

DECLARE @Top INT

SET @Top = 100000

;WITH cteTally

AS (

SELECT 1 AS N

UNION ALL

SELECT N+1 FROM cteTally WHERE N<@Top

)

SELECT N

INTO #Test1

FROM cteTally

OPTION (MAXRECURSION 0)

SET STATISTICS TIME OFF

SET STATISTICS IO OFF

PRINT REPLICATE('=',80)

GO

--=============================================================================

-- ROW_NUMBER CTE does the count

--=============================================================================

PRINT '========== ROW_NUMBER CTE =========='

SET STATISTICS IO ON

SET STATISTICS TIME ON

DECLARE @Top INT

SET @Top = 100000

;WITH cteTally

AS (

SELECT TOP(@Top) ROW_NUMBER() OVER(ORDER BY sc1.Type) AS N

FROM Master.dbo.spt_Values sc1,

Master.dbo.spt_Values sc2

)

SELECT *

INTO #Test2

FROM cteTally

SET STATISTICS TIME OFF

SET STATISTICS IO OFF

PRINT REPLICATE('=',80)

GO

DROP TABLE #Test1,#Test2

... But when you look at the actual performance...

========== Recursive CTE ==========

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 1 ms.

Table 'Worktable'. Scan count 2, logical reads 600001, physical reads 0,

read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:

CPU time = 3718 ms, elapsed time = 4691 ms.

================================================================================

========== ROW_NUMBER CTE ==========

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 1 ms.

Table 'spt_values'. Scan count 2, logical reads 18, physical reads 0,

read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:

CPU time = 125 ms, elapsed time = 128 ms.================================================================================

... you see that the first query actually takes about 97% of the batch time. If you go back and look at the execution plans, the line widths on the plan for the recursive CTE are very very skinny... that indicates a major performance problem (RBAR :hehe: ) and shows that the 37% of the batch is based on just two rows (just like a bloody cursor!) while the execution plan for the second query is based on all the rows.

Heh... being a bit nasty, recursion sucks and lies at the same time.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)