Could I eliminate this cursor?

  • I have this stored procedure and I like to eliminate the cursor if possible. Down at the bottom is what I have so far. Not sure if it's right.

    ALTER PROCEDURE dbo.sp_UPDATE_ALPHA_INDEX_PARTY

    @start_date datetime,

    @end_date datetime

    AS

    set nocount on

    DECLARE @instrument_id decimal(28, 0)

    DECLARE @party_id decimal(28, 0)

    DECLARE @party_data varchar(600)

    DECLARE @last_inst_id decimal(28, 0)

    DECLARE @sort_order int

    TRUNCATE TABLE ALPHA_INDEX_PARTY_DATA

    SET @sort_order = 0

    SET @last_inst_id = 0

    DECLARE main_Cursor CURSOR STATIC FOR

    SELECT I.INSTRUMENT_ID, P.PARTY_ID, '('

    + CASE P.PARTY_TYPE WHEN 1 THEN 'D' WHEN 4 THEN 'I' ELSE 'N' END + ') '

    + P.LAST_NAME + CASE WHEN P.FIRST_NAME IS NOT NULL AND P.FIRST_NAME <> '' THEN ', ' + P.FIRST_NAME ELSE '' END

    + CASE WHEN P.MIDDLE_NAME IS NOT NULL AND P.MIDDLE_NAME <> '' THEN ' ' + P.MIDDLE_NAME ELSE '' END

    + CASE WHEN P.NAME_SUFFIX IS NOT NULL AND P.NAME_SUFFIX <> '' THEN ' ' + P.NAME_SUFFIX ELSE '' END AS PARTY_DATA

    FROM INSTRUMENT I JOIN PARTY P ON I.INSTRUMENT_ID = P.INSTRUMENT_ID

    WHERE I.RECORDING_DATE BETWEEN @start_date AND @end_date AND I.DELETED_FLAG <> 1

    OPEN main_Cursor

    FETCH NEXT FROM main_Cursor

    INTO @instrument_id, @party_id, @party_data

    WHILE @@FETCH_STATUS = 0

    BEGIN

    IF @instrument_id != @last_inst_id

    BEGIN

    SET @sort_order = 0

    END

    WHILE LEN(@party_data) > 0

    BEGIN

    SET @sort_order = @sort_order + 1

    INSERT INTO ALPHA_INDEX_PARTY_DATA (INSTRUMENT_ID, PARTY_ID, SORT_ORDER, PARTY_DATA) VALUES (@instrument_id, @party_id, @sort_order, LEFT(@party_data, 36))

    IF LEN(@party_data) > 36

    BEGIN

    SET @party_data = ' ' + LTRIM(RIGHT(@party_data, LEN(@party_data) - 36))

    END

    ELSE

    BEGIN

    SET @party_data = ''

    END

    END

    SET @last_inst_id = @instrument_id

    FETCH NEXT FROM main_Cursor

    INTO @instrument_id, @party_id, @party_data

    END

    CLOSE main_Cursor

    DEALLOCATE main_Cursor

    ---------------------------------------------------------------------------------

    ---------------------------------------------------------------------------------

    ---------------------------------------------------------------------------------

    -- DROP TABLE TEMP1

    SET NOCOUNT ON

    TRUNCATE TABLE ALPHA_INDEX_PARTY_DATA

    SELECT instrument_id

    INTO INST_TEMP

    FROM INSTRUMENT

    WHERE RECORDING_DATE BETWEEN '2007-01-01' and '2007-12-31'

    AND DELETED_FLAG <> 1

    -- select count(*) from inst_temp

    -- select top 1 * from inst_temp

    SELECT I.INSTRUMENT_ID,

    P.PARTY_ID, '(' + CASE P.PARTY_TYPE WHEN 1 THEN 'D' WHEN 4 THEN 'I' ELSE 'N' END + ') '

    + P.LAST_NAME + CASE WHEN P.FIRST_NAME IS NOT NULL AND P.FIRST_NAME <> '' THEN ', ' + P.FIRST_NAME ELSE '' END

    + CASE WHEN P.MIDDLE_NAME IS NOT NULL AND P.MIDDLE_NAME <> '' THEN ' ' + P.MIDDLE_NAME ELSE '' END

    + CASE WHEN P.NAME_SUFFIX IS NOT NULL AND P.NAME_SUFFIX <> '' THEN ' ' + P.NAME_SUFFIX ELSE '' END AS PARTY_DATA

    INTO TEMP1

    FROM INST_TEMP I

    JOIN PARTY P

    ON I.INSTRUMENT_ID = P.INSTRUMENT_ID

    -- select top 10 * from temp1

    /*

    select instrument_id, party_id,

    sort_order, len(party_data) as Length, party_data

    from GG_ATRECORDING_QAT..ALPHA_INDEX_PARTY_DATA

    order by length

    */

    INSERT ALPHA_INDEX_PARTY_DATA

    SELECT instrument_id, party_id, 1, LEFT(party_data, 36)

    FROM TEMP1

    WHERE LEN(PARTY_DATA) > 0

    INSERT ALPHA_INDEX_PARTY_DATA

    SELECT instrument_id, party_id, 2, ' ' + LTRIM(RIGHT(party_data, LEN(party_data) - 36)),

    RIGHT(party_data, LEN(party_data) - 36), LEN(party_data) - 36, PARTY_DATA, LEN(PARTY_DATA)

    FROM TEMP1

    WHERE LEN(PARTY_DATA) > 36

  • I could VERY well be wrong on this, as I am just starting to work with CTE's, but something like this may work...

    WITH TEST_CTE_Insert ( INSTRUMENT_ID, PARTY_ID, PARTY_DATA )

    AS ( SELECT I.INSTRUMENT_ID, P.PARTY_ID, '('

    + CASE P.PARTY_TYPE WHEN 1 THEN 'D' WHEN 4 THEN 'I' ELSE 'N' END + ') '

    + P.LAST_NAME + CASE WHEN P.FIRST_NAME IS NOT NULL AND P.FIRST_NAME <> '' THEN ', ' + P.FIRST_NAME ELSE '' END

    + CASE WHEN P.MIDDLE_NAME IS NOT NULL AND P.MIDDLE_NAME <> '' THEN ' ' + P.MIDDLE_NAME ELSE '' END

    + CASE WHEN P.NAME_SUFFIX IS NOT NULL AND P.NAME_SUFFIX <> '' THEN ' ' + P.NAME_SUFFIX ELSE '' END AS PARTY_DATA

    FROM INSTRUMENT I JOIN PARTY P ON I.INSTRUMENT_ID = P.INSTRUMENT_ID

    WHERE I.RECORDING_DATE BETWEEN @start_date AND @end_date AND I.DELETED_FLAG <> 1

    )

    INSERT INTO ALPHA_INDEX_PARTY_DATA

    (INSTRUMENT_ID, PARTY_ID, SORT_ORDER, PARTY_DATA)

    VALUES (@instrument_id, @party_id, @sort_order, LEFT(@party_data, 36))

    -- Cory

  • Cory Ellingson (3/28/2008)


    I could VERY well be wrong on this, as I am just starting to work with CTE's, but something like this may work...

    WITH TEST_CTE_Insert ( INSTRUMENT_ID, PARTY_ID, PARTY_DATA )

    AS ( SELECT I.INSTRUMENT_ID, P.PARTY_ID, '('

    + CASE P.PARTY_TYPE WHEN 1 THEN 'D' WHEN 4 THEN 'I' ELSE 'N' END + ') '

    + P.LAST_NAME + CASE WHEN P.FIRST_NAME IS NOT NULL AND P.FIRST_NAME <> '' THEN ', ' + P.FIRST_NAME ELSE '' END

    + CASE WHEN P.MIDDLE_NAME IS NOT NULL AND P.MIDDLE_NAME <> '' THEN ' ' + P.MIDDLE_NAME ELSE '' END

    + CASE WHEN P.NAME_SUFFIX IS NOT NULL AND P.NAME_SUFFIX <> '' THEN ' ' + P.NAME_SUFFIX ELSE '' END AS PARTY_DATA

    FROM INSTRUMENT I JOIN PARTY P ON I.INSTRUMENT_ID = P.INSTRUMENT_ID

    WHERE I.RECORDING_DATE BETWEEN @start_date AND @end_date AND I.DELETED_FLAG <> 1

    )

    INSERT INTO ALPHA_INDEX_PARTY_DATA

    (INSTRUMENT_ID, PARTY_ID, SORT_ORDER, PARTY_DATA)

    VALUES (@instrument_id, @party_id, @sort_order, LEFT(@party_data, 36))

    Mighty close, Cory... the hard part isn't the cursor... the hard part is in that inner While loop where he splits the rows he just assembled... If it's the first split, he uses 36 characters. If the Party_Name is more than 36 characters, the first row uses 36 characters and all the following rows use 5 spaces followed by 31 characters AND creating a sequence number. Easy enough to do with the cursor... set-based needs to beat it for speed... I'm thinking this is a job for my favorite tool, the Tally table... 😉

    It would be nice if the OP would post some data like I recommend in the URL in my signature line, though (hint, hint) :w00t:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • WOOHOO!! I got a "Mighty Close" by Mr Jeff M himself! Thanks!

    As far as the post - When I first read the post, I admit, I did not read it all, and missed that part about 36 char's, split it. When I did, I sort of gave up, and posted what I had.

    I agree - Follow the link in Jeff's signature, you will likely get a better result!

    Have a good weekend all!

    -- Cory

  • Cory Ellingson (3/28/2008)


    WOOHOO!! I got a "Mighty Close" by Mr Jeff M himself! Thanks!

    Gosh, I'm really sorry Cory... I didn't mean to offend...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/28/2008)


    ... If it's the first split, he uses 36 characters. If the Party_Name is more than 36 characters, the first row uses 36 characters and all the following rows use 5 spaces followed by 31 characters...

    Actually, Jeff, because there is an LTRIM nested in the center of that AND there are some spaces in there already, sometimes it will skip an extra space. Duplicating that is the really hard part.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • rbarryyoung (3/28/2008)


    Jeff Moden (3/28/2008)


    ... If it's the first split, he uses 36 characters. If the Party_Name is more than 36 characters, the first row uses 36 characters and all the following rows use 5 spaces followed by 31 characters...

    Actually, Jeff, because there is an LTRIM nested in the center of that AND there are some spaces in there already, sometimes it will skip an extra space. Duplicating that is the really hard part.

    Heh... you know what they say, though...

    "The difficult can be done immediately... the impossible takes a little longer" --Army Core of Engineers

    In the absence of any real data to test a solution with, here's some code that uses some data from AdventureWorks to solve what I think the problem is. Obviously, the OP will need to do some tweeking to the first cte to make it draw from the real tables and to the final SELECT to change it into an INSERT/SELECT to put things into the real table... but it does solve the problem without declared RBAR and it does have a good amount of hustle to it.

    Cory definitely had a good start on this with the idea of using a CTE... and, without ROW_NUMBER in SQL Server 2005, I'd have had to use a temp table to get the SortOrder... as usual, the details are in the comments in the code...

    --===== Identify the database to work in (where there's some canned test data

    -- that's close to what the OP has

    USE AdventureWorks

    --===== This uses two cte's and a "Magic Tally Table" query to solve the problem

    ;WITH

    cteNotACursor AS

    (--==== Assemble the data kinda like what the OP had but without so many CASE statements

    SELECT ROW_NUMBER() OVER (ORDER BY pc.ContactID) AS RowNum,

    pc.ContactID,

    '(' + CASE pc.EMailPromotion WHEN 1 THEN 'D' WHEN 2 THEN 'I' ELSE 'N' END + ') '

    + pc.LastName + CASE WHEN NULLIF(pc.FirstName,'') IS NOT NULL THEN ', ' ELSE '' END

    + ISNULL(NULLIF(pc.FirstName+' ',''),'')

    + ISNULL(NULLIF(pc.MiddleName+' ',''),'')

    + ISNULL(NULLIF(pc.Suffix+' ',''),'')

    + 'Just some stuff to make the PartyData longer than 36 characters so we can see the split works as in the original code.'

    AS PartyData

    FROM Person.Contact pc

    )

    ,

    cteTally AS

    (--==== Replacement for a real Tally table.

    SELECT Number AS N

    FROM Master.dbo.spt_Values

    WHERE Type = 'P'

    AND Number BETWEEN 1 AND 600

    )

    --===== Using the 2 cte's and a bit of Tally table magic to do the splits, solve the problem

    SELECT nac.ContactID,

    ROW_NUMBER() OVER (PARTITION BY nac.RowNum ORDER BY nac.RowNum,t.N) AS SortOrder,

    CASE

    WHEN t.N>=37

    THEN ' ' + LTRIM(SUBSTRING(PartyData,t.N,31))

    ELSE LTRIM(SUBSTRING(PartyData,t.N,36))

    END AS Party_Data

    FROM cteNotACursor nac,

    cteTally t

    WHERE t.N <= LEN(nac.PartyData) --This just limits the range of "N" for a bit of additional speed

    AND (t.N IN (1,37) --This finds the starting position of the first two splits

    OR (t.N > 37 AND (t.N-37)%31 = 0)) --This finds the starting position for all the rest

    Now, THAT was fun! 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • OK, here is my entry. I went the pure CTE route, using some of the stuff that Cory started:

    ALTER PROCEDURE dbo.sp_UPDATE_ALPHA_INDEX_PARTY

    @start_date datetime,

    @end_date datetime

    AS

    set nocount on

    TRUNCATE TABLE ALPHA_INDEX_PARTY_DATA

    ;WITH SORT_CTE_Insert ( InstrumentID, PartyID, SortOrder, PartyData, PartyData_Remainder ) AS

    ( SELECT I.INSTRUMENT_ID

    , P.PARTY_ID

    , 0

    , ''

    , '('

    + CASE P.PARTY_TYPE WHEN 1 THEN 'D' WHEN 4 THEN 'I' ELSE 'N' END + ') '

    + P.LAST_NAME + CASE WHEN P.FIRST_NAME IS NOT NULL AND P.FIRST_NAME <> '' THEN ', ' + P.FIRST_NAME ELSE '' END

    + CASE WHEN P.MIDDLE_NAME IS NOT NULL AND P.MIDDLE_NAME <> '' THEN ' ' + P.MIDDLE_NAME ELSE '' END

    + CASE WHEN P.NAME_SUFFIX IS NOT NULL AND P.NAME_SUFFIX <> '' THEN ' ' + P.NAME_SUFFIX ELSE '' END AS PARTY_DATA

    FROM INSTRUMENT I JOIN PARTY P ON I.INSTRUMENT_ID = P.INSTRUMENT_ID

    WHERE I.RECORDING_DATE BETWEEN @start_date AND @end_date AND I.DELETED_FLAG <> 1

    --this should fix the (IF @instrument_id != @last_inst_id) business

    And P.Party_ID = (Select MIN(P2.Party_ID) From Party P2 Where I.INSTRUMENT_ID = P2.INSTRUMENT_ID)

    UNION ALL

    Select S.InstrumentID

    , S.PartyID

    , S.SortOrder + 1

    , Left(S.PartyData_Remainder, 36)

    , Case When Len(S.partyData_Remainder) > 36

    THEN ' ' + LTRIM(RIGHT(S.partyData_Remainder, LEN(S.partyData_Remainder) - 36))

    ELSE '' END

    From SORT_CTE_Insert

    Where S.PartyData_Remainder > ''

    )

    INSERT INTO ALPHA_INDEX_PARTY_DATA (INSTRUMENT_ID, PARTY_ID, SORT_ORDER, PARTY_DATA)

    Select InstrumentID, PartyID, SortOrder, PartyData From SORT_CTE_Insert

    Where SortOrder > 0

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Jeff Moden (3/28/2008)


    Now, THAT was fun! 😛

    I have to agree!

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Jeff:

    Here is my Adventureworks version. It seems to be faster than the Tally Table approach, but I am not sure if they are returning equivalent datasets:

    ;WITH SORT_CTE_Insert ( PartyID, SortOrder, PartyData, PartyData_Remainder ) AS

    ( SELECT --I.INSTRUMENT_ID ,

    P.ContactID -- P.PARTY_ID

    , 0

    , Cast('' as varchar(600))

    , Cast('('

    + CASE P.EMailPromotion WHEN 1 THEN 'D' WHEN 4 THEN 'I' ELSE 'N' END + ') '

    + P.LastName + CASE WHEN P.FirstName IS NOT NULL AND P.FirstName <> '' THEN ', ' + P.FirstName ELSE '' END

    + CASE WHEN P.MiddleName IS NOT NULL AND P.MiddleName <> '' THEN ' ' + P.MiddleName ELSE '' END

    + CASE WHEN P.SUFFIX IS NOT NULL AND P.SUFFIX <> '' THEN ' ' + P.SUFFIX ELSE '' END as varchar(600))

    From Person.Contact P

    /* FROM INSTRUMENT I JOIN PARTY P ON I.INSTRUMENT_ID = P.INSTRUMENT_ID

    WHERE I.RECORDING_DATE BETWEEN @start_date AND @end_date AND I.DELETED_FLAG <> 1

    --this should fix the (IF @instrument_id != @last_inst_id) business

    And P.Party_ID = (Select MIN(P2.Party_ID) From Party P2 Where I.INSTRUMENT_ID = P2.INSTRUMENT_ID)

    */

    UNION ALL

    Select --S.InstrumentID ,

    S.PartyID

    , S.SortOrder + 1

    , Cast(Left(S.PartyData_Remainder, 36) as varchar(600))

    , Cast(Case When Len(S.partyData_Remainder) > 36

    THEN ' ' + LTRIM(RIGHT(S.partyData_Remainder, LEN(S.partyData_Remainder) - 36))

    ELSE '' END as varchar(600) )

    From SORT_CTE_Insert S

    Where S.PartyData_Remainder > ''

    )

    --INSERT INTO ALPHA_INDEX_PARTY_DATA (INSTRUMENT_ID, PARTY_ID, SORT_ORDER, PARTY_DATA)

    --INSERT INTO ALPHA_INDEX_PARTY_DATA (PARTY_ID, SORT_ORDER, PARTY_DATA)

    Select PartyID, SortOrder, PartyData From SORT_CTE_Insert

    Where SortOrder > 0

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Jeff Moden (3/28/2008)

    Gosh, I'm really sorry Cory... I didn't mean to offend...

    Jeff,

    I am sorry, I re-read that this morning - I did not mean to sound offended, I am truely honored that I got a kudos from you. I have respected the work you have done here for some time, and it was your RBAR word that got me re-thinking my ways.

    Cory

    -- Cory

  • Cory Ellingson (3/29/2008)


    I am sorry, I re-read that this morning - I did not mean to sound offended, I am truely honored that I got a kudos from you. I have respected the work you have done here for some time, and it was your RBAR word that got me re-thinking my ways.

    Dang... I hate email/forum posts... I can't always get the "voice inflection" down... I've never been very good at that. Thank you so much for the feedback... I was really worried about offending a good person such as yourself. And, thanks for the awesome compliments Cory.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • rbarryyoung (3/29/2008)


    Here is my Adventureworks version. It seems to be faster than the Tally Table approach, but I am not sure if they are returning equivalent datasets:

    Thanks Barry... I was wondering what the performance of the recursion would be compared to the Tally table... I'll add "mind reader" to your list of talents. 😀

    As you said, there is, in fact, a difference but not in the "return"... it's in the inputs. I had added an extra 118 characters to test the over 36 character code and it looks like the following...

    + 'Just some stuff to make the PartyData longer than 36 characters so we can see the split works as in the original code.'

    ... which made it look pretty bad for the Tally table.

    But, if we even up the data between the two runs, the Tally table wins by a fair margin every time. I did 6 runs of each and averaged the run results both with the extra 118 characters and without. Here's the results...

    Run Type CPU ms Elapsed Scans per Table Logical Reads Per Table

    ---------------- ------ ------- --------------- -----------------------

    Recursion w/118 6473 11067 2/1 1,016,805 / 558

    Tally w/118 1946 6105 1/1/1 68,314 / 558 / 7

    Recursion wo/118 1838 2914 2/1 202,527 / 558

    Tally w0/118 932 1793 1/1/1 44,766 / 558 / 7

    I also think that individual SELECTs are easier to troubleshoot than recursion pairs of SELECTs, but that's just a personal preference. I haven't liked recursion since I saw my first recursive UDF. 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff:

    Yeah, I do not usually like the recursive CTE's either (they make my brain hurt!) but I was afraid that a Numbers/Tally table approach would be invoking a triangular scan on the party_data strings to do their parsing. Of course I know about calculating the offsets in multiples, but that doesn't work if you want to do that extra little space-skip that it does sometime: you have to serialize it somewhere. I haven't gone all the way through yours yet, so I am not yet sure how you did it (plus, I never learned the Sql2005 "OVER" functions too well, so I have to look them up every time).

    By the way, what do you get for the Query Plans? Mine is really strange, it shows the recursive versions with a subtree of 0.4837..., and the Tally version with a subtree of 564.48!

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Heh... yeah... I don't like recursion for the same reason you do... I end up sucking my thumb and twiddling my hair when I'm trying to troubleshoot one that doesn't work properly 😛

    So far as the execution plan goes... recursion has the exact same problem there that correlated subqueries, triangular joins, Cursors, and While loops have... they only show an analysis of the first execution of the "loop". The Tally table method uses a join, so it shows all the expense. I've had that battle with many, many "DBA's" and "Expert" developers who put all their faith in the comparison of execution plans... I even quit one job because the CTO, the Lead DBA, and the Lead Developer couldn't give up evaluating code performance by comparison of the "% of batch" readings... the just didn't get it.

    One of my favorite code snippets to try to teach people that the execution plan means almost nothing insofar as performance predictions go is the following...

    SET NOCOUNT ON

    --=============================================================================

    -- Recursive CTE does the count

    --=============================================================================

    PRINT '========== Recursive CTE =========='

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    DECLARE @Top INT

    SET @Top = 100000

    ;WITH cteTally

    AS (

    SELECT 1 AS N

    UNION ALL

    SELECT N+1 FROM cteTally WHERE N<@Top

    )

    SELECT N

    INTO #Test1

    FROM cteTally

    OPTION (MAXRECURSION 0)

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    PRINT REPLICATE('=',80)

    GO

    --=============================================================================

    -- ROW_NUMBER CTE does the count

    --=============================================================================

    PRINT '========== ROW_NUMBER CTE =========='

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    DECLARE @Top INT

    SET @Top = 100000

    ;WITH cteTally

    AS (

    SELECT TOP(@Top) ROW_NUMBER() OVER(ORDER BY sc1.Type) AS N

    FROM Master.dbo.spt_Values sc1,

    Master.dbo.spt_Values sc2

    )

    SELECT *

    INTO #Test2

    FROM cteTally

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    PRINT REPLICATE('=',80)

    GO

    DROP TABLE #Test1,#Test2

    ... But when you look at the actual performance...

    ========== Recursive CTE ==========

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    Table 'Worktable'. Scan count 2, logical reads 600001, physical reads 0,

    read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 3718 ms, elapsed time = 4691 ms.

    ================================================================================

    ========== ROW_NUMBER CTE ==========

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    Table 'spt_values'. Scan count 2, logical reads 18, physical reads 0,

    read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 125 ms, elapsed time = 128 ms.================================================================================

    ... you see that the first query actually takes about 97% of the batch time. If you go back and look at the execution plans, the line widths on the plan for the recursive CTE are very very skinny... that indicates a major performance problem (RBAR :hehe: ) and shows that the 37% of the batch is based on just two rows (just like a bloody cursor!) while the execution plan for the second query is based on all the rows.

    Heh... being a bit nasty, recursion sucks and lies at the same time.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 26 total)

You must be logged in to reply to this topic. Login to reply