Flexible Recursive Level Splitting - One column per level

  • Hi All,

    see code below, using a small sample of data.

    I have a table that is formed like Id/ParentId/Name, but I often need to use it in a ways that would be quicker if I had the data split into columns by level. I can do this in the way shown below, but it strikes me that it is a pretty long winded and inflexible method (mostly because we cannot rule out the possibility of new levels being added). Does anyone have any suggestions about how to achieve the same result in a more flexible (and maybe more concise) way?

    Any help much appreciated!

    Mark

    CREATE TABLE #CoverCauses

    ( IdINTNOT NULL

    ,ParentIdINTNOT NULL

    ,NameVARCHAR(50) NOT NULL)

    CREATE TABLE #MyCauses

    ( IdINT

    ,ParentIdINT

    ,NameVARCHAR(255)

    ,CLevINT);

    INSERT #CoverCauses

    SELECT * FROM

    (SELECT 8639 AS [Id],0 AS [ParentId],'A' AS [Code] UNION

    SELECT 8640,8639,'AMED' UNION

    SELECT 8641,8640,'AMEDILL' UNION

    SELECT 8642,8640,'AMEDINJ' UNION

    SELECT 8643,8642,'AMEDINJARS' UNION

    SELECT 8644,8642,'AMEDINJADR' UNION

    SELECT 8645,8642,'AMEDINJRVE' UNION

    SELECT 8646,8642,'AMEDINJRTA' UNION

    SELECT 8653,8640,'AMEDDEA' UNION

    SELECT 8654,8653,'AMEDDEAILL' UNION

    SELECT 8655,8653,'AMEDDEAARS' UNION

    SELECT 8656,8653,'AMEDDEAADR' UNION

    SELECT 8657,8653,'AMEDDEARVE' UNION

    SELECT 8658,8653,'AMEDDEARTA' UNION

    SELECT 8665,8639,'ACUR' UNION

    SELECT 8666,8665,'ACURINP' UNION

    SELECT 8667,8666,'ACURINPILL' UNION

    SELECT 8668,8666,'ACURINPINJ' UNION

    SELECT 8669,8668,'ACURINPINJARS' UNION

    SELECT 8670,8668,'ACURINPINJADR' UNION

    SELECT 8671,8668,'ACURINPINJRVE' UNION

    SELECT 8672,8668,'ACURINPINJRTA') AS Whatever

    ORDER BY Id;

    With CoverCause AS

    (SELECT Id, ParentId, Name, 1 AS Level

    FROM #CoverCauses

    WHERE ParentId = 0

    UNION ALL

    SELECT DownWeGo.Id, DownWeGo.ParentId, DownWeGo.Name, Level + 1

    FROM #CoverCauses AS DownWeGo

    JOIN CoverCauseON CoverCause.Id = DownWeGo.ParentId)

    INSERT #MyCauses

    SELECT Id, ParentId, Name, Level

    FROM CoverCause

    GO

    /*

    To get each level, repeatedly join to id to parent id to step up the "trail" to the top level, ensuring that we start at the bottom level,

    using inner joins will ensure that we only start on each level if there is a record there.

    */

    SELECT

    Leveled.Id

    ,Leveled.ParentId

    ,Leveled.TopLevel

    ,Leveled.SecondLevel

    ,Leveled.ThirdLevel

    ,Leveled.FourthLevel

    ,Leveled.FifthLevel

    ,Cause.NameAS BottomLevel

    FROM

    (SELECTId,ParentId, Name AS TopLevel, NULL AS SecondLevel, NULL AS ThirdLevel, NULL AS FourthLevel, NULL AS FifthLevel

    FROM #MyCauses

    WHERE CLev = 1

    UNION

    -- SecondLevel

    SELECT Low.Id, Low.ParentId, Up1.Name, Low.Name, NULL, NULL,NULL

    FROM #MyCausesAS Low

    JOIN #MyCausesAS Up1ON Low.ParentId = Up1.Id

    AND Low.Clev = 2

    UNION

    -- ThirdLevel

    SELECT Low.Id, Low.ParentId, Up2.Name, Up1.Name, Low.Name, NULL, NULL

    FROM #MyCausesAS Low

    JOIN #MyCausesAS Up1ON Low.ParentId = Up1.Id

    AND Low.CLev =3

    JOIN #MyCausesAS Up2ON Up1.ParentId = Up2.Id

    UNION

    --FourthLevel

    SELECT Low.Id, Low.ParentId, Up3.Name, Up2.Name, Up1.Name, Low.Name, NULL

    FROM #MyCausesAS Low

    JOIN #MyCausesAS Up1ON Low.ParentId = Up1.Id

    AND Low.CLev = 4

    JOIN #MyCausesAS Up2ON Up1.ParentId = Up2.Id

    JOIN #MyCausesAS Up3ON Up2.ParentId = Up3.Id

    -- FifthLevel

    UNION

    SELECT Low.Id, Low.ParentId, Up4.Name, Up3.Name, Up2.Name, Up1.Name, Low.Name

    FROM #MyCausesAS Low

    JOIN #MyCausesAS Up1ON Low.ParentId = Up1.Id

    AND Low.CLev = 5

    JOIN #MyCausesAS Up2ON Up1.ParentId = Up2.Id

    JOIN #MyCausesAS Up3ON Up2.ParentId = Up3.Id

    JOIN #MyCausesAS Up4ON Up3.ParentId = Up4.Id)AS Leveled

    JOIN #CoverCausesAS CauseON Leveled.Id = Cause.Id

    DROP TABLE #MyCauses

    DROP TABLE #CoverCauses

  • I see that no one has jumped in on this even though you've provided readily consumable test data.

    First, my response will act as a "bump" and, second, if no one gets to this for you before I get home tonight, I'll see if I can help. I've got a couple of nice tricks up my sleeve for these types of hierarchies.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ok... first things first. This method needs a good, high performance string splitter. You can find a really good one in the article at the following link in the "Resources" section near the bottom of the article. The splitter is called "DelimitedSplit8K".

    http://www.sqlservercentral.com/articles/Tally+Table/72993/

    The second thing that you need to know is that we use several rather ancient but high performance technologies to do all of this, which keeps things pretty simple. Two of the more important techologies are the Tally Table (I'm using spt_values as a Tally Table) and a dynamic CROSSTAB.

    Read about what a Tally Table is and how it can be used to replace certain types of loops in a high performance manner here...

    http://www.sqlservercentral.com/articles/T-SQL/62867/

    ...read about CROSSTABs here...

    http://www.sqlservercentral.com/articles/T-SQL/63681/

    ...and read about Dynamic CROSSTABs here...

    http://www.sqlservercentral.com/articles/Crosstab/65048/

    You'll need to build the DelimitedSplit8K function I previously mentioned.

    I reworked the code you provided for the test data just 'cuz it's my nature when I decide to keep a copy of code for future reference. I took out the explicit build of the #MyCause table.

    --=======================================================================================

    -- Recreate the test table the OP provided

    --=======================================================================================

    --===== Conditionally drop the test table to make reruns in SSMS easier.

    IF OBJECT_ID('tempdb..#CoverCauses','U') IS NOT NULL DROP TABLE #CoverCauses

    ;

    --===== Create the test table

    CREATE TABLE #CoverCauses

    (

    Id INT NOT NULL PRIMARY KEY CLUSTERED

    ,ParentId INT NOT NULL

    ,Name VARCHAR(50) NOT NULL

    )

    ;--===== Populate the test table with data provided by the OP

    INSERT INTO #CoverCauses

    (Id, ParentId, Name)

    SELECT d.Id, d.ParentId, d.Name

    FROM (

    SELECT 8639, 0,'A' UNION ALL

    SELECT 8640,8639,'AMED' UNION ALL

    SELECT 8641,8640,'AMEDILL' UNION ALL

    SELECT 8642,8640,'AMEDINJ' UNION ALL

    SELECT 8643,8642,'AMEDINJARS' UNION ALL

    SELECT 8644,8642,'AMEDINJADR' UNION ALL

    SELECT 8645,8642,'AMEDINJRVE' UNION ALL

    SELECT 8646,8642,'AMEDINJRTA' UNION ALL

    SELECT 8653,8640,'AMEDDEA' UNION ALL

    SELECT 8654,8653,'AMEDDEAILL' UNION ALL

    SELECT 8655,8653,'AMEDDEAARS' UNION ALL

    SELECT 8656,8653,'AMEDDEAADR' UNION ALL

    SELECT 8657,8653,'AMEDDEARVE' UNION ALL

    SELECT 8658,8653,'AMEDDEARTA' UNION ALL

    SELECT 8665,8639,'ACUR' UNION ALL

    SELECT 8666,8665,'ACURINP' UNION ALL

    SELECT 8667,8666,'ACURINPILL' UNION ALL

    SELECT 8668,8666,'ACURINPINJ' UNION ALL

    SELECT 8669,8668,'ACURINPINJARS' UNION ALL

    SELECT 8670,8668,'ACURINPINJADR' UNION ALL

    SELECT 8671,8668,'ACURINPINJRVE' UNION ALL

    SELECT 8672,8668,'ACURINPINJRTA'

    ) d (Id, ParentId, Name)

    ORDER BY d.Id

    ;

    This solves the problem that you posted.

    --=======================================================================================

    -- Solve the problem of the unknown number of Level columns

    --=======================================================================================

    --===== Conditionally drop the working table to make reruns in SSMS easier.

    IF OBJECT_ID('tempdb..#MyCauses' ,'U') IS NOT NULL DROP TABLE #MyCauses

    ;

    --===== Build the hierarchical path and figure out the level of each node.

    -- I just cleaned up the aliases a bit here. It's almost identical to your

    -- original code. Notice that I added an "hPath" column, which is a simple

    -- concatenation of nodes.

    WITH cteCoverCause AS

    (

    SELECT Id, ParentId, Name, Level = 1, hPath = CAST(Name AS VARCHAR(8000))

    FROM #CoverCauses

    WHERE ParentId = 0

    UNION ALL

    SELECT cc.Id, cc.ParentId, cc.Name, Level = Level + 1, hPath = cte.hPath + '\' + cc.Name

    FROM #CoverCauses AS cc

    JOIN cteCoverCause AS cte

    ON cte.Id = cc.ParentId

    )

    SELECT Id, ParentId, Name, Level, hPath

    INTO #MyCauses

    FROM cteCoverCause

    ;

    --===== Create some obviously named variables

    DECLARE @MaxLevel INT

    ,@SQL VARCHAR(MAX)

    ;

    --===== Figure out how many levels we have

    SELECT @MaxLevel = MAX(Level) FROM #MyCauses

    ;

    --===== Begin the non-dynamic part of the SELECT list

    SELECT @sql = '

    SELECT mc.ID

    ,mc.ParentID

    ,mc.Name'

    ;

    --===== Build the dynamic list of levels using CROSSTAB code.

    -- This uses the spt_values table as a row-number source

    -- from 1 to the max number of columns identified by @MaxLevel

    -- to build CROSSTAB rows for each level in the SELECT list.

    SELECT @sql = @sql + REPLACE(REPLACE('

    ,Level<<N>> = MAX(CASE WHEN split.ItemNumber = <<N>> THEN split.Item ELSE "" END)'

    ,'<<N>>',RIGHT(Number+100,2))

    ,'"' ,'''')

    FROM master.dbo.spt_values

    WHERE Type = 'P'

    AND Number BETWEEN 1 AND @MaxLevel

    ;

    --===== Build the rest of the non-dynamic code.

    -- Notice that the text splitter is used to split the hPath of each node

    -- so that the CROSSTAB code can assign each item to the correct column.

    SELECT @sql = @sql + '

    FROM #MyCauses mc

    CROSS APPLY dbo.DelimitedSplit8K(mc.hPath,''\'') split

    GROUP BY mc.ID, mc.ParentID, mc.Name

    ;'

    --===== Execute the dynamic code to return the desired "ragged" number of levels

    -- for each node in the hierarchy.

    PRINT @sql; --You can comment this line out for production

    EXEC (@SQL);

    Now, before you run off with that, I have to ask, what are you using this for? It's a dynamic solution that will require more dynamic code or manual intervention to use the results. Perhaps something even more flexible and higher performance still is in order. Consider the solutions in the articles at the following 2 links. And, yeah... the timings in the articles are quite accurate. Both methods are incredibly fast compared to traditional methods for doing the same things, if I do say so myself. 😀

    http://www.sqlservercentral.com/articles/Hierarchy/94040/

    http://www.sqlservercentral.com/articles/T-SQL/94570/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Cheers, Jeff - lots for me to chew over there!

    I'll look through the links and see what they provide. In answer to your question, I refer often to (e.g.) "Second level" and "Third level" values by id all the time, but also the original table in its original form, so I figured the best option would be to make a view with the levels split out. There is a possibility that at some point in the distant future the structure will change and given that the data is so often used, a lot of things would break if the view were to fail, so I wanted dynamism.

    Thanks for the help.

    Mark

  • A view would be good but this requires dynamic SQL to get the dynamic columns. As you probably know, that won't work in a view.

    However, considering how fast the code is, you could create a table with the information in it whenever the data changes. If you build an "alternate" table and have a synonym pointing at the original table and then repoint the synonym to the alternate table when you're done building it, the users will only see a millisecond or two of interuption.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply