Flexible Recursive Level Splitting - One column per level

Question

Flexible Recursive Level Splitting - One column per level

High Plains Grifter

Ten Centuries

Points: 1191
More actions
April 1, 2014 at 3:06 am

#289844

Hi All,
see code below, using a small sample of data.
I have a table that is formed like Id/ParentId/Name, but I often need to use it in a ways that would be quicker if I had the data split into columns by level. I can do this in the way shown below, but it strikes me that it is a pretty long winded and inflexible method (mostly because we cannot rule out the possibility of new levels being added). Does anyone have any suggestions about how to achieve the same result in a more flexible (and maybe more concise) way?
Any help much appreciated!
Mark
CREATE TABLE #CoverCauses
( IdINTNOT NULL
,ParentIdINTNOT NULL
,NameVARCHAR(50) NOT NULL)
CREATE TABLE #MyCauses
( IdINT
,ParentIdINT
,NameVARCHAR(255)
,CLevINT);
INSERT #CoverCauses
SELECT * FROM
(SELECT 8639 AS [Id],0 AS [ParentId],'A' AS [Code] UNION
SELECT 8640,8639,'AMED' UNION
SELECT 8641,8640,'AMEDILL' UNION
SELECT 8642,8640,'AMEDINJ' UNION
SELECT 8643,8642,'AMEDINJARS' UNION
SELECT 8644,8642,'AMEDINJADR' UNION
SELECT 8645,8642,'AMEDINJRVE' UNION
SELECT 8646,8642,'AMEDINJRTA' UNION
SELECT 8653,8640,'AMEDDEA' UNION
SELECT 8654,8653,'AMEDDEAILL' UNION
SELECT 8655,8653,'AMEDDEAARS' UNION
SELECT 8656,8653,'AMEDDEAADR' UNION
SELECT 8657,8653,'AMEDDEARVE' UNION
SELECT 8658,8653,'AMEDDEARTA' UNION
SELECT 8665,8639,'ACUR' UNION
SELECT 8666,8665,'ACURINP' UNION
SELECT 8667,8666,'ACURINPILL' UNION
SELECT 8668,8666,'ACURINPINJ' UNION
SELECT 8669,8668,'ACURINPINJARS' UNION
SELECT 8670,8668,'ACURINPINJADR' UNION
SELECT 8671,8668,'ACURINPINJRVE' UNION
SELECT 8672,8668,'ACURINPINJRTA') AS Whatever
ORDER BY Id;
With CoverCause AS
(SELECT Id, ParentId, Name, 1 AS Level
FROM #CoverCauses
WHERE ParentId = 0
UNION ALL
SELECT DownWeGo.Id, DownWeGo.ParentId, DownWeGo.Name, Level + 1
FROM #CoverCauses AS DownWeGo
JOIN CoverCauseON CoverCause.Id = DownWeGo.ParentId)
INSERT #MyCauses
SELECT Id, ParentId, Name, Level
FROM CoverCause
GO
/*
To get each level, repeatedly join to id to parent id to step up the "trail" to the top level, ensuring that we start at the bottom level,
using inner joins will ensure that we only start on each level if there is a record there.
*/
SELECT
Leveled.Id
,Leveled.ParentId
,Leveled.TopLevel
,Leveled.SecondLevel
,Leveled.ThirdLevel
,Leveled.FourthLevel
,Leveled.FifthLevel
,Cause.NameAS BottomLevel
FROM
(SELECTId,ParentId, Name AS TopLevel, NULL AS SecondLevel, NULL AS ThirdLevel, NULL AS FourthLevel, NULL AS FifthLevel
FROM #MyCauses
WHERE CLev = 1
UNION
-- SecondLevel
SELECT Low.Id, Low.ParentId, Up1.Name, Low.Name, NULL, NULL,NULL
FROM #MyCausesAS Low
JOIN #MyCausesAS Up1ON Low.ParentId = Up1.Id
AND Low.Clev = 2
UNION
-- ThirdLevel
SELECT Low.Id, Low.ParentId, Up2.Name, Up1.Name, Low.Name, NULL, NULL
FROM #MyCausesAS Low
JOIN #MyCausesAS Up1ON Low.ParentId = Up1.Id
AND Low.CLev =3
JOIN #MyCausesAS Up2ON Up1.ParentId = Up2.Id
UNION
--FourthLevel
SELECT Low.Id, Low.ParentId, Up3.Name, Up2.Name, Up1.Name, Low.Name, NULL
FROM #MyCausesAS Low
JOIN #MyCausesAS Up1ON Low.ParentId = Up1.Id
AND Low.CLev = 4
JOIN #MyCausesAS Up2ON Up1.ParentId = Up2.Id
JOIN #MyCausesAS Up3ON Up2.ParentId = Up3.Id
-- FifthLevel
UNION
SELECT Low.Id, Low.ParentId, Up4.Name, Up3.Name, Up2.Name, Up1.Name, Low.Name
FROM #MyCausesAS Low
JOIN #MyCausesAS Up1ON Low.ParentId = Up1.Id
AND Low.CLev = 5
JOIN #MyCausesAS Up2ON Up1.ParentId = Up2.Id
JOIN #MyCausesAS Up3ON Up2.ParentId = Up3.Id
JOIN #MyCausesAS Up4ON Up3.ParentId = Up4.Id)AS Leveled
JOIN #CoverCausesAS CauseON Leveled.Id = Cause.Id
DROP TABLE #MyCauses
DROP TABLE #CoverCauses

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1004403 More actions · Answer 1

I see that no one has jumped in on this even though you've provided readily consumable test data.

First, my response will act as a "bump" and, second, if no one gets to this for you before I get home tonight, I'll see if I can help. I've got a couple of nice tricks up my sleeve for these types of hierarchies.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004403 More actions · Answer 2

Ok... first things first. This method needs a good, high performance string splitter. You can find a really good one in the article at the following link in the "Resources" section near the bottom of the article. The splitter is called "DelimitedSplit8K".

http://www.sqlservercentral.com/articles/Tally+Table/72993/

The second thing that you need to know is that we use several rather ancient but high performance technologies to do all of this, which keeps things pretty simple. Two of the more important techologies are the Tally Table (I'm using spt_values as a Tally Table) and a dynamic CROSSTAB.

Read about what a Tally Table is and how it can be used to replace certain types of loops in a high performance manner here...

http://www.sqlservercentral.com/articles/T-SQL/62867/

...read about CROSSTABs here...

http://www.sqlservercentral.com/articles/T-SQL/63681/

...and read about Dynamic CROSSTABs here...

http://www.sqlservercentral.com/articles/Crosstab/65048/

You'll need to build the DelimitedSplit8K function I previously mentioned.

I reworked the code you provided for the test data just 'cuz it's my nature when I decide to keep a copy of code for future reference. I took out the explicit build of the #MyCause table.

--=======================================================================================

-- Recreate the test table the OP provided

--=======================================================================================

--===== Conditionally drop the test table to make reruns in SSMS easier.

IF OBJECT_ID('tempdb..#CoverCauses','U') IS NOT NULL DROP TABLE #CoverCauses

;

--===== Create the test table

CREATE TABLE #CoverCauses

(

Id INT NOT NULL PRIMARY KEY CLUSTERED

,ParentId INT NOT NULL

,Name VARCHAR(50) NOT NULL

)

;--===== Populate the test table with data provided by the OP

INSERT INTO #CoverCauses

(Id, ParentId, Name)

SELECT d.Id, d.ParentId, d.Name

FROM (

SELECT 8639, 0,'A' UNION ALL

SELECT 8640,8639,'AMED' UNION ALL

SELECT 8641,8640,'AMEDILL' UNION ALL

SELECT 8642,8640,'AMEDINJ' UNION ALL

SELECT 8643,8642,'AMEDINJARS' UNION ALL

SELECT 8644,8642,'AMEDINJADR' UNION ALL

SELECT 8645,8642,'AMEDINJRVE' UNION ALL

SELECT 8646,8642,'AMEDINJRTA' UNION ALL

SELECT 8653,8640,'AMEDDEA' UNION ALL

SELECT 8654,8653,'AMEDDEAILL' UNION ALL

SELECT 8655,8653,'AMEDDEAARS' UNION ALL

SELECT 8656,8653,'AMEDDEAADR' UNION ALL

SELECT 8657,8653,'AMEDDEARVE' UNION ALL

SELECT 8658,8653,'AMEDDEARTA' UNION ALL

SELECT 8665,8639,'ACUR' UNION ALL

SELECT 8666,8665,'ACURINP' UNION ALL

SELECT 8667,8666,'ACURINPILL' UNION ALL

SELECT 8668,8666,'ACURINPINJ' UNION ALL

SELECT 8669,8668,'ACURINPINJARS' UNION ALL

SELECT 8670,8668,'ACURINPINJADR' UNION ALL

SELECT 8671,8668,'ACURINPINJRVE' UNION ALL

SELECT 8672,8668,'ACURINPINJRTA'

) d (Id, ParentId, Name)

ORDER BY d.Id

;

This solves the problem that you posted.

--=======================================================================================

-- Solve the problem of the unknown number of Level columns

--=======================================================================================

--===== Conditionally drop the working table to make reruns in SSMS easier.

IF OBJECT_ID('tempdb..#MyCauses' ,'U') IS NOT NULL DROP TABLE #MyCauses

;

--===== Build the hierarchical path and figure out the level of each node.

-- I just cleaned up the aliases a bit here. It's almost identical to your

-- original code. Notice that I added an "hPath" column, which is a simple

-- concatenation of nodes.

WITH cteCoverCause AS

(

SELECT Id, ParentId, Name, Level = 1, hPath = CAST(Name AS VARCHAR(8000))

FROM #CoverCauses

WHERE ParentId = 0

UNION ALL

SELECT cc.Id, cc.ParentId, cc.Name, Level = Level + 1, hPath = cte.hPath + '\' + cc.Name

FROM #CoverCauses AS cc

JOIN cteCoverCause AS cte

ON cte.Id = cc.ParentId

)

SELECT Id, ParentId, Name, Level, hPath

INTO #MyCauses

FROM cteCoverCause

;

--===== Create some obviously named variables

DECLARE @MaxLevel INT

,@SQL VARCHAR(MAX)

;

--===== Figure out how many levels we have

SELECT @MaxLevel = MAX(Level) FROM #MyCauses

;

--===== Begin the non-dynamic part of the SELECT list

SELECT @sql = '

SELECT mc.ID

,mc.ParentID

,mc.Name'

;

--===== Build the dynamic list of levels using CROSSTAB code.

-- This uses the spt_values table as a row-number source

-- from 1 to the max number of columns identified by @MaxLevel

-- to build CROSSTAB rows for each level in the SELECT list.

SELECT @sql = @sql + REPLACE(REPLACE('

,Level<<N>> = MAX(CASE WHEN split.ItemNumber = <<N>> THEN split.Item ELSE "" END)'

,'<<N>>',RIGHT(Number+100,2))

,'"' ,'''')

FROM master.dbo.spt_values

WHERE Type = 'P'

AND Number BETWEEN 1 AND @MaxLevel

;

--===== Build the rest of the non-dynamic code.

-- Notice that the text splitter is used to split the hPath of each node

-- so that the CROSSTAB code can assign each item to the correct column.

SELECT @sql = @sql + '

FROM #MyCauses mc

CROSS APPLY dbo.DelimitedSplit8K(mc.hPath,''\'') split

GROUP BY mc.ID, mc.ParentID, mc.Name

;'

--===== Execute the dynamic code to return the desired "ragged" number of levels

-- for each node in the hierarchy.

PRINT @sql; --You can comment this line out for production

EXEC (@SQL);

Now, before you run off with that, I have to ask, what are you using this for? It's a dynamic solution that will require more dynamic code or manual intervention to use the results. Perhaps something even more flexible and higher performance still is in order. Consider the solutions in the articles at the following 2 links. And, yeah... the timings in the articles are quite accurate. Both methods are incredibly fast compared to traditional methods for doing the same things, if I do say so myself. 😀

http://www.sqlservercentral.com/articles/Hierarchy/94040/

http://www.sqlservercentral.com/articles/T-SQL/94570/

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

High Plains Grifter Ten Centuries Points: 1191 More actions · Answer 3

Cheers, Jeff - lots for me to chew over there!

I'll look through the links and see what they provide. In answer to your question, I refer often to (e.g.) "Second level" and "Third level" values by id all the time, but also the original table in its original form, so I figured the best option would be to make a view with the levels split out. There is a possibility that at some point in the distant future the structure will change and given that the data is so often used, a lot of things would break if the view were to fail, so I wanted dynamism.

Thanks for the help.

Mark

Jeff Moden SSC Guru Points: 1004403 More actions · Answer 4

A view would be good but this requires dynamic SQL to get the dynamic columns. As you probably know, that won't work in a view.

However, considering how fast the code is, you could create a table with the information in it whenever the data changes. If you build an "alternate" table and have a synonym pointing at the original table and then repoint the synonym to the alternate table when you're done building it, the users will only see a millisecond or two of interuption.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)