April 1, 2014 at 3:06 am
Hi All,
see code below, using a small sample of data.
I have a table that is formed like Id/ParentId/Name, but I often need to use it in a ways that would be quicker if I had the data split into columns by level. I can do this in the way shown below, but it strikes me that it is a pretty long winded and inflexible method (mostly because we cannot rule out the possibility of new levels being added). Does anyone have any suggestions about how to achieve the same result in a more flexible (and maybe more concise) way?
Any help much appreciated!
Mark
CREATE TABLE #CoverCauses
( IdINTNOT NULL
,ParentIdINTNOT NULL
,NameVARCHAR(50) NOT NULL)
CREATE TABLE #MyCauses
( IdINT
,ParentIdINT
,NameVARCHAR(255)
,CLevINT);
INSERT #CoverCauses
SELECT * FROM
(SELECT 8639 AS [Id],0 AS [ParentId],'A' AS [Code] UNION
SELECT 8640,8639,'AMED' UNION
SELECT 8641,8640,'AMEDILL' UNION
SELECT 8642,8640,'AMEDINJ' UNION
SELECT 8643,8642,'AMEDINJARS' UNION
SELECT 8644,8642,'AMEDINJADR' UNION
SELECT 8645,8642,'AMEDINJRVE' UNION
SELECT 8646,8642,'AMEDINJRTA' UNION
SELECT 8653,8640,'AMEDDEA' UNION
SELECT 8654,8653,'AMEDDEAILL' UNION
SELECT 8655,8653,'AMEDDEAARS' UNION
SELECT 8656,8653,'AMEDDEAADR' UNION
SELECT 8657,8653,'AMEDDEARVE' UNION
SELECT 8658,8653,'AMEDDEARTA' UNION
SELECT 8665,8639,'ACUR' UNION
SELECT 8666,8665,'ACURINP' UNION
SELECT 8667,8666,'ACURINPILL' UNION
SELECT 8668,8666,'ACURINPINJ' UNION
SELECT 8669,8668,'ACURINPINJARS' UNION
SELECT 8670,8668,'ACURINPINJADR' UNION
SELECT 8671,8668,'ACURINPINJRVE' UNION
SELECT 8672,8668,'ACURINPINJRTA') AS Whatever
ORDER BY Id;
With CoverCause AS
(SELECT Id, ParentId, Name, 1 AS Level
FROM #CoverCauses
WHERE ParentId = 0
UNION ALL
SELECT DownWeGo.Id, DownWeGo.ParentId, DownWeGo.Name, Level + 1
FROM #CoverCauses AS DownWeGo
JOIN CoverCauseON CoverCause.Id = DownWeGo.ParentId)
INSERT #MyCauses
SELECT Id, ParentId, Name, Level
FROM CoverCause
GO
/*
To get each level, repeatedly join to id to parent id to step up the "trail" to the top level, ensuring that we start at the bottom level,
using inner joins will ensure that we only start on each level if there is a record there.
*/
SELECT
Leveled.Id
,Leveled.ParentId
,Leveled.TopLevel
,Leveled.SecondLevel
,Leveled.ThirdLevel
,Leveled.FourthLevel
,Leveled.FifthLevel
,Cause.NameAS BottomLevel
FROM
(SELECTId,ParentId, Name AS TopLevel, NULL AS SecondLevel, NULL AS ThirdLevel, NULL AS FourthLevel, NULL AS FifthLevel
FROM #MyCauses
WHERE CLev = 1
UNION
-- SecondLevel
SELECT Low.Id, Low.ParentId, Up1.Name, Low.Name, NULL, NULL,NULL
FROM #MyCausesAS Low
JOIN #MyCausesAS Up1ON Low.ParentId = Up1.Id
AND Low.Clev = 2
UNION
-- ThirdLevel
SELECT Low.Id, Low.ParentId, Up2.Name, Up1.Name, Low.Name, NULL, NULL
FROM #MyCausesAS Low
JOIN #MyCausesAS Up1ON Low.ParentId = Up1.Id
AND Low.CLev =3
JOIN #MyCausesAS Up2ON Up1.ParentId = Up2.Id
UNION
--FourthLevel
SELECT Low.Id, Low.ParentId, Up3.Name, Up2.Name, Up1.Name, Low.Name, NULL
FROM #MyCausesAS Low
JOIN #MyCausesAS Up1ON Low.ParentId = Up1.Id
AND Low.CLev = 4
JOIN #MyCausesAS Up2ON Up1.ParentId = Up2.Id
JOIN #MyCausesAS Up3ON Up2.ParentId = Up3.Id
-- FifthLevel
UNION
SELECT Low.Id, Low.ParentId, Up4.Name, Up3.Name, Up2.Name, Up1.Name, Low.Name
FROM #MyCausesAS Low
JOIN #MyCausesAS Up1ON Low.ParentId = Up1.Id
AND Low.CLev = 5
JOIN #MyCausesAS Up2ON Up1.ParentId = Up2.Id
JOIN #MyCausesAS Up3ON Up2.ParentId = Up3.Id
JOIN #MyCausesAS Up4ON Up3.ParentId = Up4.Id)AS Leveled
JOIN #CoverCausesAS CauseON Leveled.Id = Cause.Id
DROP TABLE #MyCauses
DROP TABLE #CoverCauses
April 2, 2014 at 12:39 pm
I see that no one has jumped in on this even though you've provided readily consumable test data.
First, my response will act as a "bump" and, second, if no one gets to this for you before I get home tonight, I'll see if I can help. I've got a couple of nice tricks up my sleeve for these types of hierarchies.
--Jeff Moden
Change is inevitable... Change for the better is not.
April 2, 2014 at 9:18 pm
Ok... first things first. This method needs a good, high performance string splitter. You can find a really good one in the article at the following link in the "Resources" section near the bottom of the article. The splitter is called "DelimitedSplit8K".
http://www.sqlservercentral.com/articles/Tally+Table/72993/
The second thing that you need to know is that we use several rather ancient but high performance technologies to do all of this, which keeps things pretty simple. Two of the more important techologies are the Tally Table (I'm using spt_values as a Tally Table) and a dynamic CROSSTAB.
Read about what a Tally Table is and how it can be used to replace certain types of loops in a high performance manner here...
http://www.sqlservercentral.com/articles/T-SQL/62867/
...read about CROSSTABs here...
http://www.sqlservercentral.com/articles/T-SQL/63681/
...and read about Dynamic CROSSTABs here...
http://www.sqlservercentral.com/articles/Crosstab/65048/
You'll need to build the DelimitedSplit8K function I previously mentioned.
I reworked the code you provided for the test data just 'cuz it's my nature when I decide to keep a copy of code for future reference. I took out the explicit build of the #MyCause table.
--=======================================================================================
-- Recreate the test table the OP provided
--=======================================================================================
--===== Conditionally drop the test table to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#CoverCauses','U') IS NOT NULL DROP TABLE #CoverCauses
;
--===== Create the test table
CREATE TABLE #CoverCauses
(
Id INT NOT NULL PRIMARY KEY CLUSTERED
,ParentId INT NOT NULL
,Name VARCHAR(50) NOT NULL
)
;--===== Populate the test table with data provided by the OP
INSERT INTO #CoverCauses
(Id, ParentId, Name)
SELECT d.Id, d.ParentId, d.Name
FROM (
SELECT 8639, 0,'A' UNION ALL
SELECT 8640,8639,'AMED' UNION ALL
SELECT 8641,8640,'AMEDILL' UNION ALL
SELECT 8642,8640,'AMEDINJ' UNION ALL
SELECT 8643,8642,'AMEDINJARS' UNION ALL
SELECT 8644,8642,'AMEDINJADR' UNION ALL
SELECT 8645,8642,'AMEDINJRVE' UNION ALL
SELECT 8646,8642,'AMEDINJRTA' UNION ALL
SELECT 8653,8640,'AMEDDEA' UNION ALL
SELECT 8654,8653,'AMEDDEAILL' UNION ALL
SELECT 8655,8653,'AMEDDEAARS' UNION ALL
SELECT 8656,8653,'AMEDDEAADR' UNION ALL
SELECT 8657,8653,'AMEDDEARVE' UNION ALL
SELECT 8658,8653,'AMEDDEARTA' UNION ALL
SELECT 8665,8639,'ACUR' UNION ALL
SELECT 8666,8665,'ACURINP' UNION ALL
SELECT 8667,8666,'ACURINPILL' UNION ALL
SELECT 8668,8666,'ACURINPINJ' UNION ALL
SELECT 8669,8668,'ACURINPINJARS' UNION ALL
SELECT 8670,8668,'ACURINPINJADR' UNION ALL
SELECT 8671,8668,'ACURINPINJRVE' UNION ALL
SELECT 8672,8668,'ACURINPINJRTA'
) d (Id, ParentId, Name)
ORDER BY d.Id
;
This solves the problem that you posted.
--=======================================================================================
-- Solve the problem of the unknown number of Level columns
--=======================================================================================
--===== Conditionally drop the working table to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#MyCauses' ,'U') IS NOT NULL DROP TABLE #MyCauses
;
--===== Build the hierarchical path and figure out the level of each node.
-- I just cleaned up the aliases a bit here. It's almost identical to your
-- original code. Notice that I added an "hPath" column, which is a simple
-- concatenation of nodes.
WITH cteCoverCause AS
(
SELECT Id, ParentId, Name, Level = 1, hPath = CAST(Name AS VARCHAR(8000))
FROM #CoverCauses
WHERE ParentId = 0
UNION ALL
SELECT cc.Id, cc.ParentId, cc.Name, Level = Level + 1, hPath = cte.hPath + '\' + cc.Name
FROM #CoverCauses AS cc
JOIN cteCoverCause AS cte
ON cte.Id = cc.ParentId
)
SELECT Id, ParentId, Name, Level, hPath
INTO #MyCauses
FROM cteCoverCause
;
--===== Create some obviously named variables
DECLARE @MaxLevel INT
,@SQL VARCHAR(MAX)
;
--===== Figure out how many levels we have
SELECT @MaxLevel = MAX(Level) FROM #MyCauses
;
--===== Begin the non-dynamic part of the SELECT list
SELECT @SQL = '
SELECT mc.ID
,mc.ParentID
,mc.Name'
;
--===== Build the dynamic list of levels using CROSSTAB code.
-- This uses the spt_values table as a row-number source
-- from 1 to the max number of columns identified by @MaxLevel
-- to build CROSSTAB rows for each level in the SELECT list.
SELECT @SQL = @SQL + REPLACE(REPLACE('
,Level<<N>> = MAX(CASE WHEN split.ItemNumber = <<N>> THEN split.Item ELSE "" END)'
,'<<N>>',RIGHT(Number+100,2))
,'"' ,'''')
FROM master.dbo.spt_values
WHERE Type = 'P'
AND Number BETWEEN 1 AND @MaxLevel
;
--===== Build the rest of the non-dynamic code.
-- Notice that the text splitter is used to split the hPath of each node
-- so that the CROSSTAB code can assign each item to the correct column.
SELECT @SQL = @SQL + '
FROM #MyCauses mc
CROSS APPLY dbo.DelimitedSplit8K(mc.hPath,''\'') split
GROUP BY mc.ID, mc.ParentID, mc.Name
;'
--===== Execute the dynamic code to return the desired "ragged" number of levels
-- for each node in the hierarchy.
PRINT @SQL; --You can comment this line out for production
EXEC (@SQL);
Now, before you run off with that, I have to ask, what are you using this for? It's a dynamic solution that will require more dynamic code or manual intervention to use the results. Perhaps something even more flexible and higher performance still is in order. Consider the solutions in the articles at the following 2 links. And, yeah... the timings in the articles are quite accurate. Both methods are incredibly fast compared to traditional methods for doing the same things, if I do say so myself. 😀
http://www.sqlservercentral.com/articles/Hierarchy/94040/
http://www.sqlservercentral.com/articles/T-SQL/94570/
--Jeff Moden
Change is inevitable... Change for the better is not.
April 3, 2014 at 2:08 am
Cheers, Jeff - lots for me to chew over there!
I'll look through the links and see what they provide. In answer to your question, I refer often to (e.g.) "Second level" and "Third level" values by id all the time, but also the original table in its original form, so I figured the best option would be to make a view with the levels split out. There is a possibility that at some point in the distant future the structure will change and given that the data is so often used, a lot of things would break if the view were to fail, so I wanted dynamism.
Thanks for the help.
Mark
April 3, 2014 at 7:53 am
A view would be good but this requires dynamic SQL to get the dynamic columns. As you probably know, that won't work in a view.
However, considering how fast the code is, you could create a table with the information in it whenever the data changes. If you build an "alternate" table and have a synonym pointing at the original table and then repoint the synonym to the alternate table when you're done building it, the users will only see a millisecond or two of interuption.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply