T-SQL join to same table n number of times

  • Hi,

    I have a table where each row has an item and the item which superseeds it. There can be many items in the supersession chain.

    I.e.

    Row 1 - Item1 to item2,

    Row 2 - item2 to item3,

    Row 3 - item3 to item4

    etc

    I need to write a query to get the first and last item in the chain for all supersessions (i.e. in the exampe above return item1 and item4), however you do not know how many items are in the chain each time.

    Any advice is appreciated! thanks.

    🙂

  • Hi Jend

    You gave too less details to allow judging what you are looking for but

    imho you can do that like this:

    ;WITH ItemSrch AS (

    SELECT Prev_Item_Code, Next_Item_Code

    FROM YourTable

    UNION ALL

    SELECT a.Prev_Item_Code, b.Next_Item_Code

    FROM YourTable b

    INNER JOIN ItemSrch a ON a.Next_Item_Code = b.Prev_Item_Code

    )

    SELECT *

    FROM ItemSrch;

    Regards

    Mike

  • Not enough to really provide much help. First thought would be to use a recursive CTE to get the data.

    If you would like more help, please read the first article that I reference below in my signature block. It will show you what you need to post and how to post it to get the best possible answers in return. The bonus for the extra effort on your part, you will also get tested code in return.

  • Apologies for not being clear. This is a trimmed down table definition and some example data...

    CREATE TABLE [dbo].[ITEM_SUPERSESSION](

    [Superseeded_Item] [char](15) NOT NULL,

    [Superseeding_Item] [char](15) NOT NULL,

    CONSTRAINT [PK_ITEM_SUPERSESSION] PRIMARY KEY CLUSTERED

    ([Superseeded_Item] ASC,

    [Superseeding_Item] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    INSERT INTO [ITEM_SUPERSESSION]

    ([Superseeded_Item], [Superseeding_Item])

    SELECT 'A1','B2' UNION ALL

    SELECT 'B2','C3' UNION ALL

    SELECT 'C3','D4' UNION ALL

    SELECT 'E1','F2' UNION ALL

    SELECT 'F2','G3' UNION ALL

    SELECT 'G3','H4' UNION ALL

    SELECT 'H4','I5' UNION ALL

    SELECT 'I5','J6' UNION ALL

    SELECT 'J6','K7' UNION ALL

    SELECT 'X1','Y1'

    GO

    If possible, in this example I would like write a query to return the following -

    Superseeded_Item Final_Superseeding_Item

    'A1', 'D4'

    'B2', 'D4'

    'C3', 'D4'

    'E1', 'K7'

    'F2', 'K7'

    'G3', 'K7'

    'H4', 'K7'

    'I5', 'K7'

    'J6', 'K7'

    'X1', 'Y1'

  • As it turns out, I've been experimenting of late with product substitution tables and so far I have something that may work for you. Note that I have not finished trying to optimize this ugly baby for performance (you've been warned!).

    WITH X AS

    (

    SELECT [Superseeded_Item], [Superseeding_Item]

    FROM [ITEM_SUPERSESSION]

    UNION ALL

    SELECT [Superseeding_Item], [Superseeded_Item]

    FROM [ITEM_SUPERSESSION]

    )

    SELECT [Superseeded_Item], [Superseeding_Item]

    INTO #Items

    FROM

    (

    SELECT [Superseeded_Item], [Superseeding_Item]

    FROM X

    UNION

    SELECT a.[Superseeded_Item], b.[Superseeding_Item]

    FROM X a

    JOIN X b ON a.[Superseeding_Item] = b.[Superseeded_Item]

    WHERE a.[Superseeded_Item] <> b.[Superseeding_Item]

    ) a;

    WHILE @@ROWCOUNT <> 0

    INSERT INTO #Items

    SELECT a.[Superseeded_Item], b.[Superseeding_Item]

    FROM #Items a

    JOIN #Items b ON a.[Superseeding_Item] = b.[Superseeded_Item]

    WHERE a.[Superseeded_Item] <> b.[Superseeding_Item]

    EXCEPT -- Remove items already saved

    SELECT [Superseeded_Item], [Superseeding_Item]

    FROM #Items;

    WITH SuperseededItems AS

    (

    SELECT DISTINCT [Superseeded_Item]=MIN(b.[Superseeded_Item])

    FROM

    (

    SELECT [Superseeded_Item], [Superseeding_Item]

    ,rn=ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM #Items

    ) a

    CROSS APPLY

    (

    SELECT [Superseeded_Item], [Superseeding_Item]

    FROM #Items b

    WHERE a.[Superseeded_Item] IN (b.[Superseeded_Item], b.[Superseeding_Item])

    ) b

    GROUP BY rn

    )

    ,SuperseedingItems AS

    (

    SELECT DISTINCT [Superseeding_Item]=MAX(b.[Superseeding_Item])

    FROM

    (

    SELECT [Superseeded_Item], [Superseeding_Item]

    ,rn=ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM #Items

    ) a

    CROSS APPLY

    (

    SELECT [Superseeded_Item], [Superseeding_Item]

    FROM #Items b

    WHERE a.[Superseeding_Item] IN (b.[Superseeded_Item], b.[Superseeding_Item])

    ) b

    GROUP BY rn

    )

    SELECT [Superseeded_Item], [Superseeding_Item]

    FROM

    (

    SELECT [Superseeded_Item]

    ,rn=ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM SuperseededItems

    ) a

    JOIN

    (

    SELECT [Superseeding_Item]

    ,rn=ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM SuperseedingItems

    ) b ON a.rn = b.rn;

    GO

    DROP TABLE #Items;

    It may be an ugly baby but it's my ugly baby!

    Edit: Note also there is one other thing I have not fully tested. This approach may require that your item IDs be in ascending sequence as you progress through the chain. For your test data, this is the case.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • jend (11/20/2013)


    Apologies for not being clear. This is a trimmed down table definition and some example data...

    CREATE TABLE [dbo].[ITEM_SUPERSESSION](

    [Superseeded_Item] [char](15) NOT NULL,

    [Superseeding_Item] [char](15) NOT NULL,

    CONSTRAINT [PK_ITEM_SUPERSESSION] PRIMARY KEY CLUSTERED

    ([Superseeded_Item] ASC,

    [Superseeding_Item] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    INSERT INTO [ITEM_SUPERSESSION]

    ([Superseeded_Item], [Superseeding_Item])

    SELECT 'A1','B2' UNION ALL

    SELECT 'B2','C3' UNION ALL

    SELECT 'C3','D4' UNION ALL

    SELECT 'E1','F2' UNION ALL

    SELECT 'F2','G3' UNION ALL

    SELECT 'G3','H4' UNION ALL

    SELECT 'H4','I5' UNION ALL

    SELECT 'I5','J6' UNION ALL

    SELECT 'J6','K7' UNION ALL

    SELECT 'X1','Y1'

    GO

    If possible, in this example I would like write a query to return the following -

    Superseeded_Item Final_Superseeding_Item

    'A1', 'D4'

    'B2', 'D4'

    'C3', 'D4'

    'E1', 'K7'

    'F2', 'K7'

    'G3', 'K7'

    'H4', 'K7'

    'I5', 'K7'

    'J6', 'K7'

    'X1', 'Y1'

    First things first. In the following test setup, I reversed the column order of your PK to prevent some pretty nasty "accidental" internal cross-joins in the code that follows this...

    CREATE TABLE dbo.ITEM_SUPERSESSION

    (

    Superseeded_Item char(15) NOT NULL, --Parent

    Superseeding_Item char(15) NOT NULL, --Child

    CONSTRAINT PK_ITEM_SUPERSESSION PRIMARY KEY CLUSTERED

    ( --===== LOOK!!!! This has been reversed!!!!

    Superseeding_Item ASC,

    Superseeded_Item ASC

    )

    )

    ;

    GO

    INSERT INTO ITEM_SUPERSESSION

    (Superseeded_Item, Superseeding_Item)

    SELECT 'A1','B2' UNION ALL

    SELECT 'B2','C3' UNION ALL

    SELECT 'C3','D4' UNION ALL

    SELECT 'E1','F2' UNION ALL

    SELECT 'F2','G3' UNION ALL

    SELECT 'G3','H4' UNION ALL

    SELECT 'H4','I5' UNION ALL

    SELECT 'I5','J6' UNION ALL

    SELECT 'J6','K7' UNION ALL

    SELECT 'X1','Y1'

    ;

    GO

    This code produces the desired result by first finding all of the "Leaf Nodes" and then walking though the forest of 3 trees leaving breadcrumbs in the form of those "Leaf Nodes". Except for hierarchical solutions, I pretty much avoid recursion, so someone who works with it on a regular basis will likely make my code look pretty stupid for an execution plan standpoint.

    WITH

    cteLeafLevel AS

    ( --=== Find the "Leaf Level" where items have not been superseded

    SELECT Superseeded_Item, Superseeding_Item

    FROM dbo.ITEM_SUPERSESSION child

    WHERE Superseeding_Item NOT IN (SELECT Superseeded_Item FROM dbo.ITEM_SUPERSESSION)

    ),

    cteTraverse AS

    ( --=== Recursive CTE travels from "Leaf" to "Root" leaving the leaf level at all levels

    SELECT Superseeded_Item, Superseeding_Item

    FROM cteLeafLevel

    UNION ALL

    SELECT b.Superseeded_Item, cte.Superseeding_Item --This line is the key

    FROM dbo.ITEM_SUPERSESSION b

    INNER JOIN cteTraverse cte ON cte.Superseeded_Item = b.Superseeding_Item

    )

    SELECT Superseeded_Item, FinalSuperseeding_Item = Superseeding_Item

    FROM cteTraverse

    ORDER BY Superseeding_Item, Superseeded_Item

    ;

    Here are what I get for the results from all that.

    Superseeded_Item FinalSuperseeding_Item

    ---------------- ----------------------

    A1 D4

    B2 D4

    C3 D4

    E1 K7

    F2 K7

    G3 K7

    H4 K7

    I5 K7

    J6 K7

    X1 Y1

    (10 row(s) affected)

    The problem with all of that is the results aren't stored anywhere so you'd need to run this more than once AND you go through all that recursion to solve just one problem. The original data you provided is nothing more than an Adjacency List and, with a little inspiration from the following articles, you could do so much more.

    http://www.sqlservercentral.com/articles/Hierarchy/94040/

    http://www.sqlservercentral.com/articles/T-SQL/94570/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thats a massive help, thanks everyone and I will take a look at the articles you have suggested.

    thanks again

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply