Trouble with CTE

  • I'm trying to write a recursive query that's not quite the same as the one in Jeff Moden's article. (Read that, and tried to modify it as necessary... shows I don't understand as much as I thought!)

    The scenario I'm looking at is college courses and prerequisites. And before anybody asks, No, this is not a homework assignment. None of the Certification questions I have read has come close to this one <g>.

    Okay... now that that's over... Here's the basic structure stuff.

    USE [pig]

    GO

    /****** Object: Table [dbo].[Course] Script Date: 03/12/2011 22:03:30 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    SET ANSI_PADDING ON

    GO

    CREATE TABLE [dbo].[Course](

    [CourseNo] [char](3) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,

    [CourseName] [varchar](30) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,

    CONSTRAINT [PK__Course__7C8480AE] PRIMARY KEY CLUSTERED

    (

    [CourseNo] ASC

    )WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    SET ANSI_PADDING OFF

    -- CREATE Prereq Table...

    USE [pig]

    GO

    /****** Object: Table [dbo].[Prereq] Script Date: 03/12/2011 22:04:38 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    SET ANSI_PADDING ON

    GO

    CREATE TABLE [dbo].[Prereq](

    [nextCourseNo] [char](3) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,

    [reqCourseNo] [char](3) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,

    PRIMARY KEY CLUSTERED

    (

    [nextCourseNo] ASC,

    [reqCourseNo] ASC

    )WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    SET ANSI_PADDING OFF

    GO

    USE [pig]

    GO

    ALTER TABLE [dbo].[Prereq] WITH CHECK ADD CONSTRAINT [FK__Prereq__nextCour__7F60ED59] FOREIGN KEY([nextCourseNo])

    REFERENCES [dbo].[Course] ([CourseNo])

    GO

    ALTER TABLE [dbo].[Prereq] WITH CHECK ADD CONSTRAINT [FK__Prereq__reqCours__00551192] FOREIGN KEY([reqCourseNo])

    REFERENCES [dbo].[Course] ([CourseNo])

    Some data...

    INSERT INTO dbo.Course(CourseNo, CourseName) VALUES (503, 'Sys Analysis and Design');

    INSERT INTO dbo.Course(CourseNo, CourseName) VALUES (601, 'C Programming');

    INSERT INTO dbo.Course(CourseNo, CourseName) VALUES (605, 'Database I');

    INSERT INTO dbo.Course(CourseNo, CourseName) VALUES (606, 'Database II');

    INSERT INTO dbo.Course(CourseNo, CourseName) VALUES (607, 'Database III');

    INSERT INTO dbo.Course(CourseNo, CourseName) VALUES (608, 'HCI');

    INSERT INTO dbo.Course(CourseNo, CourseName) VALUES (620, 'OOAD');

    INSERT INTO dbo.Prereq(nextCourseNo, reqCourseNo) VALUES(605,503);

    INSERT INTO dbo.Prereq(nextCourseNo, reqCourseNo) VALUES(605,601);

    INSERT INTO dbo.Prereq(nextCourseNo, reqCourseNo) VALUES(606,605);

    INSERT INTO dbo.Prereq(nextCourseNo, reqCourseNo) VALUES(607,606);

    INSERT INTO dbo.Prereq(nextCourseNo, reqCourseNo) VALUES(607,620);

    INSERT INTO dbo.Prereq(nextCourseNo, reqCourseNo) VALUES(608,503);

    INSERT INTO dbo.Prereq(nextCourseNo, reqCourseNo) VALUES(620,605);

    INSERT INTO dbo.Prereq(nextCourseNo, reqCourseNo) VALUES(620,608);

    What I wanted to end up with was a graph of prereqs... the courses with no prereqs would be level 1 (601, 503), then the courses with only those as prereqs would be the next level (605, 608)... and so on...

    Here's my ill-fated attempt... (finish your drinks... don't want to be responsible for wet keyboards...)

    USE pig;

    GO

    --- this is root or base. (All courses without prereqs)

    WITH ctePrereqs AS

    ( -- get the "ROOT" row

    SELECT CourseNo, CourseName, CourseLevel = 1

    FROM dbo.Course AS c

    WHERE NOT EXISTS (

    SELECT 1

    FROM dbo.Prereq

    WHERE dbo.Prereq.nextcourseno = c.CourseNo)

    UNION ALL

    -- Recurse through each level of the hierarchy

    SELECT cp.CourseNo, cp.CourseName, CourseLevel = CourseLevel + 1

    FROM dbo.Course AS cp INNER JOIN ctePrereqs p ON cp.CourseNo = p.CourseNo

    )

    SELECT CourseNo, CourseName, CourseLevel

    FROM ctePrereqs

    ORDER BY CourseLevel, CourseNo;

    the Root stuff works perfectly.... the rest fails. I get the Max Recursion error... so clearly my CTE is wrong... but how/where?

    The basic logic

    1. find all courses with no prerequisites. These are "level 1" (N=1)

    2. recurse and find all courses that have only N courses as prerequisites. Label level 2. N=N+1

    I'm sure I'm doing something stupid... I'll try to apply Jeff's directed graph stuff once I get this sorted...

    Thanks!

    Pieter

  • Pieter,

    Using the data you were kind enough to provide in a readily comsumable format, what do you expect for output?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ah! I may have figured it out.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ah, fooii. The data doesn't form a "DAG" or "Directed Acyclic Graph". Instead, it has "cycles" in it... in other words, "children with more than one parent" and I'm not so hot in that area simply because I've not had to do such a thing before.

    I know a couple of the heavy hitters on this site have done these types of hierarchies before. Hopefully, one of them will stop by and then we can both learn something.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

    I would settle for something like this:

    CourseNo, Level

    503, 1

    601, 1

    605, 2

    608, 2

    620, 3

    The last time I played with this (a LONG time ago!) I got sort of close to what I wanted (in Access, not SQL Server) by having a lot of outer joins between copies of Prereq. My result was something like

    (CourseNo, ReqCourse0, ReqCourse1, ReqCourse2....)

    where 0 required 1 required 2 etc. (all the subscripts above it).

    Right now I'm toying with the idea of using a cursor... just because (1) they'll test me on it and (2) I want to see if it helps at all (doubt it...) I could start with the courses with no prereqs (Course LEFT JOIN Prereq), and then process a level at a time...

    Might need to wait for daylight, though... my brain has had it!

    Thanks for reading this!

    Pieter

  • Here is what I cam up with based on your sample data:

    with CoursePrereqs (

    CourseName,

    CourseNo,

    reqCourseNo,

    nextCourseNo

    )as (

    select

    c.CourseName,

    c.CourseNo,

    p.reqCourseNo,

    p.nextCourseNo

    from

    dbo.Course c

    left outer join dbo.Prereq p

    on (c.CourseNo = p.nextCourseNo)

    ),

    CourseLevels (

    CourseName,

    CourseNo,

    reqCourseNo,

    nextCourseNo,

    CourseLevel

    ) as (

    select

    CourseName,

    CourseNo,

    reqCourseNo,

    nextCourseNo,

    1 as CourseLevel

    from

    CoursePrereqs

    where

    nextCourseNo is null

    union all

    select

    cp.CourseName,

    cp.CourseNo,

    cp.reqCourseNo,

    cp.nextCourseNo,

    CourseLevel + 1

    from

    CoursePrereqs cp

    inner join CourseLevels cl

    on (cl.CourseNo = cp.reqCourseNo)

    )

    select distinct

    CourseName,

    CourseNo,

    CourseLevel

    from

    CourseLevels

    order by

    CourseLevel,

    CourseNo

    ;

  • Wow, cool! With a minor tweak, it works a CHAMP!

    with CoursePrereqs (

    CourseName,

    CourseNo,

    reqCourseNo,

    nextCourseNo

    )as (

    select

    c.CourseName,

    c.CourseNo,

    p.reqCourseNo,

    p.nextCourseNo

    from

    dbo.Course c

    left outer join dbo.Prereq p

    on (c.CourseNo = p.nextCourseNo)

    ),

    CourseLevels (

    CourseName,

    CourseNo,

    reqCourseNo,

    nextCourseNo,

    CourseLevel

    ) as (

    select

    CourseName,

    CourseNo,

    reqCourseNo,

    nextCourseNo,

    1 as CourseLevel

    from

    CoursePrereqs

    where

    nextCourseNo is null

    union all

    select

    cp.CourseName,

    cp.CourseNo,

    cp.reqCourseNo,

    cp.nextCourseNo,

    CourseLevel + 1

    from

    CoursePrereqs cp

    inner join CourseLevels cl

    on (cl.CourseNo = cp.reqCourseNo)

    )

    selectCourseName,

    CourseNo,

    MIN(CourseLevel) As Sequence

    from

    CourseLevels

    group by

    CourseName,

    CourseNo

    order by

    Sequence ASC;

    The Answer (the sequence in which one needs to take courses):

    Sys Analysis and Design,503,1

    C Programming,601,1

    Database I,605,2

    HCI,608,2

    OO Analysis and Design,620,3

    Database II,606,3

    Database III,607,4

    Now all I gotta do is get my head around how it all works. Thanks, Jedi master!

  • Glad I was able to help. Also, thank you for posting your final solution.

  • pietlinden (3/13/2011)


    Now all I gotta do is get my head around how it all works. Thanks, Jedi master!

    Me too. I always seem to lose my mind on non-DAG hierarchies. I'm going to have to study this. Thanks for the code, Lynn.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/13/2011)


    pietlinden (3/13/2011)


    Now all I gotta do is get my head around how it all works. Thanks, Jedi master!

    Me too. I always seem to lose my mind on non-DAG hierarchies. I'm going to have to study this. Thanks for the code, Lynn.

    I can't take all the credit, I had help, aka Books Online. I just followed the example there base on the employee data in AdventureWorks. But, isn't that what we experts do, use other examples to see how they can be used to meet other problems?

  • My problem wasn't with making a hierarchical query according to BOL... my problem is that the data the OP provided is cyclic and the examples in BOL don't show how to solve that particular problem. Well done on your part. Now, I have to see what's up with your code to find out how you defeated the cyclic nature of the data. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Wup-up-OH!! I get it and learned something very new. The data is, in fact, a DAG but it sure didn't look like it. What's cool about all of this to me is, I see how a node can have multiple parents in the data without being cyclic.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/13/2011)


    My problem wasn't with making a hierarchical query according to BOL... my problem is that the data the OP provided is cyclic and the examples in BOL don't show how to solve that particular problem. Well done on your part. Now, I have to see what's up with your code to find out how you defeated the cyclic nature of the data. 🙂

    Jeff, didn't say that that was your problem. I only stated that I used the example in the book to write the code I came up with to solve the problem, at least partially since the OP tweaked it a bit for the final solution.

    Thinking about it, it may need some more tweaking if you think about it. What if a course has two prerequisites each at different levels. Wouldn't you want it listed after the highest prereq of the pair?

  • I'm just having a bit of fun with this data because I never worked with double-parent data before. Here's the code I'm using. It's just a reformatted copy of Lynn's code with some modifications to create the Hierarchical Path (which actually explains a whole lot about what's going on).

    --===== This is a GREAT example of how to do multiple parents in a hierarchy

    -- without it being "cyclic". Now all I need to do is figure out how to

    -- constrain it so it's NEVER "cyclic".

    --Here's what the graph for this exercise looks like...

    /*

    /---\ /--- |601| |503|

    \---/ \---/

    | | |

    | +-----+ |

    | | |

    /---\ /--- |605| |608|

    \---/ \---/

    | | |

    | +-----+ |

    | | |

    /---\ /--- |606| |620|

    \---/ \---/

    | |

    | +-------+

    | |

    /--- |607|

    \---/

    */

    USE TempDB

    DROP TABLE dbo.Prereq, dbo.Course

    GO

    --===== A simple table to contain course number and name.

    -- I changes all references to courseNo to be INT instead of VARCHAR(3)

    CREATE TABLE dbo.Course

    (

    CourseNo INT NOT NULL,

    CourseName VARCHAR(30) NOT NULL,

    CONSTRAINT PK_Course PRIMARY KEY CLUSTERED (CourseNo)

    )

    ;

    --===== This forms the relation ships.

    -- The PK ensures no duplicates but there's currently not a constraint

    -- to prevent "cycles".

    CREATE TABLE dbo.Prereq

    (

    NextCourseNo INT NOT NULL,

    ReqCourseNo INT NOT NULL,

    CONSTRAINT PK_Prereq PRIMARY KEY CLUSTERED (NextCourseNo, ReqCourseNo)

    )

    ;

    --===== Here's some additional constraints.

    -- All they do is check to make sure that everything entered is in

    -- the course table

    ALTER TABLE dbo.Prereq

    ADD CONSTRAINT FK_Prereq_Course_NextCourseNo

    FOREIGN KEY(NextCourseNo)

    REFERENCES dbo.Course (CourseNo)

    ;

    ALTER TABLE dbo.Prereq

    ADD CONSTRAINT FK_Prereq_Course_ReqCourseNo

    FOREIGN KEY(ReqCourseNo)

    REFERENCES dbo.Course (CourseNo)

    ;

    --===== Ok, now populate the test tables with data

    INSERT INTO dbo.Course

    (CourseNo, CourseName)

    SELECT 503, 'Sys Analysis and Design' UNION ALL

    SELECT 601, 'C Programming' UNION ALL

    SELECT 605, 'Database I' UNION ALL

    SELECT 606, 'Database II' UNION ALL

    SELECT 607, 'Database III' UNION ALL

    SELECT 608, 'HCI' UNION ALL

    SELECT 620, 'OOAD'

    ;

    --===== This is the "relation" table.

    -- Again, there are no constraints to prevent this from being "cyclic"

    -- and I need to figure that out.

    INSERT INTO dbo.Prereq

    (NextCourseNo, ReqCourseNo)

    SELECT 605,503 UNION ALL --Double Parent

    SELECT 605,601 UNION ALL --Double Parent

    SELECT 606,605 UNION ALL

    SELECT 607,606 UNION ALL --Double Parent

    SELECT 607,620 UNION ALL --Double Parent

    SELECT 608,503 UNION ALL

    SELECT 620,605 UNION ALL --Double Parent

    SELECT 620,608 --Double Parent

    ;

    --------------------------------------------------------------------------------------------------

    WITH

    cteCoursePrereqs AS

    ( --=== Think of this CTE as building the true adjacency list.

    SELECT c.CourseName, c.CourseNo, p.ReqCourseNo, p.NextCourseNo

    FROM dbo.Course c

    LEFT OUTER JOIN dbo.Prereq p ON c.CourseNo = p.NextCourseNo

    ),

    cteCourseLevels AS

    ( --=== The rest is simple if the data is "acyclic"... just do the same ol' recursive CTE that everyone does

    -- to "layer" the hierarchy and dump it into a Hierarchical Path.

    SELECT CourseName = CAST(CourseName AS VARCHAR(30)),

    CourseNo,

    ReqCourseNo,

    --NextCourseNo,

    CourseLevel = 1,

    HPath = CAST(CourseNo AS VARCHAR(30))

    FROM cteCoursePrereqs

    WHERE NextCourseNo IS NULL

    UNION ALL

    SELECT CourseName = CAST(SPACE(cl.CourseLevel*4)+cp.CourseName AS VARCHAR(30)),

    CourseNo = cp.CourseNo,

    ReqCourseNo = cp.ReqCourseNo,

    --NextCourseNo = cp.NextCourseNo,

    CourseLevel = cl.CourseLevel + 1,

    HPath = CAST(cl.HPATH + '\' + CAST(cp.CourseNo AS VARCHAR(30)) AS VARCHAR(30))

    FROM cteCoursePrereqs cp

    INNER JOIN cteCourseLevels cl ON cl.CourseNo = cp.ReqCourseNo

    )

    SELECT * FROM cteCourseLevels ORDER BY Hpath

    Here's the output with the current test data...

    CourseName CourseNo ReqCourseNo CourseLevel HPath

    ------------------------------ ----------- ----------- ----------- ------------------------------

    Sys Analysis and Design 503 NULL 1 503

    Database I 605 503 2 503\605

    Database II 606 605 3 503\605\606

    Database III 607 606 4 503\605\606\607

    OOAD 620 605 3 503\605\620

    Database III 607 620 4 503\605\620\607

    HCI 608 503 2 503\608

    OOAD 620 608 3 503\608\620

    Database III 607 620 4 503\608\620\607

    C Programming 601 NULL 1 601

    Database I 605 601 2 601\605

    Database II 606 605 3 601\605\606

    Database III 607 606 4 601\605\606\607

    OOAD 620 605 3 601\605\620

    Database III 607 620 4 601\605\620\607

    ;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Lynn Pettis (3/13/2011)


    Jeff Moden (3/13/2011)


    My problem wasn't with making a hierarchical query according to BOL... my problem is that the data the OP provided is cyclic and the examples in BOL don't show how to solve that particular problem. Well done on your part. Now, I have to see what's up with your code to find out how you defeated the cyclic nature of the data. 🙂

    Jeff, didn't say that that was your problem. I only stated that I used the example in the book to write the code I came up with to solve the problem, at least partially since the OP tweaked it a bit for the final solution.

    Thinking about it, it may need some more tweaking if you think about it. What if a course has two prerequisites each at different levels. Wouldn't you want it listed after the highest prereq of the pair?

    Heh... I didn't say you said that was my problem (although I can see how you may have taken it that way). 😉 I was just making a comment in conversation. BWAA-HAA!!! It must be all the Oracle you've had to write lately. 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply