March 12, 2011 at 9:20 pm
I'm trying to write a recursive query that's not quite the same as the one in Jeff Moden's article. (Read that, and tried to modify it as necessary... shows I don't understand as much as I thought!)
The scenario I'm looking at is college courses and prerequisites. And before anybody asks, No, this is not a homework assignment. None of the Certification questions I have read has come close to this one <g>.
Okay... now that that's over... Here's the basic structure stuff.
USE [pig]
GO
/****** Object: Table [dbo].[Course] Script Date: 03/12/2011 22:03:30 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[Course](
[CourseNo] [char](3) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[CourseName] [varchar](30) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
CONSTRAINT [PK__Course__7C8480AE] PRIMARY KEY CLUSTERED
(
[CourseNo] ASC
)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
-- CREATE Prereq Table...
USE [pig]
GO
/****** Object: Table [dbo].[Prereq] Script Date: 03/12/2011 22:04:38 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[Prereq](
[nextCourseNo] [char](3) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[reqCourseNo] [char](3) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
PRIMARY KEY CLUSTERED
(
[nextCourseNo] ASC,
[reqCourseNo] ASC
)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
USE [pig]
GO
ALTER TABLE [dbo].[Prereq] WITH CHECK ADD CONSTRAINT [FK__Prereq__nextCour__7F60ED59] FOREIGN KEY([nextCourseNo])
REFERENCES [dbo].[Course] ([CourseNo])
GO
ALTER TABLE [dbo].[Prereq] WITH CHECK ADD CONSTRAINT [FK__Prereq__reqCours__00551192] FOREIGN KEY([reqCourseNo])
REFERENCES [dbo].[Course] ([CourseNo])
Some data...
INSERT INTO dbo.Course(CourseNo, CourseName) VALUES (503, 'Sys Analysis and Design');
INSERT INTO dbo.Course(CourseNo, CourseName) VALUES (601, 'C Programming');
INSERT INTO dbo.Course(CourseNo, CourseName) VALUES (605, 'Database I');
INSERT INTO dbo.Course(CourseNo, CourseName) VALUES (606, 'Database II');
INSERT INTO dbo.Course(CourseNo, CourseName) VALUES (607, 'Database III');
INSERT INTO dbo.Course(CourseNo, CourseName) VALUES (608, 'HCI');
INSERT INTO dbo.Course(CourseNo, CourseName) VALUES (620, 'OOAD');
INSERT INTO dbo.Prereq(nextCourseNo, reqCourseNo) VALUES(605,503);
INSERT INTO dbo.Prereq(nextCourseNo, reqCourseNo) VALUES(605,601);
INSERT INTO dbo.Prereq(nextCourseNo, reqCourseNo) VALUES(606,605);
INSERT INTO dbo.Prereq(nextCourseNo, reqCourseNo) VALUES(607,606);
INSERT INTO dbo.Prereq(nextCourseNo, reqCourseNo) VALUES(607,620);
INSERT INTO dbo.Prereq(nextCourseNo, reqCourseNo) VALUES(608,503);
INSERT INTO dbo.Prereq(nextCourseNo, reqCourseNo) VALUES(620,605);
INSERT INTO dbo.Prereq(nextCourseNo, reqCourseNo) VALUES(620,608);
What I wanted to end up with was a graph of prereqs... the courses with no prereqs would be level 1 (601, 503), then the courses with only those as prereqs would be the next level (605, 608)... and so on...
Here's my ill-fated attempt... (finish your drinks... don't want to be responsible for wet keyboards...)
USE pig;
GO
--- this is root or base. (All courses without prereqs)
WITH ctePrereqs AS
( -- get the "ROOT" row
SELECT CourseNo, CourseName, CourseLevel = 1
FROM dbo.Course AS c
WHERE NOT EXISTS (
SELECT 1
FROM dbo.Prereq
WHERE dbo.Prereq.nextcourseno = c.CourseNo)
UNION ALL
-- Recurse through each level of the hierarchy
SELECT cp.CourseNo, cp.CourseName, CourseLevel = CourseLevel + 1
FROM dbo.Course AS cp INNER JOIN ctePrereqs p ON cp.CourseNo = p.CourseNo
)
SELECT CourseNo, CourseName, CourseLevel
FROM ctePrereqs
ORDER BY CourseLevel, CourseNo;
the Root stuff works perfectly.... the rest fails. I get the Max Recursion error... so clearly my CTE is wrong... but how/where?
The basic logic
1. find all courses with no prerequisites. These are "level 1" (N=1)
2. recurse and find all courses that have only N courses as prerequisites. Label level 2. N=N+1
I'm sure I'm doing something stupid... I'll try to apply Jeff's directed graph stuff once I get this sorted...
Thanks!
Pieter
March 12, 2011 at 10:10 pm
Pieter,
Using the data you were kind enough to provide in a readily comsumable format, what do you expect for output?
--Jeff Moden
Change is inevitable... Change for the better is not.
March 12, 2011 at 10:11 pm
Ah! I may have figured it out.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 12, 2011 at 10:19 pm
Ah, fooii. The data doesn't form a "DAG" or "Directed Acyclic Graph". Instead, it has "cycles" in it... in other words, "children with more than one parent" and I'm not so hot in that area simply because I've not had to do such a thing before.
I know a couple of the heavy hitters on this site have done these types of hierarchies before. Hopefully, one of them will stop by and then we can both learn something.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 13, 2011 at 12:34 am
Jeff,
I would settle for something like this:
CourseNo, Level
503, 1
601, 1
605, 2
608, 2
620, 3
The last time I played with this (a LONG time ago!) I got sort of close to what I wanted (in Access, not SQL Server) by having a lot of outer joins between copies of Prereq. My result was something like
(CourseNo, ReqCourse0, ReqCourse1, ReqCourse2....)
where 0 required 1 required 2 etc. (all the subscripts above it).
Right now I'm toying with the idea of using a cursor... just because (1) they'll test me on it and (2) I want to see if it helps at all (doubt it...) I could start with the courses with no prereqs (Course LEFT JOIN Prereq), and then process a level at a time...
Might need to wait for daylight, though... my brain has had it!
Thanks for reading this!
Pieter
March 13, 2011 at 5:12 am
Here is what I cam up with based on your sample data:
with CoursePrereqs (
CourseName,
CourseNo,
reqCourseNo,
nextCourseNo
)as (
select
c.CourseName,
c.CourseNo,
p.reqCourseNo,
p.nextCourseNo
from
dbo.Course c
left outer join dbo.Prereq p
on (c.CourseNo = p.nextCourseNo)
),
CourseLevels (
CourseName,
CourseNo,
reqCourseNo,
nextCourseNo,
CourseLevel
) as (
select
CourseName,
CourseNo,
reqCourseNo,
nextCourseNo,
1 as CourseLevel
from
CoursePrereqs
where
nextCourseNo is null
union all
select
cp.CourseName,
cp.CourseNo,
cp.reqCourseNo,
cp.nextCourseNo,
CourseLevel + 1
from
CoursePrereqs cp
inner join CourseLevels cl
on (cl.CourseNo = cp.reqCourseNo)
)
select distinct
CourseName,
CourseNo,
CourseLevel
from
CourseLevels
order by
CourseLevel,
CourseNo
;
March 13, 2011 at 9:57 am
Wow, cool! With a minor tweak, it works a CHAMP!
with CoursePrereqs (
CourseName,
CourseNo,
reqCourseNo,
nextCourseNo
)as (
select
c.CourseName,
c.CourseNo,
p.reqCourseNo,
p.nextCourseNo
from
dbo.Course c
left outer join dbo.Prereq p
on (c.CourseNo = p.nextCourseNo)
),
CourseLevels (
CourseName,
CourseNo,
reqCourseNo,
nextCourseNo,
CourseLevel
) as (
select
CourseName,
CourseNo,
reqCourseNo,
nextCourseNo,
1 as CourseLevel
from
CoursePrereqs
where
nextCourseNo is null
union all
select
cp.CourseName,
cp.CourseNo,
cp.reqCourseNo,
cp.nextCourseNo,
CourseLevel + 1
from
CoursePrereqs cp
inner join CourseLevels cl
on (cl.CourseNo = cp.reqCourseNo)
)
selectCourseName,
CourseNo,
MIN(CourseLevel) As Sequence
from
CourseLevels
group by
CourseName,
CourseNo
order by
Sequence ASC;
The Answer (the sequence in which one needs to take courses):
Sys Analysis and Design,503,1
C Programming,601,1
Database I,605,2
HCI,608,2
OO Analysis and Design,620,3
Database II,606,3
Database III,607,4
Now all I gotta do is get my head around how it all works. Thanks, Jedi master!
March 13, 2011 at 10:09 am
Glad I was able to help. Also, thank you for posting your final solution.
March 13, 2011 at 3:19 pm
pietlinden (3/13/2011)
Now all I gotta do is get my head around how it all works. Thanks, Jedi master!
Me too. I always seem to lose my mind on non-DAG hierarchies. I'm going to have to study this. Thanks for the code, Lynn.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 13, 2011 at 4:23 pm
Jeff Moden (3/13/2011)
pietlinden (3/13/2011)
Now all I gotta do is get my head around how it all works. Thanks, Jedi master!Me too. I always seem to lose my mind on non-DAG hierarchies. I'm going to have to study this. Thanks for the code, Lynn.
I can't take all the credit, I had help, aka Books Online. I just followed the example there base on the employee data in AdventureWorks. But, isn't that what we experts do, use other examples to see how they can be used to meet other problems?
March 13, 2011 at 5:30 pm
My problem wasn't with making a hierarchical query according to BOL... my problem is that the data the OP provided is cyclic and the examples in BOL don't show how to solve that particular problem. Well done on your part. Now, I have to see what's up with your code to find out how you defeated the cyclic nature of the data. 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.
March 13, 2011 at 6:13 pm
Wup-up-OH!! I get it and learned something very new. The data is, in fact, a DAG but it sure didn't look like it. What's cool about all of this to me is, I see how a node can have multiple parents in the data without being cyclic.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 13, 2011 at 8:13 pm
Jeff Moden (3/13/2011)
My problem wasn't with making a hierarchical query according to BOL... my problem is that the data the OP provided is cyclic and the examples in BOL don't show how to solve that particular problem. Well done on your part. Now, I have to see what's up with your code to find out how you defeated the cyclic nature of the data. 🙂
Jeff, didn't say that that was your problem. I only stated that I used the example in the book to write the code I came up with to solve the problem, at least partially since the OP tweaked it a bit for the final solution.
Thinking about it, it may need some more tweaking if you think about it. What if a course has two prerequisites each at different levels. Wouldn't you want it listed after the highest prereq of the pair?
March 13, 2011 at 8:15 pm
I'm just having a bit of fun with this data because I never worked with double-parent data before. Here's the code I'm using. It's just a reformatted copy of Lynn's code with some modifications to create the Hierarchical Path (which actually explains a whole lot about what's going on).
--===== This is a GREAT example of how to do multiple parents in a hierarchy
-- without it being "cyclic". Now all I need to do is figure out how to
-- constrain it so it's NEVER "cyclic".
--Here's what the graph for this exercise looks like...
/*
/---\ /--- |601| |503|
\---/ \---/
| | |
| +-----+ |
| | |
/---\ /--- |605| |608|
\---/ \---/
| | |
| +-----+ |
| | |
/---\ /--- |606| |620|
\---/ \---/
| |
| +-------+
| |
/--- |607|
\---/
*/
USE TempDB
DROP TABLE dbo.Prereq, dbo.Course
GO
--===== A simple table to contain course number and name.
-- I changes all references to courseNo to be INT instead of VARCHAR(3)
CREATE TABLE dbo.Course
(
CourseNo INT NOT NULL,
CourseName VARCHAR(30) NOT NULL,
CONSTRAINT PK_Course PRIMARY KEY CLUSTERED (CourseNo)
)
;
--===== This forms the relation ships.
-- The PK ensures no duplicates but there's currently not a constraint
-- to prevent "cycles".
CREATE TABLE dbo.Prereq
(
NextCourseNo INT NOT NULL,
ReqCourseNo INT NOT NULL,
CONSTRAINT PK_Prereq PRIMARY KEY CLUSTERED (NextCourseNo, ReqCourseNo)
)
;
--===== Here's some additional constraints.
-- All they do is check to make sure that everything entered is in
-- the course table
ALTER TABLE dbo.Prereq
ADD CONSTRAINT FK_Prereq_Course_NextCourseNo
FOREIGN KEY(NextCourseNo)
REFERENCES dbo.Course (CourseNo)
;
ALTER TABLE dbo.Prereq
ADD CONSTRAINT FK_Prereq_Course_ReqCourseNo
FOREIGN KEY(ReqCourseNo)
REFERENCES dbo.Course (CourseNo)
;
--===== Ok, now populate the test tables with data
INSERT INTO dbo.Course
(CourseNo, CourseName)
SELECT 503, 'Sys Analysis and Design' UNION ALL
SELECT 601, 'C Programming' UNION ALL
SELECT 605, 'Database I' UNION ALL
SELECT 606, 'Database II' UNION ALL
SELECT 607, 'Database III' UNION ALL
SELECT 608, 'HCI' UNION ALL
SELECT 620, 'OOAD'
;
--===== This is the "relation" table.
-- Again, there are no constraints to prevent this from being "cyclic"
-- and I need to figure that out.
INSERT INTO dbo.Prereq
(NextCourseNo, ReqCourseNo)
SELECT 605,503 UNION ALL --Double Parent
SELECT 605,601 UNION ALL --Double Parent
SELECT 606,605 UNION ALL
SELECT 607,606 UNION ALL --Double Parent
SELECT 607,620 UNION ALL --Double Parent
SELECT 608,503 UNION ALL
SELECT 620,605 UNION ALL --Double Parent
SELECT 620,608 --Double Parent
;
--------------------------------------------------------------------------------------------------
WITH
cteCoursePrereqs AS
( --=== Think of this CTE as building the true adjacency list.
SELECT c.CourseName, c.CourseNo, p.ReqCourseNo, p.NextCourseNo
FROM dbo.Course c
LEFT OUTER JOIN dbo.Prereq p ON c.CourseNo = p.NextCourseNo
),
cteCourseLevels AS
( --=== The rest is simple if the data is "acyclic"... just do the same ol' recursive CTE that everyone does
-- to "layer" the hierarchy and dump it into a Hierarchical Path.
SELECT CourseName = CAST(CourseName AS VARCHAR(30)),
CourseNo,
ReqCourseNo,
--NextCourseNo,
CourseLevel = 1,
HPath = CAST(CourseNo AS VARCHAR(30))
FROM cteCoursePrereqs
WHERE NextCourseNo IS NULL
UNION ALL
SELECT CourseName = CAST(SPACE(cl.CourseLevel*4)+cp.CourseName AS VARCHAR(30)),
CourseNo = cp.CourseNo,
ReqCourseNo = cp.ReqCourseNo,
--NextCourseNo = cp.NextCourseNo,
CourseLevel = cl.CourseLevel + 1,
HPath = CAST(cl.HPATH + '\' + CAST(cp.CourseNo AS VARCHAR(30)) AS VARCHAR(30))
FROM cteCoursePrereqs cp
INNER JOIN cteCourseLevels cl ON cl.CourseNo = cp.ReqCourseNo
)
SELECT * FROM cteCourseLevels ORDER BY Hpath
Here's the output with the current test data...
CourseName CourseNo ReqCourseNo CourseLevel HPath
------------------------------ ----------- ----------- ----------- ------------------------------
Sys Analysis and Design 503 NULL 1 503
Database I 605 503 2 503\605
Database II 606 605 3 503\605\606
Database III 607 606 4 503\605\606\607
OOAD 620 605 3 503\605\620
Database III 607 620 4 503\605\620\607
HCI 608 503 2 503\608
OOAD 620 608 3 503\608\620
Database III 607 620 4 503\608\620\607
C Programming 601 NULL 1 601
Database I 605 601 2 601\605
Database II 606 605 3 601\605\606
Database III 607 606 4 601\605\606\607
OOAD 620 605 3 601\605\620
Database III 607 620 4 601\605\620\607
;
--Jeff Moden
Change is inevitable... Change for the better is not.
March 13, 2011 at 8:17 pm
Lynn Pettis (3/13/2011)
Jeff Moden (3/13/2011)
My problem wasn't with making a hierarchical query according to BOL... my problem is that the data the OP provided is cyclic and the examples in BOL don't show how to solve that particular problem. Well done on your part. Now, I have to see what's up with your code to find out how you defeated the cyclic nature of the data. 🙂Jeff, didn't say that that was your problem. I only stated that I used the example in the book to write the code I came up with to solve the problem, at least partially since the OP tweaked it a bit for the final solution.
Thinking about it, it may need some more tweaking if you think about it. What if a course has two prerequisites each at different levels. Wouldn't you want it listed after the highest prereq of the pair?
Heh... I didn't say you said that was my problem (although I can see how you may have taken it that way). 😉 I was just making a comment in conversation. BWAA-HAA!!! It must be all the Oracle you've had to write lately. 😛
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 1 through 15 (of 18 total)
You must be logged in to reply to this topic. Login to reply