June 4, 2009 at 10:00 am
lxz20 (6/4/2009)
Yep, NodeIDs are repeated here as examples to show infinite loops in 0, 1, or 2 steps. These loops are actually easier to catch and removed. There are other instances where a node is NOT looped back to itself until like 7 or 10 steps later. I would like a non-cursor way to list / remove all those n step loops, if possible, for performance reason. Currently I have to use a cursor, going through row by row following the following pseudo-code. The performance is NOT good because I have to use 2 loops:Put all data into a cursor
Begin cursor loop
Fetch a row into a temp table
Inner While Loop (while there is a new parent)
Follow the chain to get a new parent row
If parentNodeID = nodeID in temp table
Infinite Loop -- do something here
Break Inner While Loop
Else
Insert parent row into the same temp table
End Inner While Loop
End corsor loop
Try this
WITH UniqueNodeItem(NodeID, ParentNodeID) AS
(
SELECT NodeID,MAX(ParentNodeID)
FROM NodeItem
GROUP BY NodeID
HAVING COUNT(*)=1),
HierarchyTree (NodeID, ParentNodeID, Step) AS
(
SELECT NodeID, ParentNodeID, 1
FROM UniqueNodeItem
WHERE ParentNodeID NOT IN (SELECT NodeID FROM UniqueNodeItem) OR ParentNodeID IS NULL
UNION ALL
SELECT r.NodeID, r.ParentNodeID, t.Step + 1
FROM UniqueNodeItem r
inner join HierarchyTree t on r.ParentNodeID = t.NodeID
)
SELECT NodeID
FROM NodeItem
WHERE NodeID NOT IN (SELECT NodeID FROM HierarchyTree)
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537June 4, 2009 at 1:03 pm
Mark,
Excellent solution. It works!
June 4, 2009 at 2:15 pm
One thing I always recommend on this kind of hierarchy, is have a "TopNodeID" type column. All levels in the same hierarchy have the same entry in that column. Speeds them up immensely. Makes it easy to prevent/fix these type of corruption issues too, since you can join on the TopNodeID column and find the loops right there. But since you don't have that, it doesn't help in this situation.
What we did in our one and only hierarchichal-type of table (a stock exchange sector/index hierarchy with lots of roots per stock exchange) is that we have a separate table that identifies the root nodes. On top of that the INSERT/UPDATE Triggers prevent this kind of situation ever occurring. Yeah, triggers are a pain in the butt, they are however bloody helpful in this case of problem.
Won't help with your existing data mess though...
June 4, 2009 at 2:23 pm
GSquared (6/4/2009)
You could potentially avoid a cursor in this. But it'll take more work and be less efficient.Cursors do have their uses. Limited, yes, but uses nonetheless.
Especially in a case like this. A once-off clean-up where you actually really have to think procedurally in order to wrap your head around the solution for solving a problem isn't something that needs to concern itself with performance issues. Unless of course the hierarchies regularly get stuffed up. But then the issue is elsewhere, i.e. you must find the culprit and fix that one. Not meaning that you shouldn't put safeguards in place to protect your data.
Viewing 4 posts - 16 through 18 (of 18 total)
You must be logged in to reply to this topic. Login to reply