July 30, 2009 at 3:32 am
Hello Friends,
I have fairly huge table having 1 million records. The table has a binary tree structure.
I want to write a procedure for getting all the children records for a particular parent record.
What will be more faster and efficient Guys. Recursive CTE or the use of temp table in recursive user defined Procedure???
Thanks
Bhavesh
July 30, 2009 at 7:14 am
bhavesh_183 (7/30/2009)
Hello Friends,I have fairly huge table having 1 million records. The table has a binary tree structure.
I want to write a procedure for getting all the children records for a particular parent record.
What will be more faster and efficient Guys. Recursive CTE or the use of temp table in recursive user defined Procedure???
Thanks
Bhavesh
You'll want to test it to be sure, but I'd say it would be the CTE. But there are alternatives that could be faster still. Do a search on the phrase "table of numbers" or "tally table." There are several solutions using this type of approach that can outperform recursive CTE's.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
July 30, 2009 at 7:50 am
I have a huge table with 1 million records with binary tree structure:
The simplified Table (tbl_tree1) with Binary Tree data is given below:
ID Name ParentID Age
1 A 0 21
2 B 1 25
3 C 1 25
4 D 2 26
5 E 2 27
6 F 3 30
7 G 3 19
8 H 4 18
9 I 4 24
10 J 5 37
11 K 5 26
12 L 6 24
13 M 6 28
14 N 7 21
15 O 7 20
I have written a procedure using recursive CTE to find out all the the children of a given parent ID. But the problem with this procedure is it takes 2 hr 10 min to retrieve details of the parentID having 107891 children. This code is given below:
ALTER PROCEDURE [dbo].[USP_BinaryTree]
(
@parentid INT
)
AS
BEGIN
SET NOCOUNT ON
BEGIN TRY
BEGIN
WITH Tree_CTE AS
(
SELECTID, Name, ParentID, Age
FROMtbl_tree1
WHEREParentID=@parentid
UNION ALL
SELECTt.ID, t.Name, t.ParentID, t.Age
FROMtbl_tree1 t
INNER JOIN Tree_CTE tcte ON tcte.id = t.parentID
)
SELECT * FROM Tree_CTE ORDER BY ID
END
END TRY
BEGIN CATCH
PRINT @@ERROR
END CATCH
END
Please can some one suggest more efficient and faster procedure coding for the same.
Thanks
Bhavesh
July 30, 2009 at 9:31 am
If the binary tree doesn't change very often (say, once a day), you can use a Nested Set model in parallel with your current Adjacency Model. Joe Celko has some pretty good articles on the Nested Set thing... it makes for SET BASED lookups for both the downline and upline that are faster than most can believe is possible with hierarchical data. I recommend you Google for "Nested Set Joe Celko".
--Jeff Moden
Change is inevitable... Change for the better is not.
July 30, 2009 at 10:28 am
Thanks Jeff. I wasn't sure whether or not to suggest the nested set approach. I don't have any experience with it, but I knew it was a possibility.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
July 30, 2009 at 11:45 am
Binary trees are also particularily good candidates for the "Path-Only" model.
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
July 30, 2009 at 11:48 am
Here's a link...
http://www.intelligententerprise.com/001020/celko.jhtml
Just be a little careful... one of the many articles on the Web for Nested Sets has two things to be slightly concerned with and I don't remember which one... one of them wipes out the adjacency model table as it builds the nested set and the other one needs a minor patch so that it will properly include the final node.
I have used Nested Set models in parallel with an Adjacency Model. I did them in parallel because the data didn't change much, Adjacency Models are (somehow) easier for humans to think about, and the incredible speed advantages of the Nested Set Model are just incredible. And when I say I did them in parallel, I mean in the same table... I added the required left and right "bowers" (that's what I call them... old Navy term for a particular type of anchor) to the adjacency model table.
--Jeff Moden
Change is inevitable... Change for the better is not.
July 30, 2009 at 12:03 pm
Here's a naive implementation of the Path-Only model with your data that demonstrates it's advantages:
create table Binary_tree (PathKey Varchar(30) primary key,[Name] varchar(20),Age int);
INSERT into binary_tree (PathKey, Name, Age)
SELECT '','A',21
UNION SELECT '0','B',25
UNION SELECT '1','C',25
UNION SELECT '00','D',26
UNION SELECT '01','E',27
UNION SELECT '10','F',30
UNION SELECT '11','G',19
UNION SELECT '000','H',18
UNION SELECT '001','I',24
UNION SELECT '010','J',37
UNION SELECT '011','K',26
UNION SELECT '100','L',24
UNION SELECT '101','M',28
UNION SELECT '110','N',21
UNION SELECT '111','O',20
--topologically sorted:
Select * From binary_tree order by PathKey
--all children of a node:
Select * From binary_tree Where PathKey Like '1%'
It's principle disadvantage is the size of the path key. If that becomes an issue, then there are ways (admittedly, obtuse ways) to compact it.
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
August 3, 2009 at 1:45 pm
There seems to be a performance threshold about 13 levels deep.
See http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=129754
N 56°04'39.16"
E 12°55'05.25"
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply