August 26, 2009 at 10:47 pm
Comments posted to this topic are about the item SQL Server 2008 T-SQL Enhancements Part - III
August 27, 2009 at 4:17 am
great article....thank you very much.....the simplicity of the language is greatly appreciated...do come up with more of such articles...thank you
August 27, 2009 at 7:17 am
Great article! Thanks!
Little hint for .NET developers:
New .NET wrapper types SqlHierarcyId, SqlGeography, SqlGeometry are not stored in System.Data.dll but in "Microsoft.SqlServer.Types.dll" in "c:\Program Files\Microsoft SQL Server\100\SDK\Assemblies\" directory. After adding the reference to your project you can find them in namespace "Microsoft.SqlServer.Types".
Greets
Flo
August 27, 2009 at 7:24 am
Thanks for the article, makes me look forward to our upcoming migration from 2000 to 2008.
However, I have to ask why you would actually try to use recursion for the 2000 example? Especially with the well-known issues around nesting, why provide such a solution? When I need to do hierarchies, I use something more like this:
create procedure EmployeeHierarchy
@EmpID int
as
declare @out table (EmpID int, EmpName varchar(255), ReportsTo int)
insert into @out
selectEmpID, EmpName, ReportsTo
fromEmployees_2000_2005
whereEmpID = @EmpID
while @@rowcount > 0
insert into @out
selectEmpID, EmpName, ReportsTo
fromEmployees_2000_2005
whereReportsTo in (select EmpID from @out)
and EmpID not in (select EmpID from @out)
select o.EmpID, o.EmpName, e.EmpName Manager
from @out o
inner joinEmployees_2000_2005 e
ono.ReportsTo = e.EmpID
order by 1
go
A routine like this self-limits, so no infinite loops in cases when somebody messes up the list. No problems with recursion levels either. If the @out table gets too big, convert it to a temp table with indexes.
Or am I just missing something?
Again, thanks for the article!
August 27, 2009 at 8:56 am
Arshad Ali,
Thank you for an excellent, well-written article! (Note: the code to populate your sample table is missing the declares for @CTO, etc.)
gerald.drouin,
Clever solution!
August 27, 2009 at 9:49 am
I wasn't aware of this addition to SQL 2008 until I read your article. It's interesting but somewhat kludgey.
I played around with hierarchyid after reading this article and discovered it can be surprising easy to mess up records. Specifically, the GetReparentedValue doesn't seem to be very useful if you are moving children from one parent to another parent that already has children. It will actually assign duplicate hierarchyid's.
Here's an example of it happening.
USE [Test]
GO
/****** Object: Table [dbo].[hier_test1] Script Date: 08/27/2009 11:14:30 ******/
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[hier_test1]') AND type in (N'U'))
DROP TABLE [dbo].[hier_test1]
GO
USE [Test]
GO
/****** Object: Table [dbo].[hier_test1] Script Date: 08/27/2009 11:14:30 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[hier_test1](
[record_id] [int] IDENTITY(1,1) NOT NULL,
[data1] [nvarchar](50) NULL,
[data2] [nvarchar](2000) NULL,
[hier_id] [hierarchyid] NOT NULL,
[position] [nvarchar](50) NULL,
CONSTRAINT [PK_hier_test1] PRIMARY KEY CLUSTERED
(
[record_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
insert into Test.dbo.hier_test1(
data1,
data2,
hier_id
)VALUES(
'/',
'This is the root record',
hierarchyid::GetRoot()
)
exec AddHierRecord 1, '/1', 'Child of Root'
exec AddHierRecord 1, '/2', 'Child of Root'
exec AddHierRecord 1, '/3', 'Child of Root'
exec AddHierRecord 2, '/1/1', 'Child of Child 1'
exec AddHierRecord 2, '/1/2', 'Child of Child 1'
exec AddHierRecord 2, '/1/3', 'Child of Child 1'
exec AddHierRecord 3, '/2/1', 'Child of Child 2'
exec AddHierRecord 3, '/2/2', 'Child of Child 2'
exec AddHierRecord 3, '/2/3', 'Child of Child 2'
exec AddHierRecord 4, '/3/1', 'Child of Child 3'
exec AddHierRecord 4, '/3/2', 'Child of Child 3'
exec AddHierRecord 4, '/3/3', 'Child of Child 3'
exec AddHierRecord 5, '/1/1/1', 'Child of Child 1 of Child 1'
exec AddHierRecord 5, '/1/1/2', 'Child of Child 1 of Child 1'
exec AddHierRecord 5, '/1/1/3', 'Child of Child 1 of Child 1'
exec AddHierRecord 6, '/1/2/1', 'Child of Child 2 of Child 1'
exec AddHierRecord 6, '/1/2/2', 'Child of Child 2 of Child 1'
exec AddHierRecord 6, '/1/2/3', 'Child of Child 2 of Child 1'
select *, hier_id.ToString()
fromhier_test1
declare @oldparent hierarchyid
declare @newparent hierarchyid
select @oldparent = hier_id
fromhier_test1
whererecord_id = 2
select @newparent = hier_id
fromhier_test1
whererecord_id = 3
update hier_test1
sethier_id = hier_id.GetReparentedValue(@oldparent, @newparent)
wherehier_id.IsDescendantOf(@oldparent) = 1
and record_id 2
select *, hier_id.ToString()
fromhier_test1
Look at record_id's 5 and 8 in the second set of results. Then there's the issue of the "grandchildren" records. It does attempt to move these. When it does, however, it becomes impossible to distinguish which of two parents they belong to.
I've gone through the tutorials and documentation on TechNet and they don't offer any examples of moving a branch into an already populated branch.
If anyone can tell me I'm doing something wrong I'd really like to know what it is so I can make use of this new feature.
August 27, 2009 at 10:58 am
Thanks Carla,
Yeah I missed to include declare statement, sorry for inconvenience.
Reader,
Kindly add these declare statement in script existing in first table of the article.
[font="Courier New"]DECLARE @CTO INT
DECLARE @DevManager INT
DECLARE @TESTManager INT
DECLARE @DevLead1 INT
DECLARE @DevLead2 INT
DECLARE @TESTLead INT[/font]
September 1, 2009 at 10:14 am
I really thought I'd love the use of the hierarchyid type... but things are slipping for me.. sorry if I am hijacking this a bit, but it's relevant.
I am finding it quicker to do a flattened view using ParentId -> Id than Node.GetAncestor(1) = Node
Is there a reason? :blink:
I've got a million rows in a catalgoue table that looks like this:
CREATE TABLE [dbo].[tblHierarchy](
[Id] [int] IDENTITY(1,1) NOT NULL,
[ParentId] [int] NULL,
[Node] [hierarchyid] NOT NULL,
[ParentNode] AS ([Node].[GetAncestor]((1))) PERSISTED,
[NodeLevel] AS ([Node].[GetLevel]()) PERSISTED,
[Code] [nvarchar](64) NOT NULL,
[TypeId] [int] NOT NULL,
[TargetId] [int] NULL,
[CreateDate] [datetime] NOT NULL,
[CreateUser] [nvarchar](32) NOT NULL,
[UpdateDate] [datetime] NOT NULL,
[UpdateUser] [nvarchar](32) NOT NULL,
[UpdateCode] [nchar](1) NOT NULL,
CONSTRAINT [tblHierarchy_PK] PRIMARY KEY NONCLUSTERED
(
[Id] ASC
) ON [PRIMARY],
CONSTRAINT [tblHierarchy_AK1] UNIQUE CLUSTERED
(
[Node] ASC
) ON [PRIMARY]
) ON [PRIMARY]
Simples... I've had lots of different indexes on, but I've stripped them off to get back to basics.
The tree I am querying has 21000 records.
If I do the following:
SELECT
[H1Id] = h1.Id
,[H1Code] = H1.Code
,[H1Type] = H1.TypeId
,[H2Grp] = H2.Id
,[H2Code] = H2.Code
,[H2Type] = H2.TypeId
,[H3Grp] = H3.Id
,[H3Code] = H3.Code
,[H3Type] = H3.TypeId
,[H4Grp] = H4.Id
,[H4Code] = H4.Code
,[H4Type] = H4.TypeId
FROM dbo.tblHierarchy H0 WITH (NOLOCK)
INNER JOIN dbo.tblCATHierarchy H1 WITH (NOLOCK)
ON H0.Id = H1.ParentId
INNER JOIN dbo.tblCATHierarchy H2 WITH (NOLOCK)
ON H1.Id = H2.ParentId
INNER JOIN dbo.tblCATHierarchy H2 WITH (NOLOCK)
ON H2.Id = H3.ParentId
INNER JOIN dbo.tblCATHierarchy H4 WITH (NOLOCK)
ON H3.Id = H4.ParentId
WHERE H0.Code = 'ROOT'
AND H1.TypeId = 1
Compared to:
SELECT
[H1Id] = h1.Id
,[H1Code] = H1.Code
,[H1Type] = H1.TypeId
,[H2Grp] = H2.Id
,[H2Code] = H2.Code
,[H2Type] = H2.TypeId
,[H3Grp] = H3.Id
,[H3Code] = H3.Code
,[H3Type] = H3.TypeId
,[H4Grp] = H4.Id
,[H4Code] = H4.Code
,[H4Type] = H4.TypeId
FROM dbo.tblHierarchy H0 WITH (NOLOCK)
INNER JOIN dbo.tblCATHierarchy H1 WITH (NOLOCK)
ON H0.Node = H1.Node.GetAncestor(1)
INNER JOIN dbo.tblCATHierarchy H2 WITH (NOLOCK)
ON H1.Node = H2.Node.GetAncestor(1)
INNER JOIN dbo.tblCATHierarchy H2 WITH (NOLOCK)
ON H2.Node = H3.Node.GetAncestor(1)
INNER JOIN dbo.tblCATHierarchy H4 WITH (NOLOCK)
ON H3.Node = H4.Node.GetAncestor(1)
WHERE H0.Code = 'ROOT'
AND H1.TypeId = 1
The first query runs shed twice as fast. Even if I substitute Node.GetAncestor(1) with my persisted computed column of ParentNode.
It's important for the dataset to be flattened for exporting to other systems. If someone reading this has better plans please reply or PM me. would be grateful as always.
I guess we may need both approaches? It's incredibly powerful, but it has it's limits?
Kev
February 17, 2010 at 5:10 pm
after playing with hierarchyid and mainly it's GetReparentedValue feature i have to say I am disappointed.
As well as the problem mentioned above by kent.kester I found that using GetReparentedValue to move a record will not also move the record's children, i thought that was the whole point, why would I want orphan records in my table?
This certainly feels like a work in progress by Microsoft and hopefully SS2008 r2 will have some improvements.
March 19, 2010 at 1:29 am
@gerald.drouin,
While loops do offer a solution for hierarchy based solution, but CTEs are simpler, cleaner way of solving the recursive problems. If correctly used, they offer better performance than while loop based solution.
Relevant read can be
http://strictlysql.blogspot.com/2010/01/recursive-cte-vs-temp-table-performance.html"> http://strictlysql.blogspot.com/2010/01/recursive-cte-vs-temp-table-performance.html
Regards,
Raj
March 19, 2010 at 6:41 am
I tend to roll-my-own when it comes to Hierarchies.
Here's how I do it.
Say I have a list of Employees with a Key on ID
I then make a Employee_Hierarchy table then an Employee_SuperHierarchy table
Employee_Hierarchy contains the parent and child Employee Primary Keys and is a compound unique index.
Employee_SuperHierarchy contains the parent and child Employee Primary key and the Gap.
For every Employee row there's an Row in with Gap=0
It contains the expanded hierarchy so if (A) is the boss of (B) and (B) is the boss of (C) then
A, A, 0
B, B, 0
C, C, 0
A, B, 1
B, C, 1
A, C, 2
You can then loop until no more are inserted to build the full hierarchy by
INSERT INTO Employee_SuperHierarchy
(Parent, Child, Gap)
SELECT Parent.ParentID
, Child.ChildID
, Child.Gap + 1
FROM Employee_SuperHierarchy Child
INNER JOIN Employee_Hierarchy Parent
ON Child.ParentID = Parent.ChildID
Then you can easily run a query to find all the employees (or Bosses) of a person. Do make sure to exclude Gap = 0 (employs self/Self boss)
E.g. To find all the employees
SELECT Employee.*
FROM Employee Boss
INNER JOIN Employee_SuperHierarchy
ON Employee_SuperHierarchy.ParentID = Boss.ID
INNER JOIN Employee
ON Employee.ID = Employee_SuperHierarchy.ChildID
WHERE Boss.Name = 'Stalin'
March 19, 2010 at 8:08 am
Good article
I used recursion in in 2000 and CTE in 2005, and always had the thought that while my solution seemed clever, the aftermath of dealing with it was always a pain. Moving nodes and children and/or getting all children direct and indirect or all ancestors direct or indirect etc. The implementation in 2008/R2, while not perfect has come a long way. I didn't realize it was implemented via CLR until reading the article, but that makes sense as it is calling methods and such.
I rated the article 5 stars, but my one complaint is that you squeezed UDT's into the picture after an already full article. UDT's might not fit by themselves into an article, but I think the Hierarchy implementation was quite enough for this article. I know, picky, picky, but that just my 2 cents worth of constructive criticism.
Regards,
Toby
March 19, 2010 at 11:37 am
hi arshad,
we are already using the same logic you explained for sql server 2005.
we don't have sql server 2000, so we don't have to call a stored procedure recursively.
we have several sorts of multilevel network marketing implementation.
some companies which use similar business logic are amwa_ and for_v_rlivin_
this can be viewed as layers or tiers.
the whole hierarchy can be called trees, stars or simply networks.
in one of our cases, we are limited to a table with columns parent, tier2, tier3, ..., tier10.
i guess, i would need to rewrite ddl for this table with hierarchyid, but later this change will imply changing a lot sql logic already written. this can be done because this client has sql server 2008.
it looks simple, but any tiers, including parent column, can have multiple rows, and of course some tiers are plain null, except by parent column.
we have successfully drawn trees from these tables, for which the hardest case is this table with columns parent, tier2, tier3, ..., tier10.
one of these trees is in a flex front-end, and the other is an asp.net treeview as gui.
your article has given me plenty of insight about tree structure business logic, and i stopped by to say thank you.
my guess is that hierachyid is in our short-term future for database programming.
keep up your good work!!
best regards,
tonci korsano
March 19, 2010 at 11:56 am
Nice article thanks. I like the compare and contrast between 2k5 and 2k8 that you demonstrated.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
March 29, 2010 at 10:42 pm
Trying to see how hierarchyId beats storing a "path"-style key, and it's not coming to me: a string of delimited (possibly tagged) local names, like a file path, can be searched using "LIKE" in ways that are much like XPath or LDAP queries. With prefix compression in indices, you don't pay a heavy price in Btree depth on that; and clustering works nicely, too. What am I missing, here?
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy