October 2, 2011 at 11:09 am
I heard tales of performance problems associated with the HierarchyID datatype in SQL Server 2008. Since I haven't actually used the HierarchyID datatype (and, therefore, haven't used any of it methods up 'till now), I built myself a nice "little" million row "clean" table in the form of an Adjacency List (including an extra column called "Sales") and converted it to use the HierarchyID data-type. I used the indexes that Microsoft Suggested for such a thing.
Then, I built several queries just to try my hand at it (tried them on a smaller 14 node Hierarchy first, just to make sure things worked correctly). All of these queries are based on what I've been able to Google as "hierarchyid performance problems".
1. Find all Descendants (all employees in the "down-line" tree) for a given node.
2. Find all Ancestors (all managers in the "up-line" chain) for a given node.
3. Find all Siblings (all nodes at the same level) for a given node.
4. Find the SUM of sales for all Descendants 7 levels "down" for each of the million nodes in the entire hierarchy and insert into a new table.
Although I'm sure I could tweak a couple of things here and there insofar as indexing goes, I'm just not seeing what I would call either a performance problem or a resource usage problem.
So, my questions are... has anyone actually experienced a performance problem using the HierarchyID data-type and associated methods? If so, could you describe what you were doing and, perhaps, even post some code that demonstrates the problem?
Thanks for the help, folks.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 2, 2011 at 11:34 am
Jeff Moden (10/2/2011)
I.... Since I haven't actually used the HierarchyID datatype (and, therefore, haven't used any of it methods up 'till now), I built myself a nice "little" million row "clean" table in the form of an Adjacency List (including an extra column called "Sales") and converted it to use the HierarchyID data-type. I used the indexes that Microsoft Suggested for such a thing.
Hi Jeff
I have not used this data type before...but am keen to learn more.
would you care to share your "little" million row test rig please?
regards jls
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
October 2, 2011 at 12:20 pm
Sure... the following code creates a million row "clean" hierarchy in the form of a classic Adjacency List. I have it as a stored procedure so that I can just call it with the number of rows I want the hierarchy to contain for different experiments...
CREATE PROCEDURE dbo.BuildTestHierarchy
/**********************************************************************************************************************
Create a randomized "clean" hierarchy. Each EmployeeID (except the first one, of course) is assigned a random
ManagerID number which is always less than the current EmployeeID. This code runs nasty fast and is great for
testing hierarchical processing code.
Usage: (both examples build a million row Adjacency List Hierarchy)
EXEC dbo.BuildTestHierarchy 1000000
Revision History:
Rev 00 - 28 Apr 2010 - Jeff Moden - Initial creation and test.
Rev 01 - 15 May 2010 - Jeff Moden - Abort if current DB isn't "tempdb" to protect users that want to "play".
**********************************************************************************************************************/
--===== Declare the I/O parameters
@pRowsToBuild INT
AS
--===== Make sure that we're in a safe place to run this...
IF DB_NAME() <> N'tempdb'
BEGIN
RAISERROR('Current DB is NOT tempdb. Run aborted.',11,1);
RETURN;
END
;
--===== Conditionaly drop the test table so we can do reruns more easily
IF OBJECT_ID('TempDB.dbo.Employee','U') IS NOT NULL
DROP TABLE TempDB.dbo.Employee
;
--===== Build the test table and populate it on the fly.
-- Everything except ManagerID is populated here.
SELECT TOP (@pRowsToBuild)
ISNULL(ROW_NUMBER() OVER (ORDER BY (SELECT 1)),0) AS EmployeeID,
CAST(0 AS INT) AS ManagerID,
CAST(NEWID() AS VARCHAR(36)) AS EmployeeName,
(ABS(CHECKSUM(NEWID()))%12+1)*1000 AS Sales
INTO TempDB.dbo.Employee
FROM Master.sys.All_Columns ac1
CROSS JOIN Master.sys.All_Columns ac2
;
--===== Update the test table with ManagerID's. The ManagerID is some random value which is always less than the
-- current EmployeeID to keep the hierarchy "clean" and free from "loop backs".
UPDATE TempDB.dbo.Employee
SET ManagerID = CASE
WHEN EmployeeID > 1
THEN ABS(CHECKSUM(NEWID())) % (EmployeeID-1) +1
ELSE NULL
END
;
--===== Add some indexes that most folks would like have on such a table
ALTER TABLE TempDB.dbo.Employee ADD CONSTRAINT PK_Employee PRIMARY KEY CLUSTERED (EmployeeID);
CREATE INDEX IX_Employee_ManagerID ON TempDB.dbo.Employee (ManagerID);
--Jeff Moden
Change is inevitable... Change for the better is not.
October 2, 2011 at 2:12 pm
You posted a possible answer to your own question more than a year ago, so maybe you forgot.
http://www.sqlservercentral.com/Forums/FindPost922664.aspx
Unfortunately, the link to connect.microsoft.com results in an unexpected system error now. Maybe you remember what the item was about.
October 2, 2011 at 2:40 pm
Thanks, Peter. Ironically, that was one of the "dead ends" I ran across in my search for performance problems and I'm definitely not seeing those types of problems in the testing I've set up, so far.
No... I don't remember what that particular Connect Item was about, either. MS is correct about CLRs not being "seen" by the Optimizer and that the problem they mentioned could definitely be a problem with any CLR, but I'm just not seeing such problems with HierarchyID code since I started playing with it yesterday. Maybe they fixed it so it wouldn't be such a problem anymore.
Anyway, if anyone has an example of were the use of the HierarchyID data-type and related methods is a performance problem, I'd sure like to see it.
Thanks again.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 3, 2011 at 7:50 pm
Gosh folks... based on the low number of responses on this thead, either not a whole lot of people are using the HierarchyID or not a whole lot of people are having problems with its performance.
Maybe I should change the question a bit... are any of you using the HierarchyID data type?
--Jeff Moden
Change is inevitable... Change for the better is not.
October 3, 2011 at 11:25 pm
Jeff Moden (10/3/2011)
Maybe I should change the question a bit... are any of you using the HierarchyID data type?
No.
maybe useful in the future when dealing with 3rd pty dbs.
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
October 4, 2011 at 8:01 am
Heh... I figured that out from your post where you said you never used it before. 🙂 I was hoping some of the other denizens of SSC would jump in. Folks are normally pretty talkative on SSC and I'm amazed at the virtual silence this subject has drawn.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 4, 2011 at 11:48 am
I think that may be further proof that few people has utilized this...nobody want to speak on a topic that they don't know about. I am hoping I can find some time soon to look through the test you have so far. I know I am going to have a project in the next few months where this could be really useful. In my case I would need more of a forest (multiple roots) but I may be able to use the hierarchyID as part of the solution.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
October 4, 2011 at 12:15 pm
Jeff Moden (10/4/2011)
Heh... I figured that out from your post where you said you never used it before. 🙂 I was hoping some of the other denizens of SSC would jump in. Folks are normally pretty talkative on SSC and I'm amazed at the virtual silence this subject has drawn.
having read up a bit more on the subject...I can see possible benefits for such things as BOMs.
Would be interested to see if anyone out there is using it for BOMs.
At the moment I have no requirement for BOMS or complex Org charts ....so for now, its interesting but just not applicable (I think !!)
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
October 4, 2011 at 1:54 pm
We are now just evaluating its use in our shop. We would not have big data to work with though.
Thanks...Chris
October 4, 2011 at 2:01 pm
Sean Lange (10/4/2011)
I think that may be further proof that few people has utilized this...nobody want to speak on a topic that they don't know about. I am hoping I can find some time soon to look through the test you have so far. I know I am going to have a project in the next few months where this could be really useful. In my case I would need more of a forest (multiple roots) but I may be able to use the hierarchyID as part of the solution.
That's exactly what I'm thinking, Sean.
Shifting gears, I've done some rather extensive work with "forests of trees" using Nested Sets. I can't say what we're using it for because it's a proprietary process that might be patentable. If you can say, what would you be using it for? I'm asking because I might be able to help with some of the information I've developed for the article I'm slowly putting together on the subject of Hierarchical queries.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 4, 2011 at 2:02 pm
CGSJohnson (10/4/2011)
We are now just evaluating its use in our shop. We would not have big data to work with though.Thanks...Chris
Thanks for the feedback, Chris. I realize it's a bit off subject for this thread but what would you end up using hierarchical structures for?
--Jeff Moden
Change is inevitable... Change for the better is not.
October 4, 2011 at 2:08 pm
Hi, Jeff. Sorry, but I cannot talk about it all. However, one use would be for the common goal of organizational structure.
Thanks...Chris
October 4, 2011 at 2:08 pm
Jeff Moden (10/4/2011)
Sean Lange (10/4/2011)
I think that may be further proof that few people has utilized this...nobody want to speak on a topic that they don't know about. I am hoping I can find some time soon to look through the test you have so far. I know I am going to have a project in the next few months where this could be really useful. In my case I would need more of a forest (multiple roots) but I may be able to use the hierarchyID as part of the solution.That's exactly what I'm thinking, Sean.
Shifting gears, I've done some rather extensive work with "forests of trees" using Nested Sets. I can't say what we're using it for because it's a proprietary process that might be patentable. If you can say, what would you be using it for? I'm asking because I might be able to help with some of the information I've developed for the article I'm slowly putting together on the subject of Hierarchical queries.
I have kind of wondered if the nested set would be easier/better for my situation. I am not doing it for anything proprietary by any stretch of the imagination. I am going to be completely rebuilding our menu system for one of our websites. The original knuckleheads that "designed" should be shot. I am thinking that the forest of trees would be perfect for this. It will not likely get very "deep" but the second and third levels can get fairly large as this is a ecomm site for about a half million skus. I can imagine how cool it would be if I could move whole "branches" from one tree to another, or at least copy them from one to the other.
I could of course do this pretty simple with an adjaceny list but what fun is recreating a technique that is already way out of date when there are better ways of accomplishing this?
I would be thrilled to help in anyway that I can.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
Viewing 15 posts - 1 through 15 (of 31 total)
You must be logged in to reply to this topic. Login to reply