August 30, 2011 at 6:12 pm
Sean Lange (8/29/2011)
Jeff do you have any experience using Nested Sets with multiple root level elements? for example in a menu system. I have used it in an org chart type of implementation many times but have never been able to figure out to make it work with multiple root elements.
You mean a "forest"? Sure. They're really no different than managing an individual org chart. BOM's can get a bit more complicated if that's the type of "forest" you're thinking of.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 30, 2011 at 10:11 pm
I, for one, would be very interested to see a nested set implementation explained. I've read a lot about it and think it looks very promising, but I so far refrained from using the technique because of the unpredictable (performance) behaviour when modifying the tree elements.
September 7, 2011 at 6:30 pm
R.P.Rozema (8/30/2011)
I, for one, would be very interested to see a nested set implementation explained. I've read a lot about it and think it looks very promising, but I so far refrained from using the technique because of the unpredictable (performance) behaviour when modifying the tree elements.
I have a couple of presentations I'm working on for an SQL Saturday coming up. I'll try to get back to the hierarchy article after that. And, no... I don't use a permanent nested set... it's easier to make changes a different way and then regen the nested set. Darned near as fast as just making changes to it, as well.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 23, 2011 at 4:49 am
Hi Jeff, did you find some time to work on some examples for us on nested sets?
September 24, 2011 at 5:52 pm
R.P.Rozema (9/23/2011)
Hi Jeff, did you find some time to work on some examples for us on nested sets?
It would appear that Mr. Celko has some examples at the ready and was kind enough to Cut'n'Paste a few of them.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 25, 2011 at 2:28 am
Assuming I want to store multiple, independent sets in one table. Say, following Joe's example, I need to store the organisational hierarchy for multiple companies. My feeling says I should create a model in which the set elements have an additional compound key element identifying the root set (i.e. the company), or is there a reason why I should keep the model as shown in the example and assign the root node of -independent- set N a lft value of 1 + (N-1).rgt ?
So should I change the OrgChart table to OrgCharts like this?CREATE TABLE OrgCharts (
company_id INT NOT NULL,
emp_name CHAR(10) NOT NULL,
lft INTEGER NOT NULL UNIQUE CHECK (lft > 0),
rgt INTEGER NOT NULL UNIQUE CHECK (rgt > 1),
CONSTRAINT order_okay CHECK (lft < rgt),
CONSTRAINT pk_OrgCharts PRIMARY KEY (company_id, emp_name)
);
September 25, 2011 at 12:22 pm
First, I wouldn't store the Employee Name in the table except for demonstration purposes and Joe does cover that in the book I cited. Keep the "functions" of "organization" separate from "employee information". What should be in the organizational chart are "positions" instead of "people". That way, it's super easy to have one person with many bosses or responsibilites without creating a "cycle" that would destroy most of the functionality that people look for in Org Charts.
Slightly off track, I don't recommend very many books but, if you're planning on working with Hierarchies a lot, the book I cited that Joe wrote is a wealth of knowledge on the subject. He also cites one of Ben-Gan's books which has a killer chapter on the subject, as well. Do be advised that Joe wrote most of the code using "ANSI Standard" code and will absolutely require modification to get most of the examples to work in T-SQL. For example, the concatenation operator he uses is "||" and you'll need to convert all of those to "+". There are other nuances but the book is definitely worth having especially if you're don't consider yourself to be an expert on the subject of hierarchies.
Getting back on track and with the understanding that I can't see your data to make a 100% accurate guess, I'd probably form the hierarchical table something like the following:
CREATE TABLE dbo.CompanyOrganization
(
CompanyID SMALLINT NOT NULL,
PositionID INT NOT NULL,
EmployeeID INT NOT NULL,
LeftBower INT NOT NULL,
RightBower INT NOT NULL,
AbsoluteLevel SMALLINT NOT NULL,
CONSTRAINT AK_CompanyOrganization_CompanyID_LeftBower_RightBower UNIQUE CLUSTERED (CompanyID, LeftBower, RightBower),
CONSTRAINT PK_CompanyOrganization_CompanyID_PositionID PRIMARY KEY NONCLUSTERED (CompanyID, PositionID),
CONSTRAINT AK_CompanyOrganization_CompanyID_LeftBower UNIQUE (CompanyID, LeftBower),
CONSTRAINT AK_CompanyOrganization_CompanyID_RightBower UNIQUE (CompanyID, RightBower),
CONSTRAINT CK_CompanyOrganization_CompanyID_GT0 CHECK (CompanyID > 0),
CONSTRAINT CK_CompanyOrganization_PositionID_GT0 CHECK (PositionID > 0),
CONSTRAINT CK_CompanyOrganization_EmployeeID_GE0 CHECK (EmployeeID >= 0), --0 = "Vacant"
CONSTRAINT CK_CompanyOrganization_LeftBower_GT0 CHECK (LeftBower > 0),
CONSTRAINT CK_CompanyOrganization_RightBower_GT1 CHECK (RightBower > 1),
CONSTRAINT CK_CompanyOrganization_AbsoluteLevel_GT0 CHECK (AbsoluteLevel > 0),
CONSTRAINT CK_CompanyOrganization_BowerOrderOK CHECK (LeftBower < RightBower)
)
;
CREATE NONCLUSTERED INDEX IX_CompanyOrganization_CompanyID_EmployeeID
ON dbo.CompanyOrganization (CompanyID, EmployeeID) --Contains left and right bowers because of CI
;
Heh... Yeah... I like to name everything myself. 😛
Of course, I've not included any of the necesssary FK's... I'll let you do that. Joe will probably also do two things. First, he'll yell about using "IDs" and then he'll likely be able to add a constraint or two that I probably missed.
As a side bar, the "Position" table will, of course, contain the postions available in the company. That table should also have a reference to a "JobDescription" table. The other tables are pretty much self-explanatory.
Something else about the table above... I have the constraints setup so that each company has its own set of bowers starting at 1. That's for IF you want to put all of the companies in the same table. My recommendation is that you don't because it'll make controlling who can see what a little bit more difficult. Of course, having all the companies in one table will eliminate the need for code duplication or dynamic SQL but that also eliminates easy customization of code for each company. Of course, those are just my opinions... the choice is yours. If you do decide to go with multiple tables, we'll need to make some minor changes in the constraints, indexes, and remove the CompanyID column. In fact, many of the constraints might just be able to simply go away.
One more side-bar... Joe has code to do moves, adds, and deletes in the Nested Sets. I typically maintain all 3 hierarical structures (Adjacency List, Hierarchical Path (by node or by edge), and Nested Sets) and have found ways to rebuild the entire Nested Set structure very, very quickly. If you read Joe's book that I previously cited, you'll find that there are some HUGE advantages to maintaining "gapless" and "sequential" numbering of the Bowers. Joe has code listed in his book to do all of that.
I don't do it quite that way. Since I maintain all 3 hierarchical structures in a single table, it's very easy for me to check the hierarchical path of all leaf nodes for the existance of the position I want to add (to prevent "cycles") and then I add it to the adjacency list and simply rebuild everything. Because of the high speed methods I've created to do that (MUCH faster than the typical "Push Stack" method), I've found it faster and easier to rebuild everything than to do the insert, renumber the nodes for the insert, and then resequence the whole shootin' match. To summarize, I've found it easier to do node/position maintenance using an Adjacency List while still maintaining the incredible search power of Nested Sets and, ironically, that's a part of what's taking me so long to write the article on hierarchies.
Speaking of books... Hey Joe!!! When is your rewrite of the "Trees" book coming out?
--Jeff Moden
Change is inevitable... Change for the better is not.
September 25, 2011 at 12:26 pm
Dang... I almost forgot...
I call the "Lft" and "Rgt" columns "Bowers" after the two large anchors on the front of large ships. It makes the columns a whole lot easier to talk about. Instead of saying something like "the Lft and Rgt columns", I can simply refer to them as "the Bowers". People are also less confused than when you say "The RIGHT column"... they always seem to ask "Ok, but which column is the right column to use here?" 😛
--Jeff Moden
Change is inevitable... Change for the better is not.
September 25, 2011 at 1:49 pm
Of course I won't put the employee name in the table, I did get that this is just an example. I just took the table from Joe's example and extended that to illustrate my question. In fact, I don't even want to store an organizational chart; I need to store expression trees, thousands of them. And I will need to store these in a single table as I will be processing them together as a whole, plus I have little idea yet how many of them there will be over time. My worry is that my trees, though they probably won't change on a hourly basis, are not completely static. i.e. changes will be made: not just adding new expression trees or removing trees, also re-ordering and adding or removing elements in existing trees will be common actions: many rows may be affected by a single change. I need all of these operations to be as snappy as possible, as the system will not be functional for as long as the set of expressions is not completely available and a processing phase needs to follow upon each change. I need to keep the total down-time for each change as short as possible.
Currently I've got an implementation in place that uses the mentioned linked-list methods, including the dreaded anti-cyclic checks mentioned by Joe. Processing the 650+ expressions currently in the system takes at least an hour and a half and we're still extending both the number of expressions and the tasks performed during the processing phase, so I'm looking for something faster and nested sets seemed like the best opportunity.
I do thank Mr Celko for his explanation. I was however hoping for some more detailed set modification code: for example adding an element (at the end/to a specific subset), reordering siblings, removing an element. Those are the more complicated operations to do right. In theory I know how to do it, but I haven't gotten far in implementing anything near reliable and performing. Especially your (Jeff) suggestion to rewrite an entire set's lft and rgt's in one go (using the quirky update?), instead of having to update the elements after the modified ones sounded good. I would still appreciate very much such an example.
September 25, 2011 at 5:27 pm
R.P.Rozema (9/25/2011)
Of course I won't put the employee name in the table, I did get that this is just an example. I just took the table from Joe's example and extended that to illustrate my question. In fact, I don't even want to store an organizational chart; I need to store expression trees, thousands of them. And I will need to store these in a single table as I will be processing them together as a whole, plus I have little idea yet how many of them there will be over time. My worry is that my trees, though they probably won't change on a hourly basis, are not completely static. i.e. changes will be made: not just adding new expression trees or removing trees, also re-ordering and adding or removing elements in existing trees will be common actions: many rows may be affected by a single change. I need all of these operations to be as snappy as possible, as the system will not be functional for as long as the set of expressions is not completely available and a processing phase needs to follow upon each change. I need to keep the total down-time for each change as short as possible.Currently I've got an implementation in place that uses the mentioned linked-list methods, including the dreaded anti-cyclic checks mentioned by Joe. Processing the 650+ expressions currently in the system takes at least an hour and a half and we're still extending both the number of expressions and the tasks performed during the processing phase, so I'm looking for something faster and nested sets seemed like the best opportunity.
I do thank Mr Celko for his explanation. I was however hoping for some more detailed set modification code: for example adding an element (at the end/to a specific subset), reordering siblings, removing an element. Those are the more complicated operations to do right. In theory I know how to do it, but I haven't gotten far in implementing anything near reliable and performing. Especially your (Jeff) suggestion to rewrite an entire set's lft and rgt's in one go (using the quirky update?), instead of having to update the elements after the modified ones sounded good. I would still appreciate very much such an example.
Heh... that's a hell of a change from the first-blush example. 😀
You say that "the 650+ expressions currently in the system takes at least an hour and a half". Since even the worst rendition of a "push stack" would easily "process" that many expressions in much less time than an hour and a half, could you tell me what the process you're actually using is because, based on the duration you described, I'm thinking I don't actually understand what your "process" on these expressions are.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 26, 2011 at 1:34 am
The "process" is to analyze the set of expressions and generate static code from it. The expressions are the definitions for an engine processing data changes in near-real-time (using SQL Server Service Broker). But that's not what I seek assistance in, it's replacing my current linked-list structures by nested sets that I'm having troubles with in finding the correct way to do it. The expressions are currently stored in linked lists using Reverse Polish notation, so that I can avoid having to use brackets on the expressions. I found however that processing the set of linked lists is taking a lot of time: I already need to convert the expressions into expression trees to optimize, combine and evaluate them, so my idea is I better store them as expression trees too (and if not all the time, then at least during processing). I can not go into much more detail as that could give competitors to much information.
September 26, 2011 at 1:43 am
Jeff Moden (9/25/2011)
Heh... that's a hell of a change from the first-blush example. 😀
Actually, I have to admit that I'm not the Topic Starter on this one. I have -not fully intentional- hijacked someone else's topic. My apologies go out to Drammy for this. But I do think the answers to my questions will still help him too.
September 26, 2011 at 4:55 am
Speaking of that... Drammy, are you all set or is there something else you'd like to know on this topic?
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 13 posts - 16 through 27 (of 27 total)
You must be logged in to reply to this topic. Login to reply