The occasional requests to build tables based on portions of an organizational hierarchy prompt many of us to hark back to our days of studying data structures, and we're often content to search for an algorithm to traverse a tree and its leaves. To be sure, Google will reveal many elegant ways to search a tree. One method might be to use a cursor and recursion, another to make use of a CTE WITH structure. In a team with several developers, at least one example of each is easy to find.
In our case, two methods failed in a very short time period, giving rise to thoughts of looking at the problem from a different angle. In one case, a recursion limit of 31 was encountered when a particular supervisor was given responsible for more than 60 contract workers; in another, a table of over 9,000 past and present employees took 20 minutes to regularly rebuild.
After not only considering the problem, but also looking at some of the methodologies in place, it occurred to me that one of the solutions, which assigned an "Employee Level," perhaps had the key to looking at the data from a different perspective. Just because you have a "hammer" in your tool belt doesn't mean that every problem is a "nail"!
In thinking about the levels of an organization, I realized that each level can be determined with a very simple and efficient SQL query. From the set of employees at an existing level, simply find all of the employees whose supervisor ID is one of that set. In other words, look at the organization horizontally, rather than vertically, almost as if it is a layer cake, and your job is to extract each layer. Repeating this for each level can get all employees at each level easily and quickly. And, this can be repeated until no rows are returned, that is, no other employees report to any of the employees at the last level found.
Assume a simple organization that looks like this:
If we decide that the CEO is at Employee Level 0, then we can easily find those employees whose immediate supervisor is the CEO, using something like this:
select EmployeeID, LastName, FirstName, MiddleName, Department, 0 as EmployeeLevel into #tmptbl from Employees where SupervisorID = 10001
Once we have these employees (#10002, #10008, and #10009), let's add them to a temporary table and assign them an Employee Level 1. Now, we can use exactly the same technique to find all employees who report to any of these (managers) with a statement like this:
select EmployeeID, LastName, FirstName, MiddleName, Department, 0 as EmployeeLevel into #tmptbl from Employees where SupervisorID = (Select EmployeeID from #tmptbl where EmployeeLevel = 1)
This gives us employees #1005, #10003, #10004, #10007, #10010, and #10006. What happens if we try to find another level below this?
select EmployeeID, LastName, FirstName, MiddleName, Department, 0 as EmployeeLevel into #tmptbl from Employees where SupervisorID = (Select EmployeeID from #tmptbl where EmployeeLevel = 2)
No rows are returned, so we know that we're done.
Even in an organization of several thousand employees, it would be most unusual to find more than, say, six or seven levels, and so the entire process can consist of just a handful of efficient SQL queries.
The proof of the pudding? The 20-minute query has been reduced to 8 seconds. The stored procedure code follows, with the following caveats: 1) a temporary table is filled, since the end target tables may be different in structure and data types and lengths – the calling code updates the target from the temporary table; 2) due to some special reporting needs, each employee's row has her/his immediate supervisor, as well as the top level leader; 3) the user of the stored procedure may supply one or more top-level employee IDs, and also which employee statuses to include.
The requirement in 2) above was to allow the creation of a table of employees reporting to each of three top-level managers who were among perhaps a half dozen reporting to the senior manager; in essence, finding a part of the organizational tree. The last requirement exists to accommodate uses by a web application which supplies a list of current employees to assign various roles, while another use is to display an already-assigned employee's name, even if that person is no longer with the organization.
The code:
****** Object: StoredProcedure [dbo].[usp_BuildHierarchy] Script Date: 7/1/2014 1:08:42 PM ******/SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE PROC [dbo].[usp_BuildHierarchy] @TreeTop VARCHAR(255) = NULL , @StatusIncl VARCHAR(255) = NULL , @RtnCd BIT = 0 AS SET QUOTED_IDENTIFIER OFF /* Author: Larry Schmidt Date: June 18, 2014 Description: Generic stored procedure that clears and then rebuilds the table tmpHierarchy with employees reporting to the employees specified in the first parameter Parameters: @TreeTop - employee number(s) to begin with, e.g., '000394203,000005150,000485229' @StatusIncl - Peoplesoft Status to include in the results, e.g., 'A,T,D,P,L,I' Usage: It is recommended that the user of this stored procedure first Lock the tmpHierarchy table and upon return, use tmpHierarchy to update their target table and then Unlock User should check RtnCd and if it is 1, retry in a few seconds because table may be locked by another user */ DECLARE @Levels INT = 0 DECLARE @sql NVARCHAR(MAX) DECLARE @Nlvl INT = 1 DECLARE @NRows INT = 1 BEGIN TRY DELETE FROM tmpHierarchy INSERT INTO tmpHierarchy ( EmployeeID , SupervisorID , LeaderID , LastName , FirstName , MiddleName , EmployeeLevel ) SELECT Employee_Id , Supervisor_Id , Employee_ID , Name_Last , Name_First , Name_Middle , CAST(@Nlvl - 1 AS VARCHAR(2)) AS EmployeeLevel FROM SCD_TRM_Peoplesoft WHERE Employee_ID IN ( SELECT Data FROM Split(@TreeTop, ',') ) AND Status IN ( SELECT Data FROM Split(@StatusIncl, ',') ) WHILE ( @NRows > 0 ) BEGIN SET @Nrows = ( SELECT COUNT(*) FROM SCD_TRM_Peoplesoft WHERE supervisor_id IN ( SELECT EmployeeId FROM tmpHierarchy WHERE EmployeeLevel = @NLvl - 1 ) AND Status IN ( SELECT Data FROM Split(@StatusIncl, ',') ) ) INSERT INTO tmpHierarchy SELECT Employee_ID , supervisor_Id , t.LeaderID , Name_Last , Name_First , Name_Middle , ( @Nlvl ) AS EmployeeLevel , GETDATE() AS RowinsertDate FROM SCD_TRM_PeopleSoft ps INNER JOIN tmpHierarchy t ON ps.Supervisor_Id = t.EmployeeID WHERE supervisor_id IN ( SELECT EmployeeId FROM tmpHierarchy WHERE EmployeeLevel = @Nlvl - 1 ) AND Status IN ( SELECT Data FROM Split(@StatusIncl, ',') ) SET @Nlvl = @Nlvl + 1 END END TRY BEGIN CATCH SET @RtnCd = 1 END CATCH GO