December 15, 2015 at 2:41 am
I have a situation where I need to form a set of closed groups from information held in a single table in SQL 2012. I know I can do this via a series of "while" loops but given that there could be several hundred thousand records in the table, this seems like an extremely bad idea, I'm also sure that there must be a way of doing this "properly" but at present the solution is eluding me.
The table contains source data of a manager and a personal assistant, with a 3rd column that will be used to contain the group name, the complexity is that while a manager can have 1 or many personal assistants, a personal assistant can work for multiple managers and may have a personal assistant in their own right.
My table looks like the following example
Manager PersonalAssistant
======= =================
A B
A C
A D
B C
D E
F G
F K
H I
H J
H A
...
The requirement is that I form a number of closed groups resulting in the following table
Group Person
===== ======
Group001 A
Group001 B
Group001 C
Group001 D
Group001 E
Group001 H
Group001 I
Group001 J
Group002 F
Group002 G
Group002 K
...
Can anyone suggest how this can be implemented, or give me some guidance on where to start?
The end result will be a stored procedure that can be run as-and-when it is necessary
Thanks in advance....
December 15, 2015 at 12:11 pm
Give this a shot...
IF OBJECT_ID('tempdb..#MgtStructure', 'U') IS NOT NULL
DROP TABLE #MgtStructure;
CREATE TABLE #MgtStructure (
EmployeeID CHAR(1),
ManagerID CHAR(1)
);
INSERT#MgtStructure (ManagerID, EmployeeID) VALUES
('A', 'B'),
('A', 'C'),
('A', 'D'),
('B', 'C'),
('D', 'E'),
('F', 'G'),
('F', 'K'),
('H', 'I'),
('H', 'J'),
('H', 'A');
WITH cte_Recursive AS (
SELECT
EmployeeID = Manager.ManagerID,
ManagerID = CAST('' AS CHAR(1)),
RecustionLevel = 1,
GroupName = 'Group' + RIGHT('000' + CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS VARCHAR(3)), 3)
FROM
#MgtStructure Manager
WHERE
NOT EXISTS (SELECT 1 FROM #MgtStructure ms WHERE Manager.ManagerID = ms.EmployeeID)
GROUP BY
Manager.ManagerID
UNION ALL
SELECT
Employee.EmployeeID,
Employee.ManagerID,
RecustionLevel = r.RecustionLevel + 1,
r.GroupName
FROM
#MgtStructure Employee
JOIN cte_Recursive r
ON Employee.ManagerID = r.EmployeeID
)
SELECT
r.GroupName,
r.EmployeeID
FROM
cte_Recursive r
ORDER BY
r.EmployeeID
;
December 16, 2015 at 1:40 am
Jason, Many thanks, worked perfectly, just what was needed 🙂
December 16, 2015 at 6:59 am
Let input be as follows
INSERT#MgtStructure (ManagerID, EmployeeID) VALUES
--('A', 'B'),
('A', 'C'),
('A', 'D'),
('B', 'C'),
('D', 'E'),
('F', 'G'),
('F', 'K'),
('H', 'I'),
('H', 'J')
--,('H', 'A')
Which is correct group for C ?
As OP stated, it's not a tree.
December 16, 2015 at 6:59 am
martyn.dowsett (12/16/2015)
Jason, Many thanks, worked perfectly, just what was needed 🙂
No problem. Glad to help.
December 16, 2015 at 10:51 am
serg-52 (12/16/2015)
Let input be as follows
INSERT#MgtStructure (ManagerID, EmployeeID) VALUES
--('A', 'B'),
('A', 'C'),
('A', 'D'),
('B', 'C'),
('D', 'E'),
('F', 'G'),
('F', 'K'),
('H', 'I'),
('H', 'J')
--,('H', 'A')
Which is correct group for C ?
As OP stated, it's not a tree.
Just a guess, if we're talking "closed groups", does that mean C can only be in one group? So in that case, wouldn't there just be that many less groups generated? Obviously could be wrong here, interesting thread in any case!
December 17, 2015 at 1:45 am
According to OP statement
martyn.dowsett (12/15/2015)
the complexity is that while a manager can have 1 or many personal assistants, a personal assistant can work for multiple managers and may have a personal assistant in their own right.
it looks as graph should be considered undirected. So it's really a problem of finding connected components of the graph https://en.wikipedia.org/wiki/Connected_component_(graph_theory).
Just for fun one can try to reproduce well known algorithms in SQL but it hardly is the most suitable tool for the purpose.
December 17, 2015 at 6:23 am
Since a given assistant can answer to multiple managers, I'd imagine that they are allowed to appear in multiple groups.
In either case it's a simple matter to choose a "top 1" group if a given assistant need to be tied to a single group.
The only real danger is the existence of a "closed loop". Like this...
INSERT#MgtStructure (ManagerID, EmployeeID) VALUES
('M', 'N'),
('N', 'O'),
('O', 'P'),
('P', 'M');
Since there is no root node (an individual who doesn't exist as an assistant) the entire group will be ignored.
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply