October 7, 2014 at 12:50 am
Hi all,
I got the following problem to solve in TSQL. I don't want to use a cursor. But with the set based solution I am stuck. Maybe someone knows a solution for the missing part. Or even a complete different solution.
Here is my problem:
DECLARE @Tmp TABLE (CustomerID INT, CustomerLink INT, PRIMARY KEY(CustomerID))
INSERT @Tmp
VALUES(100001,0)
,(100002,100001)
,(100003,100001)
,(100004,100003)
,(100005,100006)
,(100006,100005)
,(100007,100006)
,(100008,0)
,(100009,100008)
,(100010,100013)
,(100011,100008)
,(100012,100011)
,(100013,100012)
,(100014,0)
,(100015,100016)
,(100016,100015)
/*
building groups out of the couples ("chaining" is possilbe)
desired result:
[CustomerID of a group member],[smallest CustomerID per Group]
|(GroupID)|CustomerID|CustomerLink|
|100001 |100001 |100001 |
|100001 |100002 |100001 |
|100001 |100003 |100001 |
|100001 |100004 |100001 |
|100005 |100005 |100005 |
|100005 |100006 |100005 |
|100005 |100007 |100005 |
|100008 |100008 |100008 |
|100008 |100009 |100008 |
|100008 |100010 |100008 |
|100008 |100011 |100008 |
|100008 |100012 |100008 |
|100008 |100013 |100008 |
|100014 |100014 |100014 |
|100015 |100015 |100015 |
|100016 |100015 |100015 |
*/
SELECT * FROM @Tmp
--Set CustomerLink to CustomerID when missing (it's a group for itself)
UPDATE @Tmp SET CustomerLink = CustomerID WHERE CustomerLink = 0
--Find and eliminate crossreferences
--->>>--- Missing Part ---<<<---
--update CustomerLink
WHILE EXISTS (
SELECT 1
FROM @Tmp tmp1
JOIN @Tmp tmp2
ON tmp2.CustomerID = tmp1.CustomerLink
WHERE tmp2.CustomerLink <> tmp1.CustomerLink
)
BEGIN
UPDATE tmp1
SET tmp1.CustomerLink = tmp2.CustomerLink
FROM @Tmp tmp1
JOIN @Tmp tmp2
ON tmp2.CustomerID = tmp1.CustomerLink
WHERE tmp2.CustomerLink <> tmp1.CustomerLink
END
--check result
SELECT * FROM @Tmp
/*
doesn't work for crossreferences :.(
RESULTSET:
CustomerID CustomerLink
100001 100001
100002 100001
100003 100001
100004 100001
100005 100005
100006 100006 --wrong
100007 100005
100008 100008
100009 100008
100010 100008
100011 100008
100012 100008
100013 100008
100014 100014
100015 100015
100016 100016 --wrong
*/
Thanks for any help with this little 'quiz'
Reto E.
October 7, 2014 at 2:10 am
DECLARE @Tmp TABLE (CustomerID INT, CustomerLink INT, PRIMARY KEY(CustomerID))
INSERT @Tmp VALUES
(100001,0),(100002,100001),(100003,100001),(100004,100003),
(100005,100006),(100006,100005),(100007,100006),(100008,0),
(100009,100008),(100010,100013),(100011,100008),(100012,100011),
(100013,100012),(100014,0),(100015,100016),(100016,100015)
;WITH FilledData AS (
SELECT CustomerID, CustomerLink = ISNULL(NULLIF(CustomerLink,0),CustomerID) FROM @Tmp
)
SELECT x.GroupID, f.CustomerID, f.CustomerLink
FROM FilledData f
CROSS APPLY (
SELECT GroupID = MIN(CustomerID)
FROM FilledData fi
WHERE fi.CustomerLink = f.CustomerLink
) x
ORDER BY f.CustomerID
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
October 7, 2014 at 4:11 am
Thx ChrisM for your CROSS APPLY approach. I tried something like this, but sometimes cross apply is a bit misterious to me ...
Your solution doesn't work, because you can't assume that MIN(CustomerID) will give you the correct GroupID.
Your qurey returns GroupID 100004 for CustomerID 100004. But it should be GroupID 100001 because the CustomerLink is to CustomerID 100003 and CustomerID is linked to GroupID 100001.
The CustomerLink is not always set to the correct GroupID. There can be more steps in between.
Greetings Reto
October 7, 2014 at 4:21 am
reto.eggenberger (10/7/2014)
Thx ChrisM for your CROSS APPLY approach. I tried something like this, but sometimes cross apply is a bit misterious to me ...Your solution doesn't work, because you can't assume that MIN(CustomerID) will give you the correct GroupID.
Your qurey returns GroupID 100004 for CustomerID 100004. But it should be GroupID 100001 because the CustomerLink is to CustomerID 100003 and CustomerID is linked to GroupID 100001.
The CustomerLink is not always set to the correct GroupID. There can be more steps in between.
Greetings Reto
Ah, it's recursive. Ok...
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
October 7, 2014 at 5:24 am
Yes. But you don't know the level of the CustomerID because you only got a CustomerLink.
(See attachment for the actual hierarchy)
October 7, 2014 at 5:29 am
The problem with cross reference is that it leads to indefinite recurisvenes...
So, solution would be to resolve corss reference before going to recursive query.
I have added one more test case. If you wish to display original CustomerLink, resolve the cross reference into another temp table, change recursive CTE to use it and join back to original @Tmp on CustomerId in the final query to pick up original CustomerLink it from there.
DECLARE @Tmp TABLE (CustomerID INT, CustomerLink INT, PRIMARY KEY(CustomerID))
INSERT @Tmp
VALUES (100001,0)
,(100002,100001)
,(100003,100001)
,(100004,100003)
,(100005,100006)
,(100006,100005)
,(100007,100006)
,(100008,0)
,(100009,100008)
,(100010,100013)
,(100011,100008)
,(100012,100011)
,(100013,100012)
,(100014,0)
,(100015,100016)
,(100016,100015)
,(100020,100022)
,(100021,100020)
,(100022,0)
UPDATE t1 SET CustomerLink = 0
FROM @Tmp t1
JOIN @Tmp t2
ON t2.CustomerID = t1.CustomerLink
AND t2.CustomerLink = t1.CustomerID
WHERE t1.CustomerID < t2.CustomerID
;WITH linkedC
AS
(
SELECT CustomerId, CustomerLink, NEWID() GRP
FROM @Tmp
WHERE CustomerLink = 0
UNION ALL
SELECT t.CustomerId, t.CustomerLink, GRP
FROM @Tmp t
JOIN linkedC l ON l.CustomerId = t.CustomerLink
)
SELECT MIN(CustomerId) OVER (PARTITION BY GRP) AS GroupID
,CustomerId
,CustomerLink
FROM linkedC
ORDER BY GroupId, CustomerId
Actually, you don't need to use NEWID() function for pre-grouping, CustomerID would work as well eg:
...
SELECT CustomerId, CustomerLink, CustomerId GRP
FROM @Tmp
WHERE CustomerLink = 0
...
October 7, 2014 at 5:40 am
Thanks! Eliminating the cross reference did the job. Your query delivers exactly what I was trying to achieve.
October 7, 2014 at 5:42 am
To identify circles
DECLARE @Tmp TABLE (CustomerID INT, CustomerLink INT, PRIMARY KEY(CustomerID))
INSERT @Tmp
VALUES(100001,0)
,(100002,100001)
,(100003,100001)
,(100004,100003)
,(100005,100006)
,(100006,100005)
,(100007,100006)
,(100008,0)
,(100009,100008)
,(100010,100013)
,(100011,100008)
,(100012,100011)
,(100013,100012)
,(100014,0)
,(100015,100016)
,(100016,100015)
;
WITH childrenandparents (child,parent) AS
(select '*'+CAST (CustomerID as varchar(20)) child, '*'+CAST (CustomerLink as varchar(20)) parent
from @Tmp) ,
GroupMembers (Bottom, Child, ParentGroup, Level, hierarchypath)
AS
(
-- Anchor member definition
SELECT g.child as Bottom, g.child, g.parent, 0 AS Level, convert(varchar(max), g.child + '->' + g.parent) AS hierarchypath
FROM childrenandparents AS g
UNION ALL
-- Recursive member definition
SELECT gm.bottom, g.child, g.parent, Level + 1, hierarchypath + '->' + g.parent
FROM childrenandparents as g
INNER JOIN GroupMembers AS gm
ON gm.parentgroup = g.child
where hierarchypath not like '%'+g.child +'->' + g.parent + '%'
)
select bottom, parentgroup, level, hierarchypath
from groupmembers
where hierarchypath like '%'+parentgroup + '->' + '%'
order by level desc
option(maxrecursion 10);
To eliminate circles we need a criteria which is correct of contradicting (100005,100006) ,(100006,100005).
October 7, 2014 at 5:54 am
Eugene Elutin (10/7/2014)
UPDATE t1 SET CustomerLink = 0
FROM @Tmp t1
JOIN @Tmp t2
ON t2.CustomerID = t1.CustomerLink
AND t2.CustomerLink = t1.CustomerID
WHERE t1.CustomerID < t2.CustomerID
...
Note this will not eliminate longer cross reference circles. Consider
,(100005,100006)
,(100006,100007)
,(100007,100005)
October 7, 2014 at 7:28 am
serg-52 (10/7/2014)
Eugene Elutin (10/7/2014)
UPDATE t1 SET CustomerLink = 0
FROM @Tmp t1
JOIN @Tmp t2
ON t2.CustomerID = t1.CustomerLink
AND t2.CustomerLink = t1.CustomerID
WHERE t1.CustomerID < t2.CustomerID
...
Note this will not eliminate longer cross reference circles. Consider
,(100005,100006)
,(100006,100007)
,(100007,100005)
Yep, for more complicated circualr reference (eg: (100105,100107),(100106,100107),(100107,100105))
, you need much more comrehensive resolution:
DECLARE @Tmp TABLE (CustomerID INT, CustomerLink INT, PRIMARY KEY(CustomerID))
INSERT @Tmp
VALUES (100001,0)
,(100002,100001)
,(100003,100001)
,(100004,100003)
,(100008,0)
,(100009,100008)
,(100010,100013)
,(100011,100008)
,(100012,100011)
,(100013,100012)
,(100014,0)
,(100020,100022)
,(100021,100020)
,(100022,0)
-- 1-level circular reference:
,(100015,100016)
,(100016,100015)
-- 2-level circular reference:
,(100005,100006)
,(100006,100007)
,(100007,100005)
-- mixed 2 level circular reference:
,(100105,100107)
,(100106,100107)
,(100107,100105)
-- comprehensive circular reference resolution
;WITH circleL
AS
(
SELECT CustomerId, CustomerLink, ',' + CAST(CustomerId AS NVARCHAR(MAX)) + ',' AllIds
FROM @Tmp
UNION ALL
SELECT l.CustomerId, t.CustomerLink, AllIds + ',' + CAST(t.CustomerId AS NVARCHAR(MAX)) + ','
FROM @Tmp t
JOIN circleL l ON l.CustomerLink = t.CustomerId
WHERE l.AllIds NOT LIKE '%,' + CAST(t.CustomerId AS NVARCHAR(MAX)) + ',%'
)
UPDATE T SET CustomerLink = 0
FROM @Tmp T
WHERE CustomerId IN (SELECT DISTINCT Q.MinId
FROM circleL L1
CROSS APPLY (SELECT MIN(CustomerId) FROM circleL L2 WHERE L2.AllIds LIKE '%,' + CAST(L1.CustomerId AS NVARCHAR(MAX)) + ',%') Q(MinId)
WHERE L1.CustomerId = L1.CustomerLink AND L1.CustomerLink != 0)
-- final recursive cte
;WITH linkedC
AS
(
SELECT CustomerId, CustomerLink, CustomerId GRP
FROM @Tmp
WHERE CustomerLink = 0
UNION ALL
SELECT t.CustomerId, t.CustomerLink, GRP
FROM @Tmp t
JOIN linkedC l ON l.CustomerId = t.CustomerLink
)
SELECT MIN(CustomerId) OVER (PARTITION BY GRP) AS GroupID
,CustomerId
,CustomerLink
FROM linkedC
ORDER BY GroupId, CustomerId
October 7, 2014 at 8:54 am
Hi both
Thanks again!
The latest solution works great. I gonna test it with our real-world data.
Regards, Reto E.
October 7, 2014 at 8:58 am
Hi both,
Thanks again!
The latest solution works great. I gonna test it with our real-world data.
Regards, Reto E.
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply