April 5, 2017 at 9:42 am
Hi all,
I've been racking my brain all afternoon with this one. In their wisdom, the developers of this database many years ago decided that when linking two records a separate record in a Links table would be needed for the 2 way.
i.e.
If ClientA was to be linked to ClientB, then 2 records would be created in the Links table:
-----------------------------
ClientRef | LinkedClientRef
-----------------------------
10000001 | 10000002
-----------------------------
10000002 | 10000001
-----------------------------
10000003 | 10000004
-----------------------------
10000005 | 10000006
-----------------------------
This is the table is its basic formCREATE TABLE Links (
ClientRef INT,
LinkedClientRef INT
)
Where this exists I'm trying to only select one of the records. In other words, the result I want to return is :
-----------------------------
ClientRef | LinkedClientRef
-----------------------------
10000001 | 10000002
-----------------------------
10000003 | 10000004
-----------------------------
10000005 | 10000006
-----------------------------
The trouble is this table doesn't have a primary key and no other unique column I can use a window function on to maybe assign a row number where the combination ClientRef|LinkedClientRef matches the reverse combination LinkedClientRef|ClientRef
I've been looking at it so long now I think I've lost the plot :crazy:
Is anybody able to help at all?
Thanks in advance
Regards
Steve
April 5, 2017 at 9:57 am
for these records
10000003 | 10000004
-----------------------------
10000005 | 10000006
why arent there corresponding reverse entries?.......or am I misunderstanding your post?
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
April 5, 2017 at 10:08 am
This will remove the duplicates
-- Create a table to hold the data
CREATE TABLE #Links (
ClientRef INT
, LinkedClientRef INT
);
-- Create the data
INSERT INTO #Links ( ClientRef, LinkedClientRef )
VALUES ( 10000001, 10000002 )
, ( 10000002, 10000001 )
, ( 10000003, 10000004 )
, ( 10000005, 10000006 );
-- Check the data before starting
SELECT *
FROM #Links AS cr;
-- Delete the duplicates
WITH cteDuplicates AS (
SELECT
ClientRef = cr.LinkedClientRef
, LinkedClientRef = cr.ClientRef
FROM #Links AS cr
)
DELETE dup
FROM #Links AS cr
INNER JOIN cteDuplicates AS dup
ON cr.ClientRef = dup.ClientRef
AND cr.LinkedClientRef = dup.LinkedClientRef
-- Check the data after deleting the duplicates
SELECT *
FROM #Links AS cr;
April 5, 2017 at 10:27 am
To select the records without deleting the duplicates
WITH cteBaseData AS (
SELECT
cr.ClientRef
, cr.LinkedClientRef
, rn = ROW_NUMBER() OVER (ORDER BY cr.ClientRef, cr.LinkedClientRef)
FROM #Links AS cr
)
, cteDuplicates AS (
SELECT
ClientRef = cr.LinkedClientRef
, LinkedClientRef = cr.ClientRef
, cr.rn
FROM cteBaseData AS cr
)
SELECT
cr.ClientRef
, cr.LinkedClientRef
FROM cteBaseData AS cr
LEFT JOIN cteDuplicates AS dup
ON cr.ClientRef = dup.ClientRef
AND cr.LinkedClientRef = dup.LinkedClientRef
WHERE cr.rn <= ISNULL(dup.rn, cr.rn)
April 5, 2017 at 10:32 am
Thanks for that, but it's not quite there. I still wish to retain one of the 2 duplicates. This code removes both rows. I need one of the rows to remain. I'll try and adapt your code to see if I can make it work.
Regards
Steve
April 5, 2017 at 10:44 am
Presumably the lower number is always first for single entries (if not, fix that! 🙂 ).
SELECT L.*
FROM Links L
WHERE L.ClientRef < L.LinkedClientRef
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
April 5, 2017 at 10:59 am
Would something like this help?
SELECT DISTINCT
CASE WHEN ClientRef <= LinkedClientRef THEN ClientRef ELSE LinkedClientRef END AS ClientRef,
CASE WHEN ClientRef <= LinkedClientRef THEN LinkedClientRef ELSE ClientRef END AS LinkedClientRef
FROM #Links AS cr;
Of Course, you should fix the data instead of the query.
April 5, 2017 at 11:24 am
smw147 - Wednesday, April 5, 2017 10:32 AMThanks for that, but it's not quite there. I still wish to retain one of the 2 duplicates. This code removes both rows. I need one of the rows to remain. I'll try and adapt your code to see if I can make it work.
The code that I posted selects the 1st duplicate.
If there are multiple duplicatse, then you can add a GROUP BY
-- Create a table to hold the data
CREATE TABLE #Links (
rowNum INT
, ClientRef INT
, LinkedClientRef INT
);
-- Create the data
INSERT INTO #Links ( rowNum, ClientRef, LinkedClientRef )
VALUES ( 1, 10000001, 10000002 )
, ( 2, 10000002, 10000001 )
, ( 3, 10000003, 10000004 )
, ( 4, 10000005, 10000006 );
-- Check the data before starting
SELECT *
FROM #Links AS cr;
-- Select all the unique records
-- This will only exclude rowNum=2
WITH cteBaseData AS (
SELECT
cr.ClientRef
, cr.LinkedClientRef
, rn = ROW_NUMBER() OVER (ORDER BY cr.ClientRef, cr.LinkedClientRef)
FROM #Links AS cr
)
, cteDuplicates AS (
SELECT
ClientRef = cr.LinkedClientRef
, LinkedClientRef = cr.ClientRef
, cr.rn
FROM cteBaseData AS cr
)
SELECT
cr.ClientRef
, cr.LinkedClientRef
FROM cteBaseData AS cr
LEFT JOIN cteDuplicates AS dup
ON cr.ClientRef = dup.ClientRef
AND cr.LinkedClientRef = dup.LinkedClientRef
WHERE cr.rn <= ISNULL(dup.rn, cr.rn)
--GROUP BY cr.ClientRef, cr.LinkedClientRef
Results
ClientRef LinkedClientRef
----------- ---------------
10000001 10000002
10000003 10000004
10000005 10000006
April 11, 2017 at 3:36 am
J Livingston SQL - Wednesday, April 5, 2017 9:57 AMfor these records10000003 | 10000004
-----------------------------
10000005 | 10000006why arent there corresponding reverse entries?.......or am I misunderstanding your post?
It's the way the software was designed many years ago before my time I'm afraid 🙁
Luis Cazares - Wednesday, April 5, 2017 10:59 AMWould something like this help?
SELECT DISTINCT
CASE WHEN ClientRef <= LinkedClientRef THEN ClientRef ELSE LinkedClientRef END AS ClientRef,
CASE WHEN ClientRef <= LinkedClientRef THEN LinkedClientRef ELSE ClientRef END AS LinkedClientRef
FROM #Links AS cr;
Of Course, you should fix the data instead of the query.
Sorry for the delay in responding, but Ive only just logged on after a few days away at a conference.
@luis Cazares - I actually did something similar to this in the end.
Thanks for all the responses 🙂
Regards
Steve
April 11, 2017 at 12:15 pm
DesNorton - Wednesday, April 5, 2017 10:08 AMThis will remove the duplicates
-- Create a table to hold the data
CREATE TABLE #Links (
ClientRef INT
, LinkedClientRef INT
);-- Create the data
INSERT INTO #Links ( ClientRef, LinkedClientRef )
VALUES ( 10000001, 10000002 )
, ( 10000002, 10000001 )
, ( 10000003, 10000004 )
, ( 10000005, 10000006 );-- Check the data before starting
SELECT *
FROM #Links AS cr;-- Delete the duplicates
WITH cteDuplicates AS (
SELECT
ClientRef = cr.LinkedClientRef
, LinkedClientRef = cr.ClientRef
FROM #Links AS cr
)
DELETE dup
FROM #Links AS cr
INNER JOIN cteDuplicates AS dup
ON cr.ClientRef = dup.ClientRef
AND cr.LinkedClientRef = dup.LinkedClientRef-- Check the data after deleting the duplicates
SELECT *
FROM #Links AS cr;
>> In their wisdom, the developers of this database many years ago decided that when linking [sic] two records [sic] a separate record [sic] in a Links table would be needed for the 2 way. <<
Did you ever watch a television show called "Malcolm in the middle"? In one of the episodes one of the sons and the family has gone off to Alaska, and is surrounded by total absurd maniacs in a logging camp. One of the characters in the camp is a grisly old mountain man, who keeps talking about "grizzly bear eggs" when he's trying to explain life here being attacked by a grizzly bear – she's just protecting her nest!
The concept of linking is not part of RDBMS. It's part of the old network databases which used pointer chains (links) and not the RDBMS concept up references and relationships. You've just described grizzly bear eggs in database terms.
>>If ClientA was to be linked [sic] to ClientB, then 2 records [sic] would be created in the Links [sic] table:<<
CREATE TABLE Recommendations
(first_client_id CHAR(15) NOT NULL ,
second_client_id CHAR(15) NOT NULL ,
CHECK (first_client_id < second_client_id)
PRIMARY KEY (first_client_id, second_client_id)
);
Now create a view with
CREATE VIEW Symetric
AS
SELECT first_client_id, second_client_id
FROM Recommendations
UNION
SELECT second_client_id, first_client_id
FROM Recommendations
Please post DDL and follow ANSI/ISO standards when asking for help.
April 12, 2017 at 2:58 am
jcelko212 32090 - Tuesday, April 11, 2017 12:15 PMDesNorton - Wednesday, April 5, 2017 10:08 AMThis will remove the duplicates
-- Create a table to hold the data
CREATE TABLE #Links (
ClientRef INT
, LinkedClientRef INT
);-- Create the data
INSERT INTO #Links ( ClientRef, LinkedClientRef )
VALUES ( 10000001, 10000002 )
, ( 10000002, 10000001 )
, ( 10000003, 10000004 )
, ( 10000005, 10000006 );-- Check the data before starting
SELECT *
FROM #Links AS cr;-- Delete the duplicates
WITH cteDuplicates AS (
SELECT
ClientRef = cr.LinkedClientRef
, LinkedClientRef = cr.ClientRef
FROM #Links AS cr
)
DELETE dup
FROM #Links AS cr
INNER JOIN cteDuplicates AS dup
ON cr.ClientRef = dup.ClientRef
AND cr.LinkedClientRef = dup.LinkedClientRef-- Check the data after deleting the duplicates
SELECT *
FROM #Links AS cr;>> In their wisdom, the developers of this database many years ago decided that when linking [sic] two records [sic] a separate record [sic] in a Links table would be needed for the 2 way. <<
Did you ever watch a television show called "Malcolm in the middle"? In one of the episodes one of the sons and the family has gone off to Alaska, and is surrounded by total absurd maniacs in a logging camp. One of the characters in the camp is a grisly old mountain man, who keeps talking about "grizzly bear eggs" when he's trying to explain life here being attacked by a grizzly bear – she's just protecting her nest!
The concept of linking is not part of RDBMS. It's part of the old network databases which used pointer chains (links) and not the RDBMS concept up references and relationships. You've just described grizzly bear eggs in database terms.
>>If ClientA was to be linked [sic] to ClientB, then 2 records [sic] would be created in the Links [sic] table:<<
CREATE TABLE Recommendations
(first_client_id CHAR(15) NOT NULL ,
second_client_id CHAR(15) NOT NULL ,
CHECK (first_client_id < second_client_id)
PRIMARY KEY (first_client_id, second_client_id)
);Now create a view with
CREATE VIEW Symetric
AS
SELECT first_client_id, second_client_id
FROM Recommendations
UNION
SELECT second_client_id, first_client_id
FROM Recommendations
Sorry, I will post the table DDL in future.
Regards
Steve
April 12, 2017 at 11:02 am
Here's another set of options...
IF OBJECT_ID('tempdb..#Links','U') IS NOT NULL
DROP TABLE #Links;
CREATE TABLE #Links (
ClientRef INT,
LinkedClientRef INT
);
-- Create the data
INSERT INTO #Links (ClientRef,LinkedClientRef)
VALUES
(10000001,10000002),
(10000002,10000001),
(10000003,10000004),
(10000005,10000006);
--==============================================
-- select valid links...
SELECT
l1.ClientRef,
l1.LinkedClientRef
FROM
#Links l1
WHERE
NOT EXISTS (
SELECT 1
FROM #Links l2
WHERE
l1.ClientRef = l2.LinkedClientRef
AND l1.LinkedClientRef = l2.ClientRef
AND l2.ClientRef < l2.LinkedClientRef
);
--==============================================
-- delete invalid links...
DELETE l1
FROM
#Links l1
WHERE
EXISTS (
SELECT 1
FROM #Links l2
WHERE
l1.ClientRef = l2.LinkedClientRef
AND l1.LinkedClientRef = l2.ClientRef
AND l2.ClientRef < l2.LinkedClientRef
);
April 12, 2017 at 12:05 pm
To add a little more detail to my first recommendation, I think you should be able to do this:
1) Update any rows where link1 > link2 to swap the links. Then you know that link1 always < link2.
2) Add a check constraint to make sure that link1 is always less than link2.
3) Remove any duplicates in the existing data.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply