March 5, 2012 at 3:37 pm
Hi all,
I have the following rows:
ID  VALUE1  VALUE2  TYPE
1   1       3       1
2   15      10      4
3   10      15      7
4   10      15      7
5   15      10      8
6   3       2       12
In my case, I consider rows duplicated if:
- VALUE1 and VALUE2 are equal to VALUE1 and VALUE2 in another row (example: rows 3 and 4)
- VALUE1 is equal to VALUE2 and VALUE2 is equal to VALUE1 in another row (example: rows 2 and 3)
So I need a script to execute on SQL Server 2008 that deletes these duplicated rows, to get:
ID  VALUE1  VALUE2  TYPE
1   1       3       1
2   15      10      4
6   3       2       12
Thanks in advance.
March 5, 2012 at 3:47 pm
One way to do it is to use a self-join, and if you had provided the script to create your data, I could give you the desired query. Anyway, it would look like this:
DELETE t1
FROM dbo.YourTable t1
JOIN dbo.YourTable t2
ON t1.c2 = t2.c2
AND t1.c3 = t2.c3
AND t1.c1 > t2.c1
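As posted, that join only catches exact duplicates. To also remove the reversed pairs (rows 3 and 4 versus row 2 in the sample), the join condition can be widened. A sketch using the column names from the original post, with the table name assumed:

```
-- Delete any later row that duplicates an earlier row, either exactly
-- or with VALUE1/VALUE2 swapped. Table name dbo.YourTable is assumed.
DELETE t1
FROM dbo.YourTable AS t1
JOIN dbo.YourTable AS t2
  ON (   (t1.VALUE1 = t2.VALUE1 AND t1.VALUE2 = t2.VALUE2)   -- exact duplicate
      OR (t1.VALUE1 = t2.VALUE2 AND t1.VALUE2 = t2.VALUE1) ) -- swapped duplicate
 AND t1.ID > t2.ID;  -- keep the copy with the lowest ID
```

The `t1.ID > t2.ID` predicate is what preserves one copy of each group: a row is only deleted when some earlier-ID row matches it.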
March 5, 2012 at 3:48 pm
Going by your logic, shouldn't row 2 be deleted also?
March 5, 2012 at 3:52 pm
ColdCoffee (3/5/2012)
Going by your logic, shouldnt row 2 be deleted also?
No, I need to keep one copy of each duplicated row.
Thank you.
March 5, 2012 at 5:01 pm
This is one way... but almost certainly not the best way. It works OK on a million rows:
BEGIN TRAN ; -- just for repeatability

CREATE TABLE #DATA (ID INT, VALUE1 INT, VALUE2 INT, TYPE INT);

INSERT #DATA (ID, VALUE1, VALUE2, TYPE)
SELECT 1,  1,  3,  1 UNION ALL
SELECT 2, 15, 10,  4 UNION ALL
SELECT 3, 10, 15,  7 UNION ALL
SELECT 4, 10, 15,  7 UNION ALL
SELECT 5, 15, 10,  8 UNION ALL
SELECT 6,  3,  2, 12 ;

-- add a million test rows
INSERT #DATA (ID, VALUE1, VALUE2, TYPE)
SELECT TOP 1000000
       6 + ROW_NUMBER() OVER (ORDER BY @@SPID),
       ABS(CHECKSUM(NEWID())) % 100,
       ABS(CHECKSUM(NEWID())) % 100,
       0
FROM syscolumns a, syscolumns b ;

-- This is the part that does the de-duping...
;WITH data_cte AS
(
    SELECT ID, VALUE1, VALUE2, TYPE,
           ROW_NUMBER() OVER (PARTITION BY V1, V2 ORDER BY ID) AS rownum
    FROM #DATA
    CROSS APPLY (
        SELECT CASE WHEN VALUE1 < VALUE2 THEN VALUE1 ELSE VALUE2 END,
               CASE WHEN VALUE1 < VALUE2 THEN VALUE2 ELSE VALUE1 END
    ) AS ca (V1, V2)
)
DELETE
FROM data_cte
WHERE rownum <> 1 ;

SELECT ID, VALUE1, VALUE2, TYPE
FROM #DATA ;

ROLLBACK ;
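The CROSS APPLY above is just one way to build the canonical (smallest, largest) pair for each row. The same CASE expressions can also sit directly inside the OVER clause, which avoids the derived table; a sketch, assuming the same #DATA table:

```
-- Same de-duplication, with the ordered pair computed inline in PARTITION BY.
-- Rows sharing the same unordered {VALUE1, VALUE2} pair land in one partition,
-- and every row after the first (by ID) is deleted through the CTE.
WITH data_cte AS
(
    SELECT ID,
           ROW_NUMBER() OVER (
               PARTITION BY CASE WHEN VALUE1 < VALUE2 THEN VALUE1 ELSE VALUE2 END,
                            CASE WHEN VALUE1 < VALUE2 THEN VALUE2 ELSE VALUE1 END
               ORDER BY ID) AS rownum
    FROM #DATA
)
DELETE FROM data_cte
WHERE rownum > 1;
```

Which form performs better will depend on the plan; both rely on deleting through an updatable CTE over a single base table.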
MM
March 8, 2012 at 6:03 am
mister.magoo (3/5/2012)
This is one way...but almost certainly not the best way....works ok on a million rows
Hi mister.magoo!
I have tested this example and it solves my problem.
Thank you very much for your help.