July 25, 2016 at 10:37 am
Hello,
I have a table with 5 columns
A B C D E
1 test test0 test2 test3
1 test1 test1 test3 test4
1 test1 test2 test3 test5
1 test1 test3 test3 test4
I need to find the duplicates from the above excluding column C
so I would need output as :
A B C D E
1 test1 test1 test3 test4
1 test1 test3 test3 test4
Thanks.
July 25, 2016 at 10:54 am
--Removed
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
July 25, 2016 at 11:03 am
CREATE TABLE #SampleDuplicates(
A int,
B varchar(10),
C varchar(10),
D varchar(10),
E varchar(10)
);
INSERT INTO #SampleDuplicates
VALUES
(1, 'test ', 'test0', 'test2', 'test3'),
(1, 'test1', 'test1', 'test3', 'test4'),
(1, 'test1', 'test2', 'test3', 'test5'),
(1, 'test1', 'test3', 'test3', 'test4');
GO
WITH cteSampleDuplicates AS(
SELECT *, COUNT(*) OVER(PARTITION BY A,B,D,E) rowcnt
FROM #SampleDuplicates o
)
SELECT A,B,C,D,E
FROM cteSampleDuplicates
WHERE rowcnt > 1
GO
DROP TABLE #SampleDuplicates
July 25, 2016 at 11:07 am
Here's an alternative – not as elegant, but may perform better:
IF OBJECT_ID('tempdb..#tab1', 'U') IS NOT NULL
DROP TABLE #tab1;
CREATE TABLE #tab1
(
A INT
,B VARCHAR(10)
,C VARCHAR(10)
,D VARCHAR(10)
,E VARCHAR(10)
);
INSERT #tab1
(A, B, C, D, E)
VALUES (1, 'test', 'test0', 'test2', 'test3'),
(1, 'test1', 'test1', 'test3', 'test4'),
(1, 'test1', 'test2', 'test3', 'test5'),
(1, 'test1', 'test3', 'test3', 'test4');
SELECT *
FROM #tab1 t
WHERE EXISTS ( SELECT 1
FROM #tab1 tx
WHERE tx.A = t.A
AND tx.B = t.B
AND tx.D = t.D
AND tx.E = t.E
GROUP BY tx.A
, tx.B
, tx.D
, tx.E
HAVING COUNT(1) > 1 );
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
July 25, 2016 at 12:54 pm
Thank you this worked.
Other question would be how to find the count of unique values
July 25, 2016 at 1:04 pm
July 25, 2016 at 1:15 pm
Got it. I can just do distinct count.
Thank you.
July 26, 2016 at 5:31 am
Piling on, one of the fastest ways of doing this is to use Window Function with the appropriate POC index
😎
IF OBJECT_ID('tempdb..#tab1', 'U') IS NOT NULL
DROP TABLE #tab1;
CREATE TABLE #tab1
(
A INT
,B VARCHAR(10)
,C VARCHAR(10)
,D VARCHAR(10)
,E VARCHAR(10)
);
INSERT #tab1
(A, B, C, D, E)
VALUES (1, 'test', 'test0', 'test2', 'test3'),
(1, 'test1', 'test1', 'test3', 'test4'),
(1, 'test1', 'test2', 'test3', 'test5'),
(1, 'test1', 'test3', 'test3', 'test4');
CREATE NONCLUSTERED INDEX NCLIDX#tab1_A_B_D_E_INCL_C ON #tab1(A ASC, B ASC, D ASC, E ASC) INCLUDE (C);
SELECT
SD.A
,SD.B
,SD.C
,SD.D
,SD.E
,ROW_NUMBER() OVER
(
PARTITION BY SD.A
,SD.B
,SD.D
,SD.E
ORDER BY SD.A
)
FROM #tab1 SD;
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply