October 22, 2016 at 2:50 am
What query can I ran to show duplicate values? I know if I use the DISTINCT and UNION - this will remove duplicate values however I ONLY want the result set to display duplicate values.
The basic query I am running:
select names
from employee
All I need is the actual duplicate names to be displayed. Is this possible?
October 22, 2016 at 3:10 am
patelxx (10/22/2016)
What query can I ran to show duplicate values? I know if I use the DISTINCT and UNION - this will remove duplicate values however I ONLY want the result set to display duplicate values.The basic query I am running:
select names
from employee
All I need is the actual duplicate names to be displayed. Is this possible?
Quick suggestion which shows only the duplicated values and not the first instances of those values.
😎
USE TEEST;
GO
SET NOCOUNT ON;
;WITH SAMPLE_DATE(ID,NAME) AS
(
SELECT 1, 'Albert' UNION ALL
SELECT 2, 'Albert' UNION ALL
SELECT 3, 'Albert' UNION ALL
SELECT 4, 'Bryan' UNION ALL
SELECT 5, 'Bryan' UNION ALL
SELECT 6, 'Bryan' UNION ALL
SELECT 7, 'Chris' UNION ALL
SELECT 8, 'Chris' UNION ALL
SELECT 9, 'Chris'
)
,NUMBERED_INSTANCES AS
(
SELECT
SD.ID
,SD.NAME
,ROW_NUMBER() OVER
(
PARTITION BY SD.NAME
ORDER BY SD.ID
) AS RID
FROM SAMPLE_DATE SD
)
SELECT
NI.ID
,NI.NAME
FROM NUMBERED_INSTANCES NI
WHERE NI.RID > 1;
Output
ID NAME
--- ------
2 Albert
3 Albert
5 Bryan
6 Bryan
8 Chris
9 Chris
October 22, 2016 at 4:09 am
Many Thanks for the reply - it really worked for my results set, please can explain the below section of the code for my understanding:
NUMBERED_INSTANCES AS
(
SELECT
SD.ID
,SD.NAME
,ROW_NUMBER() OVER
(
PARTITION BY SD.NAME
ORDER BY SD.ID
) AS RID
FROM SAMPLE_DATE SD
October 22, 2016 at 5:05 am
patelxx (10/22/2016)
Many Thanks for the reply - it really worked for my results set, please can explain the below section of the code for my understanding:NUMBERED_INSTANCES AS
(
SELECT
SD.ID
,SD.NAME
,ROW_NUMBER() OVER
(
PARTITION BY SD.NAME
ORDER BY SD.ID
) AS RID
FROM SAMPLE_DATE SD
The row_number function enumerates the entries within each partition, restarting the enumeration when entering a new partition values. This means that the first instance of a key or "partition by" value will have the value of one which makes it easy to filter out later in the query.
😎
The reason for using a Common Table Expression (CTE) is that the window functions such as row_number cannot be used in the where clause.
October 22, 2016 at 8:58 am
Eirikur Eiriksson (10/22/2016)
patelxx (10/22/2016)
Many Thanks for the reply - it really worked for my results set, please can explain the below section of the code for my understanding:NUMBERED_INSTANCES AS
(
SELECT
SD.ID
,SD.NAME
,ROW_NUMBER() OVER
(
PARTITION BY SD.NAME
ORDER BY SD.ID
) AS RID
FROM SAMPLE_DATE SD
The row_number function enumerates the entries within each partition, restarting the enumeration when entering a new partition values. This means that the first instance of a key or "partition by" value will have the value of one which makes it easy to filter out later in the query.
😎
The reason for using a Common Table Expression (CTE) is that the window functions such as row_number cannot be used in the where clause.
A bonus of using this approach is if you want to delete the duplicates. You can delete directly from the CTE where the row number > 1. It won't apply to all situations where you have duplicates, but it's a good tool to have if you need it.
October 24, 2016 at 12:29 pm
To list just the dups:
SELECT names /*, COUNT(*) AS dup_count*/
FROM employee
GROUP BY names
HAVING COUNT(*) > 1
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply