This is just a short post that uses a contrived example to demonstrate how to find duplicate records in a table. I needed to identify some duplicate records for a supplier recently and I wanted to make some notes on what I did for future reference.
First I will create a very simple table for this example
CREATE
TABLE [dbo].[Dups]
(
[FirstName] [nvarchar](50)
NULL,
[lastName] [nvarchar](50)
NULL,
[Company] [nvarchar](50)
NULL
)
ON [PRIMARY]
I will then add some example data, including some duplicate rows:
INSERT
INTO [C_AVG].[dbo].[Dups]
(
[FirstName],
[lastName],
[Company]
)
VALUES (
'Gethyn',
'Ellis',
'GRE'
)
INSERT
INTO [C_AVG].[dbo].[Dups]
(
[FirstName],
[lastName],
[Company]
)
VALUES (
'Lisa',
'Ellis',
'GRE'
)
INSERT
INTO [C_AVG].[dbo].[Dups]
(
[FirstName],
[lastName],
[Company]
)
VALUES (
'Ron',
'Ellis',
'GRE'
)
INSERT
INTO [C_AVG].[dbo].[Dups]
(
[FirstName],
[lastName],
[Company]
)
VALUES (
'Lisa',
'Ellis',
'GRE'
)
INSERT
INTO [C_AVG].[dbo].[Dups]
(
[FirstName],
[lastName],
[Company]
)
VALUES (
'Lisa',
'Ellis',
'GRE'
)
When I run a very simple select against this table I get the following output:
Gethyn Ellis GRE
Lisa Ellis GRE
Ron Ellis GRE
Lisa Ellis GRE
as we can see, Lisa is included in this table twice this but if you had a table with a couple million rows in this table and you suspected that it had duplicates spotting the duplicates maybe a little more difficult the following script will identify them for you:
This shows that
SELECT FirstName, lastName, Company FROM dups
GROUP
BY FirstName, lastName, Company
HAVING (COUNT
(*)
> 1)
This returns all the duplicate entries:
Lisa Ellis GRE
This only identifies the rows that exist more than once, cleaning up duplicates through deletion will be covered in another post.