deleting duplicates, even though different key values

Question

deleting duplicates, even though different key values

DaveBriCam

SSCarpal Tunnel

Points: 4298
More actions
March 26, 2019 at 11:26 am

#2031647

Is there a way of deleting duplicates, even though different key values? e.g.:
123 Mary Kay
124 Mary Kay
125 Bob Incus
126 Bob Incus
get rid of only one of the duplicates like leave 123, delete 124

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply

ScottPletcher SSC Guru Points: 101098 More actions · Answer 1

Yes.

;WITH cte_dups AS ( SELECT *, ROW_NUMBER() OVER(PARTITION BY name ORDER BY id DESC) AS row_num /* keep the HIGHEST/LAST id */ FROM dbo.table_name ) DELETE FROM cte_dups WHERE row_num > 1

Naturally ORDER BY id ASC to keep the first/lowest id.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

DaveBriCam SSCarpal Tunnel Points: 4298 More actions · Answer 2

ScottPletcher - Tuesday, March 26, 2019 11:39 AM
Yes.
;WITH cte_dups AS ( SELECT *, ROW_NUMBER() OVER(PARTITION BY name ORDER BY id DESC) AS row_num /* keep the HIGHEST/LAST id */ FROM dbo.table_name ) DELETE FROM cte_dups WHERE row_num > 1
Naturally ORDER BY id ASC to keep the first/lowest id.

Thanks... within this code how would one concatenate fields that would determine a dupe?: like FirstName + LastName + City

ScottPletcher SSC Guru Points: 101098 More actions · Answer 3

briancampbellmcad - Tuesday, March 26, 2019 11:47 AM
ScottPletcher - Tuesday, March 26, 2019 11:39 AM
Yes.
;WITH cte_dups AS ( SELECT *, ROW_NUMBER() OVER(PARTITION BY name ORDER BY id DESC) AS row_num /* keep the HIGHEST/LAST id */ FROM dbo.table_name ) DELETE FROM cte_dups WHERE row_num > 1
Naturally ORDER BY id ASC to keep the first/lowest id.
Thanks... within this code how would one concatenate fields that would determine a dupe?: like FirstName + LastName + City

Adjust the PARTITION BY clause:
...
PARTITION BY FirstName, LastName, City
...

Note that actual concatenation is risky, esp. with names. For example:
fn = ric, ln = king
with concat'd names becomes "equal" to:
fn = rick, ln = ing

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

DaveBriCam SSCarpal Tunnel Points: 4298 More actions · Answer 4

ScottPletcher - Tuesday, March 26, 2019 12:02 PM
briancampbellmcad - Tuesday, March 26, 2019 11:47 AM
ScottPletcher - Tuesday, March 26, 2019 11:39 AM
Yes.
;WITH cte_dups AS ( SELECT *, ROW_NUMBER() OVER(PARTITION BY name ORDER BY id DESC) AS row_num /* keep the HIGHEST/LAST id */ FROM dbo.table_name ) DELETE FROM cte_dups WHERE row_num > 1
Naturally ORDER BY id ASC to keep the first/lowest id.
Thanks... within this code how would one concatenate fields that would determine a dupe?: like FirstName + LastName + City
Adjust the PARTITION BY clause:
...
PARTITION BY FirstName, LastName, City
...

Thanks!! Works perfectly

Jeffrey Williams SSC Guru Points: 90303 More actions · Answer 5

briancampbellmcad - Tuesday, March 26, 2019 11:26 AM
Is there a way of deleting duplicates, even though different key values? e.g.:
123 Mary Kay
124 Mary Kay
125 Bob Incus
126 Bob Incus
get rid of only one of the duplicates like leave 123, delete 124

You can use row_number to identify the duplicate values...

SELECT ID, FirstName, LastName, row_number() OVER(PARTITION BY FirstName, LastName ORDER BY ID desc) AS rn
FROM ...
WHERE ...

Once you have the duplicates identified...you can put the code in a CTE and perform the delete:

WITH dups
AS (
SELECT ID, FirstName, LastName, row_number() OVER(PARTITION BY FirstName, LastName ORDER BY ID desc) AS rn
FROM ...
WHERE ...
)
DELETE FROM dups WHERE rn > 1;

To be safe - wrap the above in an explicit transaction with a rollback and only commit the transaction when you are sure it is working.

BEGIN TRANSACTION;

WITH dups
AS (
SELECT ID, FirstName, LastName, row_number() OVER(PARTITION BY FirstName, LastName ORDER BY ID desc) AS rn
FROM ...
WHERE ...
)
DELETE FROM dups WHERE rn > 1;

ROLLBACK TRANSACTION;
--COMMIT TRANSACTION; --uncomment this and comment out the rollback when you are sure it is working

Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

― Charles R. Swindoll

How to post questions to get better answers faster
Managing Transaction Logs