Deleting only one of rows among duplicate rows

Question

Post reply

Deleting only one of rows among duplicate rows

suchi

Newbie

Points: 9
More actions
October 8, 2005 at 8:04 pm

#363914

hi
i have two rows with redundant data and i would like to delete only top record .Please let me know the best approach to do it .
thanks
sa

Viewing 9 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

Ray M SSC-Insane Points: 21093 More actions · Answer 1

The only way you can do it is if your table has a unique primary key for each duplicate.

delete mytable

Where PK in (select min(a.pk)

From mytable a

join mytable b on a.dupcol = b.dupcol

where a.pk <> b.pk)

or if you do not have a unique primay key you need to

select distinct into temp table

truncate table

insert rows from temp table

Dinesh Asanka SSChampion Points: 11058 More actions · Answer 2

There are few good article on this subject in this site. search for them. You will find them usefull.

My Blog: http://dineshasanka.spaces.live.com/

EdVassie SSC Guru Points: 60545 More actions · Answer 3

Try DELETE .... WHERE CURRENT OF ...cursor. See Positioned Delete in BOL. This will allow you to delete a single row from a set of tue duplicates.

If you can identify the row to delete with a WHERE clause, it would probably be better to use DELETE ... WHERE than to use a cursor. However, for true duplicates you need a positioned delete.

Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017 2016, 2014, 2012, 2008 R2, 2008 and 2005.

When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara

Jeff Moden SSC Guru Points: 1004703 More actions · Answer 4

Actualy, if you do a search for "DELETE DUPLICATES", you come up with just a couple of articles which use cursors and group bys and adding columns and using MAX/GROUP BY... and ... and... and... none of them appear to be straight forward.

So here's a straight forward, scalable approach (as a generic boilerplate). On my humble home computer, it will find and delete 400 duplicate rows in a million record table in less than a minute.

First, the disclaimer:

THIS SCRIPT DELETES DATA! DO NOT TEST ON PRODUCTION DATA! DO NOT USE ON PRODUCTION DATA UNLESS YOU HAVE A VIABLE AND RESTORABLE BACKUP FOR YOUR DATA. I AM NOT AND CANNOT BE MADE TO BE RESPONSIBLE FOR YOUR DATA IN ANY WAY, SHAPE, OR FORM!

DELETE

FROM tablename

WHERE tablename.ID IN

--The following SELECT statement acts as a derived table. The DISTINCT term is

--required because if there are 3 or more dupes for any given fields, the return

--will grow, exponentially.

(

SELECT DISTINCT t1.ID

FROM tablename AS t1

INNER JOIN tablename AS t2

-- These "ON" conditions contain the comparisons of the fields to be checked for dupes

ON t1.dupchkfield1 = t2.dupchkfield1

AND t1.dupchkfield2 = t2.dupchkfield2

--AND t1.dupchkfield3 = t2.dupchkfield3 --Add additional fields like this

--AND t1.dupchkfield4 = t2.dupchkfield4 --Add additional fields like this

-- This "ON" condition contains the comparison of the field to be checked for age.

-- "Age" can be based on date/datetime/time or auto-incrementing ID.

-- In all cases, the earlier/lower the value, the older the record is compared to others.

-- The comparison operator (">" in this example) determines which duplicate will be kept.

-- The functionality of this example can be changed by changing the comparison operator to...

-- ">" KEEPS only the OLDEST or EARLIEST record if duplicates are detected (OLD IS GREATer)

-- "<" KEEPS only the YOUNGEST or LATEST record if duplicates are detected (YOUNG is LESS-THAN OLD)

-- "<>" REMOVES ALL DUPLICATED RECORDS INCLUDING YOUNGEST AND OLDEST (be careful)

AND t1.chronologyfield > t2.chronologyfield

_________________________________________________________________________________________________________________

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Dylan Peters SSC Eights! Points: 916 More actions · Answer 5

Another way, if you don't have a primary key and don't want to use a cursor or group by:

SET ROWCOUNT 1

DELETE My Table WHERE field1 = 'DUPE'

SET ROWCOUNT 0

Dylan Peters
SQL Server DBA

Robert Newnham Valued Member Points: 71 More actions · Answer 6

--I have found that the following script works well

/* If your table is called "TheTable" and three fields "F1", "F2" and "F3" */

/*This needs to be done within a transaction. Just in case something goes wrong*/

BEGIN TRAN

/* Save Distinct Versions of the Duplicate Rows */

SELECT DISTINCT F1, F2, F3

INTO #TheTable

FROM TheTable T1

WHERE EXISTS (SELECT F1, F2, F3, Count(*)

FROM TheTable T2

WHERE T2.F1 = T1.F1

AND T2.F2 = T1.F2

AND T2.F3 = T1.F3

GROUP BY F1, F2, F3

HAVING Count(*) > 1)

/* DELETE the Duplicate Rows */

DELETE T1

FROM TheTable T1

WHERE EXISTS (SELECT F1, F2, F3, Count(*)

FROM TheTable T2

WHERE T2.F1 = T1.F1

AND T2.F2 = T1.F2

AND T2.F3 = T1.F3

GROUP BY F1, F2, F3

HAVING Count(*) > 1)

/* Restore one the saved Rows */

INSERT INTO TheTable (F1, F2, F3)

SELECT F1, F2, F3

FROM #TheTable

DROP TABLE #TheTable

COMMIT TRAN

--Hope this helps

--Robert

Jeff Moden SSC Guru Points: 1004703 More actions · Answer 7

See what I mean?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Robert Newnham Valued Member Points: 71 More actions · Answer 8

Hi Jeff,

I tried your example on table which had three fields "F1", "F2" and "F3" with no primary or unique key. I could not get it to work. Whereas I did with mine. I must be doing something wrong. Could you please enpart some more of your wisdom and suggest where I might be using your simpler approach incorrectly.

Cheers

Robert