Delete duplicate rows in a Table?

Question

Delete duplicate rows in a Table?

rajagopalanseeth

SSC Enthusiast

Points: 152
More actions
March 29, 2010 at 10:46 am

#138035

Hi,
I am new to SQL. I am using SQL2005. I have a table that has the following structure
SET ANSI_NULLS OFF
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING OFF
GO
CREATE TABLE [dbo].[folders_rel](
[id] [varchar](36) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[folder_id] [varchar](36) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[polymorphic_module] [varchar](25) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[polymorphic_id] [varchar](36) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[deleted] [bit] NULL DEFAULT ('0'),
CONSTRAINT [pk_folders_rel] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
There is a one-many relatioship between the "folder_id" and "polymorphic_id". The id that is generated is unique. I was migrating my datas and some duplicate rows has been inserted. It has inserted the same Folder_id and polymorphic_id more than once.. For example
id folder_id polymorphic_module polymorphic_id deleted
- -------- ------------------- ------------- -------
1 Aug Folder ed1 0
2 Aug Folder ed2 0
3 Sep Folder ed1 0
4 Sep Folder ed3 0
5 Aug Folder ed1 0
6 Aug Folder ed1 0
7 Sep Folder ed3 0
In the above example, u can see the duplicate rows has been created that has the same folder_id and polymorphic_id. That is just a sample example, i have more that 20,000 rows in the table. I want to delete duplicate rows. if there is 3 duplicate records found, i want to delete only 2 records and keep the 1 record. How to do it?
Thanks
Raja

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply

LutzM SSC Guru Points: 107049 More actions · Answer 1

Did you try to search for "delete duplicate rows" on this site (using the search box on the upper right corner of this site).

You probably will find this link as one reference:

http://www.sqlservercentral.com/Forums/Topic652644-338-1.aspx

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

Trouble Shooter SSC Eights! Points: 902 More actions · Answer 2

Hi,

Use

set rowcount 2

for delete top 2 rows of matching criteria

Paul White SSC Guru Points: 150468 More actions · Answer 3

rajagopalanseeth (3/29/2010)
I was migrating my datas and some duplicate rows has been inserted.

Create a unique constraint or index to prevent this happening again.

Ignore the advice about SET ROWCOUNT - it won't work.

Follow the link posted by Lutz.

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

Trouble Shooter SSC Eights! Points: 902 More actions · Answer 4

Trouble Shooter

SSC Eights!

Points: 902

April 12, 2010 at 12:00 am

#1149821

Hi ,

Did u try it Mr. SSCertifiable ?

Paul White SSC Guru Points: 150468 More actions · Answer 5

Trouble Shooter (4/12/2010)
Did u try it Mr. SSCertifiable?

If you mean me - and my name is Paul - then no. There was nothing to try.

The original post said:

"That is just a sample example, i have more that 20,000 rows in the table. I want to delete duplicate rows. if there is 3 duplicate records found, i want to delete only 2 records and keep the 1 record."

The idea of deleting two rows was just an example - there is no guarantee that only two records will need deleting (even per group). You posted no code or further details, so I can only assume that you completely misunderstood what was required.

Paul

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

Kingston Dhasian SSCoach Points: 19799 More actions · Answer 6

Trouble Shooter (4/12/2010)
Hi ,
Did u try it Mr. SSCertifiable ?

Look at this buddy

rajagopalanseeth (3/29/2010)
if there is 3 duplicate records found, i want to delete only 2 records and keep the 1 record. How to do it?

This is just an example and it is not going to be 2 duplicate rows always. Setting ROWCOUNT 2 will always delete 2 rows and select 2 rows as well. It was i think for this reason that your solution was not an optimal one.

Typed this when Paul was replying too. Anyways, the reason was the same:-)

Kingston Dhasian

How to post data/code on a forum to get the best help - Jeff Moden
http://www.sqlservercentral.com/articles/Best+Practices/61537/

Paul White SSC Guru Points: 150468 More actions · Answer 7

Kingston Dhasian (4/12/2010)
Typed this when Paul was replying too. Anyways, the reason was the same:-)

Yep. No worries!

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

Trouble Shooter SSC Eights! Points: 902 More actions · Answer 8

Hi,

2 is the variable only.

u can set any value there like 1 ,3 etc as per your requirement.

Kingston Dhasian SSCoach Points: 19799 More actions · Answer 9

Trouble Shooter (4/12/2010)
Hi,
2 is the variable only.
u can set any value there like 1 ,3 etc as per your requirement.

But, don't you think the method would be a bit tedious. The ROW_NUMBER() method is easier to use and is therefore a better method.

Did you know about the ROW_NUMBER() method? If you did, why do you think it is a better method? If you didn't, don't you think you have learnt something new?

Kingston Dhasian

How to post data/code on a forum to get the best help - Jeff Moden
http://www.sqlservercentral.com/articles/Best+Practices/61537/

Trouble Shooter SSC Eights! Points: 902 More actions · Answer 10

Hi,

Can u delete two duplicate row using row_number function

if yes ,please let me know

Kingston Dhasian SSCoach Points: 19799 More actions · Answer 11

Trouble Shooter (4/12/2010)
Hi,
Can u delete two duplicate row using row_number function
if yes ,please let me know

Yes, it does. It can delete any number of rows.

Try this

DECLARE@tblTable TABLE

(

Col1 INT

)

INSERT@tblTable

SELECT1 UNION ALL

SELECT2 UNION ALL

SELECT3 UNION ALL

SELECT4

; WITH cte_Table AS

(

SELECTROW_NUMBER() OVER( PARTITION BY Col1 ORDER BY Col1 ) Row_Num, *

FROM@tblTable

)

DELETE

FROMcte_Table

WHERERow_num > 1

SELECT * FROM @tblTable

Kingston Dhasian

How to post data/code on a forum to get the best help - Jeff Moden
http://www.sqlservercentral.com/articles/Best+Practices/61537/

Mitesh Oswal SSCrazy Points: 2798 More actions · Answer 12

DECLARE @tbl_Duplicate TABLE

(

ID INT

)

INSERT INTO @tbl_Duplicate VALUES(1),(2),(1),(3)

--Before

SELECT * FROM @tbl_Duplicate

;WITH CTE AS

(

SELECT

ROW_NUMBER() OVER(PARTITION BY ID ORDER BY ID) AS ROWID

FROM

@tbl_Duplicate

)

DELETE FROM CTE WHERE ROWID > 1

--AFTER

SELECT * FROM @tbl_Duplicate

Regards,
Mitesh OSwal
+918698619998

Paul White SSC Guru Points: 150468 More actions · Answer 13

Trouble Shooter (4/12/2010)
Hi,
Can u delete two duplicate row using row_number function
if yes ,please let me know

We can also do it with APPLY and TOP 😀

With ROWCOUNT, you'd need to use some horrible cursor or loop to delete a variable number of duplicates per group. The ROW_NUMBER ranking function (or APPLY/TOP) makes it easy - and fast!

Paul

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

sql_avid_fan Valued Member Points: 54 More actions · Answer 14

--delete from folders_rel

where exists (select b.id

from folders_rel a with(nolock)

inner join folders_rel b with(nolock) on b.folder_id = a.folder_id

and b.polymorphic_id = a.polymorphic_id

and b.id > a.id

where b.id = folders_rel.id)