May 18, 2006 at 1:20 pm
Cleanest and easiest way to clean up duplicates that I have seen.
March 11, 2007 at 10:37 pm
A very good and helpful topic
March 11, 2007 at 11:19 pm
Hi Ramakrishnan,
The script you provided was quite useful, but I have some reservations about this method. I had drawn up a similar script, but found that as the number of columns in the table grows, you have to group by all of them. Is there any other method to delete the duplicate rows, other than taking the data into a temp table and then fetching it back into the physical table?
March 12, 2007 at 7:00 am
My method for duplicate removal may not suit everyone. It offers the advantage that, if you have duplicate data but an identity column makes each row unique, you can modify the SELECT * to list only the duplicated fields.
begin tran
Select Distinct * into [newtable] from [badtable]
truncate table [badtable]
insert into badtable select * from newtable
drop table [newtable]
-- commit once the row counts have been verified
commit tran
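A sketch of that variation, assuming a hypothetical [badtable] where an identity column id makes otherwise-duplicate rows unique and col1, col2 are the duplicated fields:
begin tran
-- keep one copy of each distinct (col1, col2) combination, ignoring the identity column
select col1, col2 into [newtable] from [badtable] group by col1, col2
truncate table [badtable]
-- the identity column is reseeded and repopulated automatically on re-insert
insert into badtable (col1, col2) select col1, col2 from newtable
drop table [newtable]
commit tran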
ColinR
March 12, 2007 at 9:45 am
Here's another method. This one adds an identity column, uses it, and then drops it.
ALTER TABLE dbo.employee ADD tempid INT IDENTITY(1, 1)
DELETE dbo.employee WHERE tempid NOT IN (SELECT MIN(tempid) FROM dbo.employee GROUP BY id, NAME, salary)
ALTER TABLE dbo.employee DROP COLUMN tempid
Ryan Randall
Solutions are easy. Understanding the problem, now, that's the hard part.
April 16, 2007 at 3:29 am
Or, if you use SQL Server 2005,
DELETE t1
FROM (
    SELECT ROW_NUMBER() OVER (PARTITION BY Col1 ORDER BY Col2 DESC) AS RecID
    FROM Table1
) AS t1
WHERE RecID > 1
N 56°04'39.16"
E 12°55'05.25"
October 4, 2007 at 9:47 am
My question is this....
Please correct me if I'm wrong, or how to get around the issue, but how about a real-life scenario that I have:
I have been given a logical definition of a duplicate row as a composite of 4 columns. However, there are actually 10 columns in the table. The data has duplicates on the logical composite key, but a row is not necessarily a duplicate if you include the other data fields in the GROUP BY. If I want to copy the data to the "temp" table, I have to include all the columns in the GROUP BY or I will lose the other data; but in my case I can't do that, or I will miss some duplicates.
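One way around this, using the ROW_NUMBER pattern posted earlier in the thread: only the four logical key columns go into the PARTITION BY, so the remaining six columns never have to appear in a GROUP BY and are carried along untouched. A sketch, with hypothetical key columns key1 through key4 and table name dbo.MyTable:
DELETE d
FROM (
    -- number the rows within each logical key; everything after the first is a duplicate
    SELECT ROW_NUMBER() OVER (PARTITION BY key1, key2, key3, key4 ORDER BY key1) AS RecID
    FROM dbo.MyTable
) AS d
WHERE RecID > 1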
June 13, 2008 at 2:08 am
Sorry, I read on MSDN that 'Using SET ROWCOUNT will not affect DELETE, INSERT, and UPDATE statements in the next release of SQL Server'. So this method will not work any more, will it?
Thanks
Gio
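For reference, the documented replacement for SET ROWCOUNT with data-modification statements is the TOP clause on the statement itself, so the approach can be kept with a small change. A minimal sketch, reusing the dbo.employee table from earlier in this thread with hypothetical duplicated values:
-- Where the old approach was SET ROWCOUNT 2 followed by a plain DELETE,
-- the row limit now goes on the DELETE itself (SQL Server 2005 and later)
DELETE TOP (2) FROM dbo.employee
WHERE id = 1 AND NAME = 'Joe' AND salary = 1000 -- leaves one copy of a row that exists three times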
August 25, 2008 at 7:03 am
Hi,
I have tried duplicate record deletion without using temp tables.
In my opinion it is faster, and it is a very short query.
CREATE TABLE EMP
(EMPID INT,EMPNAME VARCHAR(100))
INSERT INTO EMP VALUES (1,'A')
INSERT INTO EMP VALUES (1,'A')
INSERT INTO EMP VALUES (1,'A')
INSERT INTO EMP VALUES (2,'B')
INSERT INTO EMP VALUES (2,'B')
INSERT INTO EMP VALUES (3,'C')
INSERT INTO EMP VALUES (4,'D')
INSERT INTO EMP VALUES (4,'D')
INSERT INTO EMP VALUES (4,'D')
INSERT INTO EMP VALUES (4,'D')
INSERT INTO EMP VALUES (5,'E')
INSERT INTO EMP VALUES (5,'E')
INSERT INTO EMP VALUES (6,'F')
INSERT INTO EMP VALUES (6,'F')
INSERT INTO EMP VALUES (6,'F')
INSERT INTO EMP VALUES (6,'F')
INSERT INTO EMP VALUES (6,'F')
INSERT INTO EMP VALUES (7,'G')
SELECT * FROM EMP
SELECT @@ROWCOUNT
--(Initially the rowcount must be > 0 )
WHILE @@ROWCOUNT > 0
DELETE TOP(1) FROM EMP WHERE EXISTS
(SELECT E.EMPID ,COUNT(E.EMPID) FROM EMP E
WHERE EMP.EMPID=E.EMPID GROUP BY E.EMPID HAVING COUNT(E.EMPID) > 1)
SELECT * FROM EMP
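For larger tables, the same cleanup can also be done in one set-based statement with the ROW_NUMBER pattern from earlier in the thread; a sketch against this EMP table (SQL Server 2005 and later):
DELETE d
FROM (
    -- number the rows within each (EMPID, EMPNAME) group; everything after the first is a duplicate
    SELECT ROW_NUMBER() OVER (PARTITION BY EMPID, EMPNAME ORDER BY EMPID) AS RecID
    FROM EMP
) AS d
WHERE RecID > 1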
Correct me if I'm wrong...
:smooooth:
August 25, 2008 at 7:27 am
Hi Peso,
I tried your code, but it did not work.
Are you sure it works?
Thanks
November 24, 2008 at 9:44 am
Hi All,
I have a problem that everybody is discussing.
I need to find the duplicate records in the table. It's not a matter of just one column.
create table #temp
(ddatetime datetime ,
col1 int ,
col2 int )
Insert into #temp
values ('2001-11-12 13:29:00.000' , 101 , 15)
Insert into #temp
values ('2001-11-12 13:45:00.000' , 102 , 15)
Insert into #temp
values ('2001-11-12 13:50:49.000' , 101 , 15) -- duplicate
Insert into #temp
values ('2001-11-12 13:50:49.000' , 101 , 15) -- duplicate
Insert into #temp
values ('2001-11-12 14:00:49.000' , 101 , 15)
Insert into #temp
values ('2001-11-12 14:00:49.000' , 102 , 15)
Insert into #temp
values ('2001-11-12 14:00:49.000' , 103 , 15)
I have to find the duplicates based on the combination of all three columns.
Please help
November 24, 2008 at 12:46 pm
Ajay... Yes it does work.
DELETE t1
FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY Col1 ORDER BY Col2 DESC) AS RecID
FROM Table1
) AS t1
WHERE RecID > 1
N 56°04'39.16"
E 12°55'05.25"
November 24, 2008 at 12:48 pm
harsha.bhagat1
SELECT ddatetime, col1, col2
FROM (select ddatetime, col1, col2, row_number() over (partition by ddatetime, col1, col2 order by ddatetime) as recid
from table1
) as d
where recid > 1
N 56°04'39.16"
E 12°55'05.25"
November 25, 2008 at 2:09 am
Thank you...