August 12, 2004 at 2:50 pm
My table has duplicate SSNs. I want to keep only one entry per SSN and delete any duplicate entries. How can I do this? There are 9000 rows in the table.
Thanks
August 12, 2004 at 3:53 pm
Something like this..
-- Work table whose unique index silently throws away duplicate SSNs
CREATE TABLE SSNTemp (
SSN char (10) NOT NULL
)
CREATE UNIQUE INDEX [IX_SSNTemp] ON SSNTemp ([SSN])
WITH IGNORE_DUP_KEY
-- Load the SSNs; the IGNORE_DUP_KEY index discards the duplicates
insert into SSNTemp
select ssn from yourssntable
-- Reload the original table from the de-duplicated copy
delete from yourssntable
insert into yourssntable
select ssn from SSNTemp
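Note that this snippet only carries the SSN column back and forth, so it assumes SSN is the only column you need to keep. If you go this route, a quick check afterwards (using the same yourssntable name from the snippet above) should come back empty once the reload finishes:
-- Should return no rows if the duplicates are gone
select ssn, count(*) as cnt
from yourssntable
group by ssn
having count(*) > 1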
August 12, 2004 at 9:56 pm
Try this script in Query Analyzer:
BEGIN TRANSACTION T1
-- One row per SSN that appears more than once
SELECT SSN INTO TempTable FROM MainTable GROUP BY SSN HAVING COUNT(*) > 1
-- Remove every row for those SSNs, then put a single copy of each back
DELETE MainTable WHERE SSN IN (SELECT SSN FROM TempTable)
INSERT INTO MainTable (SSN) SELECT SSN FROM TempTable
DROP TABLE TempTable
COMMIT TRANSACTION T1
August 13, 2004 at 12:28 am
Hi,
See this article..
http://www.sqlservercentral.com/columnists/ccubley/findinganddeletingduplicatedata.asp
Cheerz
K.Senthil Kumar
August 13, 2004 at 4:59 am
--Step 1
--if your SSNtable does not already have a unique key, add one
Alter table SSNtable
add Unique_ID int identity
--Step 2
--for each duplicated SSN, record the one row to keep (the highest Unique_ID)
Select max(SSNtable.Unique_ID) as Unique_ID, SSNtable.SSN
into Temp_Duplicate_SSNtable
from SSNtable
inner join
(
select SSN, count(SSN) as CountSSN
from SSNtable
group by SSN
having count(SSN) > 1
) a on a.SSN = SSNtable.SSN
group by SSNtable.SSN
--Step 3
--delete every row for those SSNs except the one recorded in the temp table
Delete
from SSNtable
where SSN in (select SSN from Temp_Duplicate_SSNtable)
and Unique_ID not in (select Unique_ID from Temp_Duplicate_SSNtable)
--Step 4
Drop table Temp_Duplicate_SSNtable
--Step 5 this must give you no results
select count(SSN) as CountSSN, SSN
from SSNtable
group by SSN
having count(SSN) > 1
--You can drop the Unique_ID field, but I suggest you keep it and add a unique constraint on your SSN field
--so that no duplicates can be inserted in future
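For that last point, a minimal sketch of the suggested constraint (the constraint name here is just an example):
-- Stops duplicate SSNs from being inserted again
Alter table SSNtable
add constraint UQ_SSNtable_SSN unique (SSN)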
August 13, 2004 at 6:26 am
With so many requests for deleting duplicates coming to this forum, it would be nice if Microsoft provided a statement along the lines of: delete duplicate from <table> on <key>
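Later versions of SQL Server get fairly close to that with ROW_NUMBER() in a common table expression. A sketch, assuming SQL Server 2005 or later and the SSNtable name used earlier in the thread:
-- Number the rows within each SSN and delete everything past the first one
;with Ranked as (
    select SSN,
           row_number() over (partition by SSN order by SSN) as rn
    from SSNtable
)
delete from Ranked
where rn > 1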
August 13, 2004 at 9:19 am
DELETE MyTable
WHERE MyID NOT IN (
SELECT MAX (MyID)
FROM MyTable
GROUP BY SSN) --assuming MyID and SSN cannot be null
Keep in mind this will delete the older records and keep the newest. If you want to keep the oldest record instead, replace the inner SELECT MAX() with a SELECT MIN().
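For reference, a sketch of that keep-oldest variant:
-- Keeps the lowest MyID (oldest row) per SSN and deletes the rest
DELETE MyTable
WHERE MyID NOT IN (
SELECT MIN(MyID)
FROM MyTable
GROUP BY SSN)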
This technique essentially does the same thing as the solution in the link posted above, only with a lot less code. The sub-select is less efficient than a delete..join, but if you are only doing this as a one-time data cleanup…
Also, is the table in question acting as a parent table to other tables? If so you will need to consolidate the child records under the parent record you want to save. That is, unless your child records aren’t duplicates as well.
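A rough sketch of that consolidation step, assuming a hypothetical child table named ChildTable with a ParentID column pointing at MyTable.MyID (neither name comes from the original posts):
-- Repoint child rows at the surviving (highest MyID) parent before deleting the duplicates
UPDATE c
SET c.ParentID = k.KeepID
FROM ChildTable c
INNER JOIN MyTable m ON m.MyID = c.ParentID
INNER JOIN (SELECT SSN, MAX(MyID) AS KeepID
            FROM MyTable
            GROUP BY SSN) k ON k.SSN = m.SSN
WHERE c.ParentID <> k.KeepID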
Also, what process are you putting in place to see to it that this doesn’t continue to happen?
Corie Curcillo
MCT, MCDBA, MCSD
August 13, 2004 at 10:52 am
Thanks for the help.
August 13, 2004 at 1:36 pm
You can try this:
Create table #Test(Id int,dup_id int)
GO
Insert #Test
Select 1, 100 union all -- Multi duplicates
Select 2, 100 union all
Select 3, 100 union all
Select 4, 100 union all
Select 5, 200 union all -- Single
Select 6, 100 union all
Select 7, 300 union all
Select 8, 300 union all
Select 9, 400 union all
Select 10, 400
GO
-- Delete every row that has a lower-Id twin with the same dup_id,
-- i.e. keep only the lowest Id per dup_id
DELETE a
FROM #test a
INNER JOIN #test b ON a.dup_id = b.dup_id
AND a.id > b.id
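If you run the test batch above, the delete should leave one row per dup_id. A quick check and cleanup:
-- Expect Ids 1, 5, 7 and 9 to remain
Select * from #Test order by Id
GO
Drop table #Test
GO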