August 12, 2004 at 2:50 pm
My table has duplicate SSNs. I want to keep only one entry per SSN and delete any duplicate entries. How can I do this? There are 9000 rows in the table.
Thanks
August 12, 2004 at 3:53 pm
Something like this..
-- Work table whose unique index silently throws away duplicate SSNs
CREATE TABLE SSNTemp (
SSN char (10) NOT NULL
)
CREATE UNIQUE INDEX [IX_SSNTemp] ON SSNTemp ([SSN])
WITH IGNORE_DUP_KEY
-- Load the SSNs; the IGNORE_DUP_KEY index discards the duplicates
insert into SSNTemp
select ssn from yourssntable
-- Reload the original table from the de-duplicated copy
delete from yourssntable
insert into yourssntable
select ssn from SSNTemp
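Note that this snippet only carries the SSN column back and forth, so it assumes SSN is the only column you need to keep. If you go this route, a quick check afterwards (using the same yourssntable name from the snippet above) should come back empty once the reload finishes:
-- Should return no rows if the duplicates are gone
select ssn, count(*) as cnt
from yourssntable
group by ssn
having count(*) > 1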
August 12, 2004 at 9:56 pm
Try this script in Query Analyzer:
BEGIN TRANSACTION T1
-- One row per SSN that appears more than once
SELECT SSN INTO TempTable FROM MainTable GROUP BY SSN HAVING COUNT(*) > 1
-- Remove every row for those SSNs, then put a single copy of each back
DELETE MainTable WHERE SSN IN (SELECT SSN FROM TempTable)
INSERT INTO MainTable (SSN) SELECT SSN FROM TempTable
DROP TABLE TempTable
COMMIT TRANSACTION T1
August 13, 2004 at 12:28 am
Hi,
See this article..
http://www.sqlservercentral.com/columnists/ccubley/findinganddeletingduplicatedata.asp
Cheerz
K.Senthil Kumar
August 13, 2004 at 4:59 am
--Step 1
--if your SSNtable does not already have a unique key, add one
Alter table SSNtable
add Unique_ID int identity
--Step 2
--for each duplicated SSN, record the one row to keep (the highest Unique_ID)
Select max(SSNtable.Unique_ID) as Unique_ID, SSNtable.SSN
into Temp_Duplicate_SSNtable
from SSNtable
inner join
(
select SSN, count(SSN) as CountSSN
from SSNtable
group by SSN
having count(SSN) > 1
) a on a.SSN = SSNtable.SSN
group by SSNtable.SSN
--Step 3
--delete every row for those SSNs except the one recorded in the temp table
Delete
from SSNtable
where SSN in (select SSN from Temp_Duplicate_SSNtable)
and Unique_ID not in (select Unique_ID from Temp_Duplicate_SSNtable)
--Step 4
Drop table Temp_Duplicate_SSNtable
--Step 5 this must give you no results
select count(SSN) as CountSSN, SSN
from SSNtable
group by SSN
having count(SSN) > 1
--You can drop the Unique_ID field, but I suggest you keep it and add a unique constraint on your SSN field
--so that no duplicates can be inserted in future
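For that last point, a minimal sketch of the suggested constraint (the constraint name here is just an example):
-- Stops duplicate SSNs from being inserted again
Alter table SSNtable
add constraint UQ_SSNtable_SSN unique (SSN)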
August 13, 2004 at 6:26 am
With so many requests for deleting duplicates coming to this forum, it would be nice if Microsoft provided a statement along the lines of: delete duplicate from <table> on <key>
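Later versions of SQL Server get fairly close to that with ROW_NUMBER() in a common table expression. A sketch, assuming SQL Server 2005 or later and the SSNtable name used earlier in the thread:
-- Number the rows within each SSN and delete everything past the first one
;with Ranked as (
    select SSN,
           row_number() over (partition by SSN order by SSN) as rn
    from SSNtable
)
delete from Ranked
where rn > 1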
August 13, 2004 at 9:19 am
DELETE MyTable
WHERE MyID NOT IN (
SELECT MAX (MyID)
FROM MyTable
GROUP BY SSN) --assuming MyID and SSN cannot be null
Keep in mind this will delete the older records and keep the newest. If you want to keep the oldest record instead, replace the inner SELECT MAX() with a SELECT MIN().
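For reference, a sketch of that keep-oldest variant:
-- Keeps the lowest MyID (oldest row) per SSN and deletes the rest
DELETE MyTable
WHERE MyID NOT IN (
SELECT MIN(MyID)
FROM MyTable
GROUP BY SSN)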
This technique essentially does the same thing as the solution in the link posted above, only with a lot less code. The sub-select is less efficient than a delete..join, but if you are only doing this as a one-time data cleanup…
Also, is the table in question acting as a parent table to other tables? If so you will need to consolidate the child records under the parent record you want to save. That is, unless your child records aren’t duplicates as well.
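A rough sketch of that consolidation step, assuming a hypothetical child table named ChildTable with a ParentID column pointing at MyTable.MyID (neither name comes from the original posts):
-- Repoint child rows at the surviving (highest MyID) parent before deleting the duplicates
UPDATE c
SET c.ParentID = k.KeepID
FROM ChildTable c
INNER JOIN MyTable m ON m.MyID = c.ParentID
INNER JOIN (SELECT SSN, MAX(MyID) AS KeepID
            FROM MyTable
            GROUP BY SSN) k ON k.SSN = m.SSN
WHERE c.ParentID <> k.KeepID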
Also, what process are you putting in place to see to it that this doesn’t continue to happen?
Corie Curcillo
MCT, MCDBA, MCSD
August 13, 2004 at 10:52 am
Thanks for the help.
August 13, 2004 at 1:36 pm
You can try this:
Create table #Test(Id int,dup_id int)
GO
Insert #Test
Select 1, 100 union all -- Multi duplicates
Select 2, 100 union all
Select 3, 100 union all
Select 4, 100 union all
Select 5, 200 union all -- Single
Select 6, 100 union all
Select 7, 300 union all
Select 8, 300 union all
Select 9, 400 union all
Select 10, 400
GO
-- Delete every row that has a lower-Id twin with the same dup_id,
-- i.e. keep only the lowest Id per dup_id
DELETE a
FROM #test a
INNER JOIN #test b ON a.dup_id = b.dup_id
AND a.id > b.id
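If you run the test batch above, the delete should leave one row per dup_id. A quick check and cleanup:
-- Expect Ids 1, 5, 7 and 9 to remain
Select * from #Test order by Id
GO
Drop table #Test
GO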