remove duplicate records

Question

Post reply

remove duplicate records

aldero

SSC Rookie

Points: 48
More actions
February 19, 2004 at 7:06 am

#152048

Hi all,
I need te remove some records (about 5000) from a table.
The records look like below.
101740 Henk Krassenburg 3034 16112
101741 Jerry Kloot 3340 16111
101751 Jerry Kloot 3340 16111
101754 Jerry Kloot 3340 16111
101752 Jan van Harten 3350 16111
101742 Jan van Harten 3350 16111
101743 Cor Megens 3600 16111
101750 Cor Megens 3600 16111
101749 Janus van Haertens 655 16112
101753 Janus van Haertens 655 16112
We need to remove the records if the third column is duplicate and the first column has the highest number.
Can someone help me out here ?

Thanks in advance

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply

theogre SSC Enthusiast Points: 146 More actions · Answer 1

You have to use some kind temp table.

This should work, I tested it with a couple of records

Select Min(Column1) as Column1,Column3,Count(Column3) as Count

Into #Temp_Keep

from Table1

Group By Column3

Having Count(Column3)>1

Delete From Table1

Where Column1 Not In (Select Column1 from #Temp_Keep) and Column3 IN (Select Column3 from #Temp_Keep)

guarddata SSCertifiable Points: 6583 More actions · Answer 2

If you have some kind of index value in the row, you can remove them directly. For example, if there is an ID or an entry date associated with the row. Let's choose date:

DELETE FROM <table> WHERE EXISTS (SELECT * FROM <table> dupEnt WHERE dupEnt.RowDate > <table>.RowDate )

Guarddata-

YCH SSC Eights! Points: 892 More actions · Answer 3

First of all, let's assume that your first column has unique value.

Second, let's name the columns as Col1 through Col5.

DELETE FROM YourTable

FROM YourTable LEFT OUTER JOIN

(SELECT Col3, MAX(Col1) AS Col1 FROM YourTable GROUP BY Col3) U

ON

YourTable.Col1 = U.Col1

WHERE U.Col1 IS NULL

Good luck!

gljjr SSChampion Points: 10080 More actions · Answer 4

Give this a try...

SET NOCOUNT ON

IF EXISTS(SELECT * FROM tempdb..sysobjects WHERE id = object_id('tempdb..#Foo'))

DROP TABLE #Foo

CREATE TABLE #Foo

(

FLD1 int

, FLD2 nvarchar(25)

, FLD3 int

, FLD4 int

)

INSERT INTO #Foo

VALUES (101740, 'Henk Krassenburg', 3034, 16112)

INSERT INTO #Foo

VALUES (101741 ,'Jerry Kloot' ,3340 ,16111)

INSERT INTO #Foo

VALUES (101751 ,'Jerry Kloot' ,3340 ,16111)

INSERT INTO #Foo

VALUES (101754 ,'Jerry Kloot' ,3340 ,16111)

INSERT INTO #Foo

VALUES (101752 ,'Jan van Harten' ,3350 ,16111)

INSERT INTO #Foo

VALUES (101742 ,'Jan van Harten' ,3350 ,16111)

INSERT INTO #Foo

VALUES (101743 ,'Cor Megens' ,3600 ,16111)

INSERT INTO #Foo

VALUES (101750 ,'Cor Megens' ,3600 ,16111)

INSERT INTO #Foo

VALUES (101749 ,'Janus van Haertens' ,655 ,16112)

INSERT INTO #Foo

VALUES (101753 ,'Janus van Haertens' ,655 ,16112)

SELECT f.*

FROM #Foo f

DELETE f

FROM #Foo f

JOIN (SELECT FLD3 FROM #Foo GROUP BY FLD3 Having COUNT(*) > 1) Dupes ON f.FLD3 = Dupes.FLD3

WHERE f.FLD1 IN(SELECT MAX(FLD1) FROM #Foo WHERE FLD3 = Dupes.FLD3)

SELECT f.*

FROM #Foo f

Gary Johnson
Microsoft Natural Language Group
DBA, Sr. DB Engineer

This posting is provided "AS IS" with no warranties, and confers no rights. The opinions expressed in this post are my own and may not reflect that of my employer.

Calvin Lawson SSChampion Points: 11030 More actions · Answer 5

Gary's statement removes duplicates, but not triplicates or higher. It was unclear from your statement which behavior you wanted, but this will remove all but the record with the lowest FLD1

delete f1

From #Foo f0 (nolock)

JOIN #Foo f1 (nolock)

on f0.FLD3 = f1.FLD3

Where f0.FLD1 < F1.FLD1

Signature is NULL