February 19, 2004 at 7:06 am
Hi all,
I need te remove some records (about 5000) from a table.
The records look like below.
101740 Henk Krassenburg 3034 16112
101741 Jerry Kloot 3340 16111
101751 Jerry Kloot 3340 16111
101754 Jerry Kloot 3340 16111
101752 Jan van Harten 3350 16111
101742 Jan van Harten 3350 16111
101743 Cor Megens 3600 16111
101750 Cor Megens 3600 16111
101749 Janus van Haertens 655 16112
101753 Janus van Haertens 655 16112
We need to remove the records if the third column is duplicate and the first column has the highest number.
Can someone help me out here ?
Thanks in advance
February 19, 2004 at 8:48 am
You have to use some kind temp table.
This should work, I tested it with a couple of records
Select Min(Column1) as Column1,Column3,Count(Column3) as Count
Into #Temp_Keep
from Table1
Group By Column3
Having Count(Column3)>1
Delete From Table1
Where Column1 Not In (Select Column1 from #Temp_Keep) and Column3 IN (Select Column3 from #Temp_Keep)
February 19, 2004 at 9:04 am
If you have some kind of index value in the row, you can remove them directly. For example, if there is an ID or an entry date associated with the row. Let's choose date:
DELETE FROM <table> WHERE EXISTS (SELECT * FROM <table> dupEnt WHERE dupEnt.RowDate > <table>.RowDate )
Guarddata-
February 19, 2004 at 1:39 pm
First of all, let's assume that your first column has unique value.
Second, let's name the columns as Col1 through Col5.
DELETE FROM YourTable
FROM YourTable LEFT OUTER JOIN
(SELECT Col3, MAX(Col1) AS Col1 FROM YourTable GROUP BY Col3) U
ON
YourTable.Col1 = U.Col1
WHERE U.Col1 IS NULL
Good luck!
February 19, 2004 at 3:15 pm
Give this a try...
SET NOCOUNT ON
IF EXISTS(SELECT * FROM tempdb..sysobjects WHERE id = object_id('tempdb..#Foo'))
DROP TABLE #Foo
CREATE TABLE #Foo
(
FLD1 int
, FLD2 nvarchar(25)
, FLD3 int
, FLD4 int
)
INSERT INTO #Foo
VALUES (101740, 'Henk Krassenburg', 3034, 16112)
INSERT INTO #Foo
VALUES (101741 ,'Jerry Kloot' ,3340 ,16111)
INSERT INTO #Foo
VALUES (101751 ,'Jerry Kloot' ,3340 ,16111)
INSERT INTO #Foo
VALUES (101754 ,'Jerry Kloot' ,3340 ,16111)
INSERT INTO #Foo
VALUES (101752 ,'Jan van Harten' ,3350 ,16111)
INSERT INTO #Foo
VALUES (101742 ,'Jan van Harten' ,3350 ,16111)
INSERT INTO #Foo
VALUES (101743 ,'Cor Megens' ,3600 ,16111)
INSERT INTO #Foo
VALUES (101750 ,'Cor Megens' ,3600 ,16111)
INSERT INTO #Foo
VALUES (101749 ,'Janus van Haertens' ,655 ,16112)
INSERT INTO #Foo
VALUES (101753 ,'Janus van Haertens' ,655 ,16112)
SELECT f.*
FROM #Foo f
DELETE f
FROM #Foo f
JOIN (SELECT FLD3 FROM #Foo GROUP BY FLD3 Having COUNT(*) > 1) Dupes ON f.FLD3 = Dupes.FLD3
WHERE f.FLD1 IN(SELECT MAX(FLD1) FROM #Foo WHERE FLD3 = Dupes.FLD3)
SELECT f.*
FROM #Foo f
Gary Johnson
Microsoft Natural Language Group
DBA, Sr. DB Engineer
This posting is provided "AS IS" with no warranties, and confers no rights. The opinions expressed in this post are my own and may not reflect that of my employer.
February 20, 2004 at 6:37 pm
Gary's statement removes duplicates, but not triplicates or higher. It was unclear from your statement which behavior you wanted, but this will remove all but the record with the lowest FLD1
delete f1
From #Foo f0 (nolock)
JOIN #Foo f1 (nolock)
on f0.FLD3 = f1.FLD3
Where f0.FLD1 < F1.FLD1
Signature is NULL
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply