October 20, 2009 at 11:27 am
I am trying to delete duplicate records in a table. Please see below for what the records look like:
Load Number Event Code Event Description
123456 34 order placed
123457 22 back order
123456 34 order placed
123458 11 discontinued item
123457 22 back order
I have had trouble finding a query to just remove the duplicates. Everything is driven off the Load Number in this table.
Thanks as always.:-D
October 20, 2009 at 11:37 am
Hi,
have you any criteria for which record to keep?
Like is there a date field you haven't mentioned and you want to keep the most recent record, or do you just want to remove the dupe, i.e. they are true dupes with no fields different?
Thanks
Allister
October 20, 2009 at 11:39 am
The best method I know for deduping looks like this:
;with CTE as
(select row_number() over (partition by [Load Number], [Event Code] order by 1) as Row
from dbo.MyTable)
delete from CTE
where Row > 1;
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
October 20, 2009 at 11:41 am
Cool, that's pretty 🙂
October 20, 2009 at 11:48 am
///Edit - wrong thread
October 20, 2009 at 12:14 pm
I just want to keep only unique Load Numbers.
current rows (duplicates)
Load number Event Code Event Desc
123456 11 on back order
123457 22 discontinued item
123456 11 on back order
123458 33 shipped
I only want
desired rows (distinct)
Load number Event Code Event Desc
123456 11 on back order
123457 22 discontinued item
123458 33 shipped
October 20, 2009 at 12:16 pm
Just change the row_number() partition and order by in my query to suit your needs.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
October 20, 2009 at 12:21 pm
GSquared,
I had to update CTE slightly to get it to work on my system?
;with CTE as
(select row_number() over
(partition by [Load Number], [Event Code], [Event Description]
order by (select 1)) as Row
from @temp)
delete from CTE
where Row > 1;
October 20, 2009 at 12:23 pm
GSquared
thanks, one more question what does the CTE represent?
October 20, 2009 at 2:46 pm
jrw39 (10/20/2009)
GSquaredthanks, one more question what does the CTE represent?
CTE is Common Table Expression. You can look it up in Books Online for further information, but generally it can be used as a type of inline view. Gus's query could be rewritten using a derived table and would work exactly the same.
The trick with this query is using ROW_NUMBER() to generate the row numbers for each partition based upon the order you define. Then, once you have that - you can delete all rows that have a row number greater than 1.
There are other uses for CTE's based on recursion - but that is another topic altogether.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply