Remove Duplicates

Question

Post reply

Remove Duplicates

Nathan Chapman

SSChasing Mays

Points: 620
More actions
August 15, 2013 at 1:38 pm

#281178

Hi Guys,
This should be simple but I've spent too much time on it and can't see the solution. Tried using ROW_NUMBER / OVER and the idea is to delete (row number) rn>1
Current Code:
----------------
DECLARE @data TABLE (Name VARCHAR(50) , InOut VARCHAR(50) ,Ord int)
insert into @data select 'Bob','In',1
insert into @data select 'Bob','Out',2
insert into @data select 'Bob','In',3
insert into @data select 'Bob','Out',4
insert into @data select 'Bob','Out',5
insert into @data select 'Sam','Out',1
insert into @data select 'Sam','Out',2
insert into @data select 'Sam','In',3
insert into @data select 'Sam','In',4
insert into @data select 'Sam','Out',5
SELECT Row,Ord,DuplicateRow= CASE WHEN Name=Name
THEN
(
SELECT COUNT(*) FROM (SELECT ROW_NUMBER() OVER ( ORDER BY Name,Ord) AS ROW,
Name,InOut FROM @data
) AS A
WHERE A.Name=B.Name AND A.InOut=B.InOut AND
A.ROW<B.ROW)+1 END,Name,InOut FROM (SELECT ROW_NUMBER() OVER (ORDER
BY Name,Ord) AS ROW,
Name,InOut,Ord FROM @data
) AS B
-------------------
Current Results:
RowOrdDuplicateRowNameInOut
1 1 1 BobIn
2 2 1 BobOut
3 3 2 BobIn
4 4 2 BobOut
5 5 3 BobOut
6 1 1 SamOut
7 2 2 SamOut
8 3 1 SamIn
9 4 2 SamIn
10 5 3 SamOut
Desired Results:
RowOrdDuplicateRowNameInOut
111 BobIn
221 BobOut
331 BobIn
441 BobOut
552 BobOut
611 SamOut
722 SamOut
831 SamIn
942 SamIn
1051 SamOut
Your help appreciated.

Viewing 6 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

Erland Sommarskog SSC-Insane Points: 24038 More actions · Answer 1

Note sure that I understand. You talk about deleting duplicate rows, and in that case the solution would be:

;WITH numbered AS (

SELECT row_number() OVER (PARTITION BY Name, InOut ORDER BY Ord) AS rowno

FROM @data

)

DELETE numbered WHERE rowno > 1

But your desired result is different. While it is great that you included CREATE TABLE and INSERT statements with sample data as well as the desired result, you failed to include a brief description of the business rules. What does the row Duplicate in the desired result signify?

[font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]

Nathan Chapman SSChasing Mays Points: 620 More actions · Answer 2

Hi Erland,

My apologies for not being clearer about the current / desired business logic.

Current Business Logic: 'Ord' is the sequence of events for the 'Name' & 'InOut' fields. The DuplicateRow value for what's listed below only considers 'Name' & 'InOut' fields and therefore Row #5 has a field DuplicateRow value of 3 because it is the third occurance of 'Out' for 'Bob'

Row Ord DuplicateRow Name InOut

1 1 1 Bob In

2 2 1 Bob Out

3 3 2 Bob In

4 4 2 Bob Out

5 5 3 Bob Out

6 1 1 Sam Out

7 2 2 Sam Out

8 3 1 Sam In

9 4 2 Sam In

10 5 3 Sam Out

Desired Business Logic: Row #5 field 'DuplicateRow' value is 2 because it is the second occurance of 'Bob' & "Out in sequence, using the field 'Ord' as the sequence of events for the 'Name' & 'InOut' fields. The same applies to Row # 9 where 'Sam' & 'In' is the second occurance in sequence.

Row Ord DuplicateRow Name InOut

1 1 1 Bob In

2 2 1 Bob Out

3 3 1 Bob In

4 4 1 Bob Out

5 5 2 Bob Out

6 1 1 Sam Out

7 2 2 Sam Out

8 3 1 Sam In

9 4 2 Sam In

10 5 1 Sam Out

Hopefully that makes the business logic clearer.Sorry if the data provided is a bit hard to read.

Nathan

polkadot SSChampion Points: 13664 More actions · Answer 3

Per your last post, it seems like you're saying that you want to remove rows where Ord value = 2, but that is so simple you must mean something else. Having a hard time understanding how you're defining 'duplicate'.

--Quote me

mickyT SSChampion Points: 10360 More actions · Answer 4

I think this gives the results that you want. Then it's just a matter of wrapping it up in a delete

select row_number() OVER (order by a.name, a.ord) Row,

a.Ord,

CASE When a.inout = b.inout then 2 else 1 end,

a.name,

a.inout

from @Data a

left outer join @Data b on a.name = b.name and a.ord = b.ord + 1

order by 1

Nathan Chapman SSChasing Mays Points: 620 More actions · Answer 5

Nathan Chapman

SSChasing Mays

Points: 620

August 15, 2013 at 8:31 pm

#1641969

Nice work MickyT. That does the job!