Remove Duplicates

  • Hi Guys,

    This should be simple but I've spent too much time on it and can't see the solution. Tried using ROW_NUMBER / OVER and the idea is to delete (row number) rn>1

    Current Code:

    ----------------

    DECLARE @data TABLE (Name VARCHAR(50) , InOut VARCHAR(50) ,Ord int)

    insert into @data select 'Bob','In',1

    insert into @data select 'Bob','Out',2

    insert into @data select 'Bob','In',3

    insert into @data select 'Bob','Out',4

    insert into @data select 'Bob','Out',5

    insert into @data select 'Sam','Out',1

    insert into @data select 'Sam','Out',2

    insert into @data select 'Sam','In',3

    insert into @data select 'Sam','In',4

    insert into @data select 'Sam','Out',5

    SELECT Row,Ord,DuplicateRow= CASE WHEN Name=Name

    THEN

    (

    SELECT COUNT(*) FROM (SELECT ROW_NUMBER() OVER ( ORDER BY Name,Ord) AS ROW,

    Name,InOut FROM @data

    ) AS A

    WHERE A.Name=B.Name AND A.InOut=B.InOut AND

    A.ROW<B.ROW)+1 END,Name,InOut FROM (SELECT ROW_NUMBER() OVER (ORDER

    BY Name,Ord) AS ROW,

    Name,InOut,Ord FROM @data

    ) AS B

    -------------------

    Current Results:

    RowOrdDuplicateRowNameInOut

    1 1 1 BobIn

    2 2 1 BobOut

    3 3 2 BobIn

    4 4 2 BobOut

    5 5 3 BobOut

    6 1 1 SamOut

    7 2 2 SamOut

    8 3 1 SamIn

    9 4 2 SamIn

    10 5 3 SamOut

    Desired Results:

    RowOrdDuplicateRowNameInOut

    111 BobIn

    221 BobOut

    331 BobIn

    441 BobOut

    552 BobOut

    611 SamOut

    722 SamOut

    831 SamIn

    942 SamIn

    1051 SamOut

    Your help appreciated.

  • Note sure that I understand. You talk about deleting duplicate rows, and in that case the solution would be:

    ;WITH numbered AS (

    SELECT row_number() OVER (PARTITION BY Name, InOut ORDER BY Ord) AS rowno

    FROM @data

    )

    DELETE numbered WHERE rowno > 1

    But your desired result is different. While it is great that you included CREATE TABLE and INSERT statements with sample data as well as the desired result, you failed to include a brief description of the business rules. What does the row Duplicate in the desired result signify?

    [font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]

  • Hi Erland,

    My apologies for not being clearer about the current / desired business logic.

    Current Business Logic: 'Ord' is the sequence of events for the 'Name' & 'InOut' fields. The DuplicateRow value for what's listed below only considers 'Name' & 'InOut' fields and therefore Row #5 has a field DuplicateRow value of 3 because it is the third occurance of 'Out' for 'Bob'

    Row Ord DuplicateRow Name InOut

    1 1 1 Bob In

    2 2 1 Bob Out

    3 3 2 Bob In

    4 4 2 Bob Out

    5 5 3 Bob Out

    6 1 1 Sam Out

    7 2 2 Sam Out

    8 3 1 Sam In

    9 4 2 Sam In

    10 5 3 Sam Out

    Desired Business Logic: Row #5 field 'DuplicateRow' value is 2 because it is the second occurance of 'Bob' & "Out in sequence, using the field 'Ord' as the sequence of events for the 'Name' & 'InOut' fields. The same applies to Row # 9 where 'Sam' & 'In' is the second occurance in sequence.

    Row Ord DuplicateRow Name InOut

    1 1 1 Bob In

    2 2 1 Bob Out

    3 3 1 Bob In

    4 4 1 Bob Out

    5 5 2 Bob Out

    6 1 1 Sam Out

    7 2 2 Sam Out

    8 3 1 Sam In

    9 4 2 Sam In

    10 5 1 Sam Out

    Hopefully that makes the business logic clearer.Sorry if the data provided is a bit hard to read.

    Nathan

  • Per your last post, it seems like you're saying that you want to remove rows where Ord value = 2, but that is so simple you must mean something else. Having a hard time understanding how you're defining 'duplicate'.

    --Quote me

  • I think this gives the results that you want. Then it's just a matter of wrapping it up in a delete

    select row_number() OVER (order by a.name, a.ord) Row,

    a.Ord,

    CASE When a.inout = b.inout then 2 else 1 end,

    a.name,

    a.inout

    from @data a

    left outer join @data b on a.name = b.name and a.ord = b.ord + 1

    order by 1

  • Nice work MickyT. That does the job!

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply