Label Partitioning Data based on matching attributes

  • After a night of sleep I think I understand what you mean: You want all the rows for which at least one duplicate exists, and leave off any rows for which no duplicate exists. Your test data did not have any rows for which no duplicate exists, so I've added one row, a row with ItemNum 11 and attributes 'ABC'. This row will not be in the results but for all other 10 rows at least one duplicate exists, so all 10 will be returned.

    Is this what you are looking for?

    insert into T1

    values

    (1,'AAA','BBB','CCC'),

    (2,'DDD','EEE','FFF'),

    (3,'GGG','HHH','III'),

    (4,'AAA','BBB','CCC'),

    (5,'DDD','EEE','FFF'),

    (6,'GGG','HHH','III'),

    (7,'AAA','BBB','CCC'),

    (8,'DDD','EEE','FFF'),

    (9,'JJJ','KKK','LLL'),

    (10,'JJJ','KKK','LLL'),

    (11,'ABC','ABC','ABC')

    select Category =

    'Group' + CHAR(64 +

    DENSE_RANK() OVER(

    ORDER BY Attribute1,Attribute2,Attribute3)),

    t.*

    from T1 t

    where exists (

    select *

    from T1 a

    where t.Attribute1 = a.Attribute1

    and t.Attribute2 = a.Attribute2

    and t.Attribute3 = a.Attribute3

    and t.ItemNum <> a.ItemNum

    )

    This outputs:

    Category ItemNum Attribute1 Attribute2 Attribute3

    -------- ----------- ---------- ---------- ----------

    GroupA 1 AAA BBB CCC

    GroupA 4 AAA BBB CCC

    GroupA 7 AAA BBB CCC

    GroupB 8 DDD EEE FFF

    GroupB 5 DDD EEE FFF

    GroupB 2 DDD EEE FFF

    GroupC 3 GGG HHH III

    GroupC 6 GGG HHH III

    GroupD 9 JJJ KKK LLL

    GroupD 10 JJJ KKK LLL

    (10 row(s) affected)



    Posting Data Etiquette - Jeff Moden[/url]
    Posting Performance Based Questions - Gail Shaw[/url]
    Hidden RBAR - Jeff Moden[/url]
    Cross Tabs and Pivots - Jeff Moden[/url]
    Catch-all queries - Gail Shaw[/url]


    If you don't have time to do it right, when will you have time to do it over?

  • R.P.Rozema (2/15/2012)


    Combine the both solutions. Take Paul's DENSE_RANK() to generate a unique name per 'group' and add another column, using my ROW_NUMBER() suggestion to give each individual member of the group a number from 1 up to the number of members. Does that answer your questions?

    Instead of using ROW_NUMBER(), you should be using COUNT(*) OVER( PARTITION BY <your fields> )

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

Viewing 2 posts - 16 through 16 (of 16 total)

You must be logged in to reply to this topic. Login to reply