Updating duplicate records - based on criteria (Pretty Please, can someone help me - this is urgent!)

  • Please Please Please - Can anyone help me.

    I haven't been using SQL very long and this is driving me nuts!!:unsure:

    I have managed to extract a selection of data... this shows a number of duplicate records.

    What I would like to do is set up a flag called 'type' which will be populated by either 1 or 2 (based on the below criteria)

    SubjectUniqueIDDate1 Date2 Code

    1ABC12314/12/201116/12/2011H98

    1ABC23414/12/201116/12/2011H98

    2ABC34527/09/201106/12/2011C01

    2ABC45627/09/201106/12/2011C98

    3ABC56717/11/201108/12/2011H98

    3ABC67817/11/201115/12/2011H98

    -Where a subject has records that are identical - I would like the 1st record to be allocated type 1 and the others records allocated as 2

    -Where a subject has records that are identical but the 'Code' field is different - I would like the record with 'C01' to be allocated type 1 and the other records to be allocated as 2

    - Where a subject has duplicate records (Date1 is the same but date 2 isn't ) - I would like the 1st date to be allocated as type 1 otherwise 2

    This would be my desired outcome

    SubjectUniqueIDDate1 Date2 Code Type

    1ABC12314/12/201116/12/2011H981

    1ABC23414/12/201116/12/2011H982

    2ABC34527/09/201106/12/2011C011

    2ABC45627/09/201106/12/2011C982

    3ABC56717/11/201108/12/2011H981

    3ABC67817/11/201115/12/2011H982

    3 ABC789 17/11/2011 17/12/2011 H98 2

    Please could someone tell me what i have to do... better still provide me with some code

    This is urgent; so if anyone can help I'd really apprecaite it!

    Many Thanks

  • What's the criteria for "first row"?

    There isn't such a thing as order in a relational table.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • The 1st and 2nd line are duplicates (exact match) therefore I would like the 1st record to be allocated 1 and the following record as 2

    I have created a temp table ordering the data and its on this table that I would like to build in the rules.

    Is this possible to do?

  • Question like that should be posted in according with the "forum etiquette" (link at the bottom of my signature). Only 'cause you "first-timer" here:

    This how you should post "I have a table like that..." part:

    -- table setup DDL

    create table #table ([Subject] int

    ,UniqueID char(6)

    ,Date1 datetime

    ,Date2 datetime

    ,Codechar(3)

    )

    -- some data sample

    set dateformat dmy

    insert #table values

    (1,'ABC123', '14/12/2011', '16/12/2011', 'H98'),

    (1,'ABC234', '14/12/2011', '16/12/2011', 'H98'),

    (2,'ABC345', '27/09/2011', '06/12/2011', 'C01'),

    (2,'ABC456', '27/09/2011', '06/12/2011', 'C98'),

    (3,'ABC567', '17/11/2011', '08/12/2011', 'H98'),

    (3,'ABC678', '17/11/2011', '15/12/2011', 'H98')

    So, we can just cut-&-paste and execute it to have your case going...

    Now, what about if you have more than two duplicated records? Lets add some sample data:

    insert #table values

    (4,'ABC770', '14/12/2011', '16/12/2011', 'H98'),

    (4,'ABC771', '14/12/2011', '16/12/2011', 'H98'),

    (4,'ABC772', '14/12/2011', '16/12/2011', 'H98'),

    (5,'ABC880', '27/09/2011', '06/12/2011', 'C01'),

    (5,'ABC881', '27/09/2011', '06/12/2011', 'C98'),

    (5,'ABC882', '27/09/2011', '06/12/2011', 'C99'),

    (6,'ABC990', '17/11/2011', '08/12/2011', 'H98'),

    (6,'ABC991', '17/11/2011', '15/12/2011', 'H98'),

    (6,'ABC992', '17/11/2011', '25/12/2011', 'H98')

    To calculate "Type" based on your three rules we can use ROW_NUMBER,

    for each rule separately:

    ;with cte_num

    as

    (

    select [Subject], UniqueID, Date1, Date2, Code

    ,ROW_NUMBER() OVER (PARTITION BY [Subject], Date1, Date2, Code ORDER BY UniqueID) RN_identical

    ,ROW_NUMBER() OVER (PARTITION BY [Subject], Date1, Date2 ORDER BY UniqueID, Code) RN_difCode

    ,ROW_NUMBER() OVER (PARTITION BY [Subject], Date1 ORDER BY UniqueID, Date2, Code) RN_difDate2

    from #table

    )

    select [Subject], UniqueID, Date1, Date2, Code

    ,case when RN_identical >= RN_difCode and RN_identical >= RN_difDate2 then RN_identical

    when RN_difCode >= RN_difDate2 then RN_difCode

    else RN_difDate2

    end [Type]

    from cte_num

    order by [Subject], UniqueID

    But, if you analyse your three business rules a bit more, you can see that in reality you need only one, the last one! Just state the order of how "type" should be assigned, so you can get away with more simple query, calculating ROW_NUMBER just once:

    select [Subject], UniqueID, Date1, Date2, Code

    ,ROW_NUMBER() OVER (PARTITION BY [Subject], Date1 ORDER BY UniqueID, Date2, Code) [Type]

    from #table

    order by [Subject], UniqueID

    As per previous post, there is no "natural order" in SQL table, therefore I've used UniqueID for order, so in a "duplicate" pair, record with "lowest" UniqueID will be assigned to type 1.

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Thank you so much for your help!

    Enjoy the rest of the week 🙂

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply