Query help - update duplicate rows

  • Hi,

    We have this table {uniqueid, userid, reportname}. A user can have more than one reportname. The reportname should be unique for each user, but this rule was not enforced before. So we have some users with same reportname (but unique uniqueid) appearing more than once.

    I want to select these users with duplicate reportnames and update their reportname to be reportname + uniqueid. This will make all users have unique reportnames. This is what I have so far, I am not sure how efficient this is. I can see this will also update some additional/unwanted reportnames. Any suggestion on how this query can be improved?

    UPDATE userreports SET reportname = reportname + '-' + CAST(uniqueid AS VARCHAR(5))

    WHERE userid IN (SELECT userid FROM userreports GROUP BY userid, reportname HAVING COUNT(*) > 1)

    AND reportname IN (SELECT reportname FROM userreports GROUP BY userid, reportname HAVING COUNT(*) > 1)

    Thanks.

    uniqueid userid reportname

    -----------------------------

    12345 100 REPORT1

    12346 100 REPORT2

    12347 100 REPORT2

    12348 101 REPORT1

    12349 101 REPORT1

    12350 101 REPORT2

    The updated result should be like this

    uniqueid userid reportname

    -----------------------------

    12345 100 REPORT1

    12346 100 REPORT2-12346

    12347 100 REPORT2-12347

    12348 101 REPORT1-12348

    12349 101 REPORT1-12349

    12350 101 REPORT2

  • Why you don't update 12345 and 12350?

    What makes them different?

    _____________
    Code for TallyGenerator

  • Hi ,

    You can do this by joining to a cte. Note that you need to join on both reportName and UserID otherwise you will run into trouble...

    Firstly run the following to see the potential changes

    create table # (uniqueid int, userid int, reportname varchar(128))

    insert # select 12345, 100, 'REPORT1'

    union all select 12346, 100, 'REPORT2'

    union all select 12347, 100, 'REPORT2'

    union all select 12348, 101, 'REPORT1'

    union all select 12349, 101, 'REPORT1'

    union all select 12350, 101, 'REPORT2'

    go

    with cte

    as

    (select userid, reportname from #

    group by userid, reportname

    having count(uniqueid) >1)

    select * , #.reportname + cast(uniqueid as varchar(16)) as newName

    from cte inner join # on cte.userid = #.userid and cte.reportname = #.reportname

    If that looks fine then run this to update....

    with cte

    as

    (select userid, reportname from #

    group by userid, reportname

    having count(uniqueid) >1)

    update #

    set #.reportname = #.reportname + cast(uniqueid as varchar(16))

    from cte inner join # on cte.userid = #.userid and cte.reportname = #.reportname

    Bevan

  • CTE is just "C like" replacement for derived table.

    UPDATE R

    SET reportname = reportname + '-' + CAST(uniqueid AS VARCHAR(5))

    FROM dbo.userreports R

    INNER JOIN (SELECT userid, reportname

    FROM dbo.userreports

    GROUP BY userid, reportname

    HAVING COUNT(*) > 1) DT ON R.reportname = DT.reportname AND R.userid = DT.userid

    But this will update all records having duplicated reportname.

    I could exclude "first" or "last" ones if I'd know why 12345 should not be updated and 12346 should be.

    _____________
    Code for TallyGenerator

  • I am sorry if I didn't explain myself clearly. The user 100 has two reports by same name - REPORT2. The id is unique, but we still would like the report name to be unique too. Similarly userid 101 has two reports - REPORT1, which should be updated to be unique.

    Id 12345 (for 100) and 12350 (for 101) have unique reportname for that user.

    The above query does work like I wanted. It does not update the record 12345 and 12350. Thanks.

  • Senthil T (2/26/2009)


    Hi,

    We have this table {uniqueid, userid, reportname}. A user can have more than one reportname. The reportname should be unique for each user, but this rule was not enforced before. So we have some users with same reportname (but unique uniqueid) appearing more than once.

    For better answers quicker on your next post, please read and heed the article at the link in my signature line below.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Senthil T (2/26/2009)


    Id 12345 (for 100) and 12350 (for 101) have unique reportname for that user.

    The above query does work like I wanted. It does not update the record 12345 and 12350. Thanks.

    Oh, I see what I missed in your data.

    But in my query I followed the logic - and (Oh, miracle! :hehe:) it worked correctly.

    _____________
    Code for TallyGenerator

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply