Filtering the duplicated records in a table!

  • Hi all,

    First time, I have dealt with filtering the duplicated records. Please help me to solve it in easy way.

    My table name: RECORD_DUP and the sample data as bellowed

    ID |TYPE |CODE | EST_COST | TOT_COST | DATE

    =====================================================

    100 |G |3 | 0000000 | 00000000 | 1/1/2007

    100 |E |2 | 0123456| 00000234 | 7/7/2002

    107 |E |1 | 0000000 | 00000000 | 2/2/2000

    107 |G |1 | 4567890| 00000987 | 3/5/1999

    109 |N |2 | 0000000| 00000000 | 2/9/2001

    109 |G |3 | 0000345| 00987651 | 3/7/2003

    I need to filter against the following:

    1. If two duplicated ID records got CODE IN ('2','3'). Then, pick the only records CODE = '3' and also update the values of EST_COST, TOT_COST if it has the value = '0000000' from the eliminate record.

    2. If two duplicated ID records got same CODE = '1'. Then, pick the only records having latest DATE and also update the EST_COST, TOT_COST the same way on first condition

    The result data table 'RECORD_NODUP'

    ID |TYPE |CODE | EST_COST | TOT_COST | DATE

    =====================================================

    100 |G |3 | 0123456 | 00000234 | 1/1/2007

    107 |E |1 | 4567890 | 00000987 | 2/2/2000

    109 |G |3 | 0000345| 00987651 | 3/7/2003

    I am not sure how to check the value inside the record to match with criteria.

    Any TSQL to help me out this matter is much appreciated. Thanks in advance.

  • See attached script, which contains one way of doing it

  • Hello Rajesh,

    Thanks for your code genius . It works perfectly. I really appreciate it.

  • Hello all,

    I have some questions back on the code that you helped me, Rajesh. Your code have solved my questions on how to filter the duplicated records, but not stored the result in a new table. Please tell me how I can store the result data that show on SQL Query Analyzer in new table? Your feedback is appreciated. Thanks

  • You can either first create a table and do a

    insert into (cols) select... or select into * from (select)

    --uncomment following line if executing second time in same sql query window

    --drop table #tmpREC_DUP

    --drop table #tblDuplicate

    select id,count(*) Rows

    into #tmpREC_DUP

    from REC_DUP

    group by id

    order by id

    --Rows with duplicates 2,3

    Select *

    into #tblDuplicate

    from

    (select b.id, b.code,CASE

    WHEN b.est_cost = 0 THEN c.est_cost

    ELSE b.est_cost

    END est_cost,

    CASE

    WHEN b.tot_cost ='0000000' THEN c.tot_cost

    ELSE b.tot_cost

    END tot_cost,

    b.date

    from #tmpREC_DUP a

    join REC_DUP b

    on a.id = b.id

    join REC_DUP c

    on a.id = c.id

    where a.rows = 2

    and b.code = 3

    and c.code = 2

    and exists(select 1

    from REC_DUP d

    where d.id = a.id

    and d.code = 2)

    UNION

    --Rows with duplicates 1,1

    select b.id, b.code,CASE

    WHEN b.est_cost = 0 THEN c.est_cost

    ELSE b.est_cost

    END est_cost,

    CASE

    WHEN b.tot_cost ='0000000' THEN c.tot_cost

    ELSE b.tot_cost

    END tot_cost,

    b.date

    from #tmpREC_DUP a

    join REC_DUP b

    on a.id = b.id

    join REC_DUP c

    on a.id = c.id

    where a.rows = 2

    and b.code = 1

    and b.date = (select max(d.date)

    from REC_DUP d

    where d.id = a.id)

    and c.code = 1

    and c.date <> (select max(d.date)

    from REC_DUP d

    where d.id = a.id)

    and exists(select 1

    from REC_DUP d

    where d.id = a.id

    and d.code = 1)

    ) as tbldup

  • Hello,

    Thanks for your feedback. You're the best.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply