Help with a complex sql

  • I have table with structure as

    srnoidentity(1,1) int

    Fname varchar(30),

    lname varchar(30),

    dob datetime,

    batch varchar(10),

    Id1int

    Id2 int,

    Id3 int

    Now If the data exists as

    1A B1/1/2000ABC99999999900

    2A B1/1/2000ABC00999999999

    (Above set will fall in duplicate as one of the ids is notnullable and that matches with the other id ex: Id1 in 1st rec matches Id3 in 2nd rec so a duplicate)

    3A B1/1/2005ABC88888888800

    4A B1/1/2005ABC000

    (Above set wont be under duplicate as although fname, lname, dob and batch is same but id of 3rd rec which is not nullable doesnt match up with any ids in 4th rec so nota duplicate)

    5J K1/1/2004AAA00777777777

    6J K1/1/2004AAA7777777777777777770

    (Above set will fall in duplicate as one of the ids is notnullable and that matches with the other id ex: Id3 in 5th rec matches Id1 and Id2 in 6th rec so a duplicate)

    7X Y1/1/2003BBB04567891230

    8X Y1/1/2003BBB00456789123

    9X Y1/1/2003BBB000

    (Above set will fall in duplicate as one of the ids is notnullable and that matches with the other id ex: Id2 in 7th rec matches Id3 in 8th rec so a duplicate, but 9th rec wont be under a duplicate as all ids are blank and doesnt match up with 7th or 8th rec)

    Now what i tried doing is get the set of recs where fname, lname, dob and batch matches doing a group by on these column

    select fname, lname, dob, batch from table1

    group by fname, lname, dob, batch having count(*) >1

    this gives me duplicates based on abve cols how from thsi list can i get only the duplicate recs considering the Id1, Id2 and Id3 columns.

    Any help on this will be greatly apprecaited. TIA

  • Join the table to itself in a query where the record numbers don't match, but either the ID1, ID2 or ID3 do match.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • But unfortunately that gives my multiple records or am i doing something wrong

    select a.fname, a.lname, a.dob, a.batch,a.id1, a.id2, a.id3

    from table1 a, table1 b where ( a.fname = b.fname and a.lname = b.lname and a.dob=b.dob and a.batch = b.batch) and

    ((a.id1 = b.id1 or a.id1 = b.id2 or a.id1 = b.id3)

    OR (a.id2 = b.id1 or a.id2 = b.id2 or a.id2 = b.id3)

    OR (a.id3 = b.id1 or a.id3 = b.id2 or a.id3 = b.id3))

  • -- *** Test Data ***

    DECLARE @t TABLE

    (

    &nbsp&nbsp&nbsp&nbspsrno int NOT NULL

    &nbsp&nbsp&nbsp&nbsp,FName varchar(20) NOT NULL

    &nbsp&nbsp&nbsp&nbsp,LName varchar(20) NOT NULL

    &nbsp&nbsp&nbsp&nbsp,dob datetime NOT NULL

    &nbsp&nbsp&nbsp&nbsp,batch varchar(10) NOT NULL

    &nbsp&nbsp&nbsp&nbsp,Id1 int NOT NULL

    &nbsp&nbsp&nbsp&nbsp,Id2 int NOT NULL

    &nbsp&nbsp&nbsp&nbsp,Id3 int NOT NULL

    )

    INSERT INTO @t

    SELECT 1, 'A', 'B', '20000101', 'ABC', 999999999, 0, 0 UNION ALL

    SELECT 2, 'A', 'B', '20000101', 'ABC', 0, 0, 999999999 UNION ALL

    SELECT 3, 'A', 'B', '20050101', 'ABC', 888888888, 0, 0 UNION ALL

    SELECT 4, 'A', 'B', '20050101', 'ABC', 0, 0, 0 UNION ALL

    SELECT 5, 'A', 'B', '20040101', 'ABC', 0, 0, 777777777 UNION ALL

    SELECT 6, 'A', 'B', '20040101', 'ABC', 777777777, 777777777, 0 UNION ALL

    SELECT 7, 'A', 'B', '20040101', 'ABC', 0, 456789123, 0 UNION ALL

    SELECT 8, 'A', 'B', '20040101', 'ABC', 0, 0, 456789123 UNION ALL

    SELECT 9, 'A', 'B', '20040101', 'ABC', 0, 0, 0

    -- *** End Test Data ***

    SELECT T.*

    FROM @t T

    &nbsp&nbsp&nbsp&nbspLEFT JOIN

    &nbsp&nbsp&nbsp&nbsp(

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspSELECT MIN(T1.srno) AS srno

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp,T1.FName, T1.LName, T1.dob, T1.batch, I.Id

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspFROM @t T1

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspJOIN

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp(

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspSELECT DISTINCT *

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspFROM

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp(

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspSELECT srno

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp,CASE N.N

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspWHEN 1 THEN Id1

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspWHEN 2 THEN Id2

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspELSE Id3

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspEND AS Id

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspFROM @t

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspCROSS JOIN

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp(

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspSELECT 1 UNION ALL

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspSELECT 2 UNION ALL

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspSELECT 3

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp) N(N)

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp) D

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspWHERE Id > 0&nbsp&nbsp&nbsp&nbsp

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp) I

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspON T1.srno = I.srno

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspGROUP BY T1.FName, T1.LName, T1.dob, T1.batch, I.Id

    &nbsp&nbsp&nbsp&nbsp) D

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspON T.srno = D.srno

    WHERE D.srno IS NOT NULL

    &nbsp&nbsp&nbsp&nbspOR (T.Id1 = 0 AND T.Id2 = 0 AND T.Id3 = 0)

  • Thank so much Ken for your sql. Its gets me 1 existance of every record where either id1,id or id3 exists. howevere for a records as this

    3 A B2005-01-01 00:00:00.000ABC888888888 0 0

    4 A B2005-01-01 00:00:00.000ABC0 0 0

    if you see these 2 recs these dont fall in duplicate category as 888888888 doesnt exists in 4th rec either as id1 or id2 or id3. how can this records be filtered out as these wont fall in duplicate record category. any thoughts on this.

    TIA

  • Did you run the OR (T.Id1 = 0 AND T.Id2 = 0 AND T.Id3 = 0) part of the query?

  • Thanks Ken. you are right if i exclude thsi part of the sql

    OR (T.Id1 = 0 AND T.Id2 = 0 AND T.Id3 = 0)

    i get the data as

    1AB2000-01-01 00:00:00.000ABC99999999900

    3AB2005-01-01 00:00:00.000ABC88888888800

    5AB2004-01-01 00:00:00.000ABC00777777777

    7AB2004-01-01 00:00:00.000ABC04567891230

    However if you see original insert for a record with srno 3 the id1 has value as '888888888' but a duplicate of that record which is srno doesnt ahve a value for id1, id2 or id3. As a result this wont fall in duplicate record category what happens if a 0 is substitued by '' or with a null value. can teh sql be tweaked to take care of that situation also

  • OR (ISNULL(NULLIF(T.Id1, ''), 0) AND ... )

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply