finding duplicates... help needed

  • Hi,

    I am new to TSQL and I am having a hard time figuring this out: I need to find all duplicates in a table by 2 fields, but with a third field having a different value (for example, same last name, same first name, different SSN). The table doesn't have unique keys, and I can't change that... Thanks!

  • you could do this using group by

    SELECT FirstName, LastName, COUNT(SSN) AS SSNCount

    FROM YourTable

    GROUP BY FirstName, LastName

    HAVING (COUNT(SSN) > 1)

  • First of all, a big Thank You!

    It works, but not as expected - it looks like it returns all identical names with identical SSN's - I want to find all identical names (first and last) but with different SSN's...

  • It returns all identical names with identical ssn's, as well as identical names with different ssn's, so I'm thinking I could put the results in a temp table and delete duplicates.... (keep identical names with different ssn's)

  • That doesn't work - there are no duplicates in that table because of the "group".. well, I'm lost again...

  • Guessing here, but is this what you are trying to accomplish?

    Current:

    LastName FirstName SSN

    Duck Donald 555-54-0001

    Duck Donald 555-54-0001

    Duck Donald 555-54-0002

    Desired:

    LastName FirstName SSN

    Duck Donald 555-54-0001

    Duck Donald 555-54-0002

    😎

  • yes, that is exactly what I am looking for... someone trying to use 2 (or more) different SSN's...

    Current:

    LastName FirstName SSN

    Duck Donald 555-54-0001

    Duck Donald 555-54-0001

    Duck Donald 555-54-0002

    Desired:

    LastName FirstName SSN

    Duck Donald 555-54-0001

    Duck Donald 555-54-0002

  • Here, see if that helps!

    USE TEMPDB

    GO

    CREATE TABLE Person (

    FirstName NVARCHAR(30),

    LastName NVARCHAR(30),

    SSN NVARCHAR(15))

    INSERT INTO Person

    SELECT 'Donald', 'Duck', 'SSN-001'

    UNION ALL

    SELECT 'Donald', 'Duck', 'SSN-002'

    UNION ALL

    SELECT 'Mickey', 'Mouse', 'SSN-001'

    UNION ALL

    SELECT 'Hobba', 'Hobbo', 'SSN-001'

    SELECT P.FirstName,

    p.LastName,

    p.SSN

    FROM Person p

    INNER JOIN (SELECT FirstName,

    LastName,

    COUNT(SSN) AS CountSSN

    FROM Person

    GROUP BY FirstName,LastName

    HAVING COUNT(SSN) > 1) AS Dup

    ON P.FirstName = Dup.FirstName

    AND p.LastName = Dup.LastName

    Here is the output:

    FirstNameLastNameSSN

    Donald Duck SSN-001

    Donald Duck SSN-002

    Cheers,

    J-F

  • Thanks, but I am still getting the duplicates on all 3 fields. In your table, you put:

    NSERT INTO Person

    SELECT 'Donald', 'Duck', 'SSN-001'

    UNION ALL

    SELECT 'Donald', 'Duck', 'SSN-002'

    UNION ALL

    SELECT 'Mickey', 'Mouse', 'SSN-001'

    UNION ALL

    SELECT 'Hobba', 'Hobbo', 'SSN-001'

    and is should have been:

    NSERT INTO Person

    SELECT 'Donald', 'Duck', 'SSN-001'

    UNION ALL

    SELECT 'Donald', 'Duck', 'SSN-002'

    UNION ALL

    SELECT 'Mickey', 'Mouse', 'SSN-001'

    UNION ALL

    SELECT 'Mickey', 'Mouse', 'SSN-001'

    The goal is to eliminate 'Mickey', 'Mouse', 'SSN-001' and retain

    'Donald', 'Duck', 'SSN-001'

    'Donald', 'Duck', 'SSN-002'

    Thanks!

  • Hi, you can add a Distinct on the count(SSN). Here is the code:

    SELECT P.FirstName,

    p.LastName,

    p.SSN

    FROM Person p

    INNER JOIN (SELECT FirstName,

    LastName,

    COUNT(DISTINCT SSN) AS CountSSN

    FROM Person

    GROUP BY FirstName,LastName

    HAVING COUNT(DISTINCT SSN) > 1) AS Dup

    ON P.FirstName = Dup.FirstName

    AND p.LastName = Dup.LastName

    Hope that helps!

    Cheers,

    J-F

  • Same thing... I'm getting everything, duplicates and non-duplicates.

  • One, show us the code you have developed and tried.

    Two, create a more represenative set of data that is similiar to what you are working with and what the result should be when completed.

    I am assuming that you are trying to delete duplicates, but I am having a hard time fiquring out what you are trying to delete (and not delete).

    😎

  • Same result, but not quite: I actually have the duplicates now, and I can just delete those. I think it should work, I will try it out Monday.

    Many Thanks!!

  • Well, here's the full code:

    SELECT [daily].[dbo].[CIS_Acct_Name].[Short_Last_Name],

    [daily].[dbo].[CIS_Acct_Name].[Short_First_Name],

    [daily].[dbo].[CIS_Acct_Name].[Middle_Initial],

    [daily].[dbo].[CIS_Acct_Name].[Tax_ID_Number]

    FROM [daily].[dbo].[CIS_Acct_Name]

    INNER JOIN (SELECT Short_Last_Name,

    Short_First_Name,

    Middle_Initial,

    COUNT(DISTINCT Tax_ID_Number) AS CountSSN

    FROM [daily].[dbo].[CIS_Acct_Name]

    GROUP BY Short_Last_Name, Short_First_Name, Middle_Initial

    HAVING COUNT(DISTINCT Tax_ID_Number) > 1)

    AS Dup

    ON [daily].[dbo].[CIS_Acct_Name].[Short_Last_Name] = Dup.Short_Last_Name

    AND [daily].[dbo].[CIS_Acct_Name].[Short_First_Name] = Dup.Short_First_Name

    AND [daily].[dbo].[CIS_Acct_Name].[Middle_Initial] = Dup.Middle_Initial

    WHERE [TAX_ID_Number] <> '0'

    ORDER BY Short_Last_Name, Short_First_Name

    And I am getting:

    Short_Last_Name Short_First_Name Middle_Initial Tax_ID_Number

    Duck J Donald 999999999

    Duck J Donald 999999991

    Duck J Donald 999999999

    And, Ideally, I would only want to see:

    Duck J Donald 999999999

    Duck J Donald 999999991

    (Duck J Donald 999999999 appears twice)

    But, like I was saying, I should be able to put the result in a temp table and eliminate the duplicates... Thanks again!

  • Still confused due to a previous post in this thread. Some one had added additional names that had the same SSN as D Duck, and you indicated that those name should have been dropped. So I guess what would really help is sample data that is representative of your actual data and for more than 1 individual, so we can have a better understanding of what needs to be accomplished.

    😎

Viewing 15 posts - 1 through 15 (of 27 total)

You must be logged in to reply to this topic. Login to reply