deleting duplicates, even though different key values

  • Is there a way of deleting duplicates, even though different key values? e.g.:
    123 Mary Kay
    124 Mary Kay
    125 Bob Incus
    126 Bob Incus
    get rid of only one of the duplicates like leave 123, delete 124

  • Yes.


    ;WITH cte_dups AS (
        SELECT *, ROW_NUMBER() OVER(PARTITION BY name ORDER BY id DESC) AS row_num /* keep the HIGHEST/LAST id */
        FROM dbo.table_name
    )
    DELETE FROM cte_dups
    WHERE row_num > 1

    Naturally ORDER BY id ASC to keep the first/lowest id.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher - Tuesday, March 26, 2019 11:39 AM

    Yes.


    ;WITH cte_dups AS (
        SELECT *, ROW_NUMBER() OVER(PARTITION BY name ORDER BY id DESC) AS row_num /* keep the HIGHEST/LAST id */
        FROM dbo.table_name
    )
    DELETE FROM cte_dups
    WHERE row_num > 1

    Naturally ORDER BY id ASC to keep the first/lowest id.

    Thanks... within this code how would one concatenate fields that would determine a dupe?: like FirstName + LastName + City

  • briancampbellmcad - Tuesday, March 26, 2019 11:47 AM

    ScottPletcher - Tuesday, March 26, 2019 11:39 AM

    Yes.


    ;WITH cte_dups AS (
        SELECT *, ROW_NUMBER() OVER(PARTITION BY name ORDER BY id DESC) AS row_num /* keep the HIGHEST/LAST id */
        FROM dbo.table_name
    )
    DELETE FROM cte_dups
    WHERE row_num > 1

    Naturally ORDER BY id ASC to keep the first/lowest id.

    Thanks... within this code how would one concatenate fields that would determine a dupe?: like FirstName + LastName + City

    Adjust the PARTITION BY clause:
    ...
    PARTITION BY FirstName, LastName, City
    ...

    Note that actual concatenation is risky, esp. with names.  For example:
    fn = ric, ln = king
    with concat'd names becomes "equal" to:
    fn = rick, ln = ing

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher - Tuesday, March 26, 2019 12:02 PM

    briancampbellmcad - Tuesday, March 26, 2019 11:47 AM

    ScottPletcher - Tuesday, March 26, 2019 11:39 AM

    Yes.


    ;WITH cte_dups AS (
        SELECT *, ROW_NUMBER() OVER(PARTITION BY name ORDER BY id DESC) AS row_num /* keep the HIGHEST/LAST id */
        FROM dbo.table_name
    )
    DELETE FROM cte_dups
    WHERE row_num > 1

    Naturally ORDER BY id ASC to keep the first/lowest id.

    Thanks... within this code how would one concatenate fields that would determine a dupe?: like FirstName + LastName + City

    Adjust the PARTITION BY clause:
    ...
    PARTITION BY FirstName, LastName, City
    ...

    Thanks!! Works perfectly

  • briancampbellmcad - Tuesday, March 26, 2019 11:26 AM

    Is there a way of deleting duplicates, even though different key values? e.g.:
    123 Mary Kay
    124 Mary Kay
    125 Bob Incus
    126 Bob Incus
    get rid of only one of the duplicates like leave 123, delete 124

    You can use row_number to identify the duplicate values...

    SELECT ID, FirstName, LastName, row_number() OVER(PARTITION BY FirstName, LastName ORDER BY ID desc) AS rn
    FROM ...
    WHERE ...

    Once you have the duplicates identified...you can put the code in a CTE and perform the delete:

    WITH dups
    AS (
    SELECT ID, FirstName, LastName, row_number() OVER(PARTITION BY FirstName, LastName ORDER BY ID desc) AS rn
    FROM ...
    WHERE ...
    )
    DELETE FROM dups WHERE rn > 1;

    To be safe - wrap the above in an explicit transaction with a rollback and only commit the transaction when you are sure it is working.

    BEGIN TRANSACTION;

    WITH dups
    AS (
    SELECT ID, FirstName, LastName, row_number() OVER(PARTITION BY FirstName, LastName ORDER BY ID desc) AS rn
    FROM ...
    WHERE ...
    )
    DELETE FROM dups WHERE rn > 1;

    ROLLBACK TRANSACTION;
    --COMMIT TRANSACTION;  --uncomment this and comment out the rollback when you are sure it is working

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply