How to skip duplicate record and keep on running until the end?

  • I used code below to insert data into a table from another table.

    INSERT INTO tUser (Name,Email)

    SELECT Name,Email

    FROM tUSER_NEW T

    WHERE T.EMAIL NOT IN (SELECT EMAIL FROM tUser)

    But, this code will stop running once there is a duplicate record.

    How to skip duplicate record and keep on running until the end?

  • Your code does not 'skip' anything, nor is there any concept of anything 'running until the end'. Can you state the problem more clearly? Feel free to add sample data if it helps make things more lucid.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Can you define what a 'duplicate' actually means?  Is that the same Name,Email combination - or just the same name - or just the same email?

    If the same combination identifies a 'duplicate' - then use DISTINCT on the query.  If a duplicate is defined by either the same 'name' or the same 'email' only then use row_number() in a common-table expression and exclude from inserting any rows with a row number > 1.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • I set up primary key on email column.

    I want to void inserting any duplicate record in email column.

  • This might work.

    INSERT tUser (Name,Email)
    SELECT DISTINCT Name,Email
    FROM tUSER_NEW tnew
    WHERE NOT EXISTS (SELECT * FROM tUser t where t.Email = tnew.Email)

    But if the same e-mail exists in tUSER_New for multiple Name values, it will still error. If this is the case, you would need to refine the source query still further to eliminate all duplicate e-mails – this would require you to decide on which version of NAME is retained.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Put the duplicate email from Source table into a temporary table:

    SELECT DISTINCT COUNT(0), email INTO #email FROM tUSER_NEW GROUP BY email HAVING COUNT(0) > 1

    Keep Duplicate Records in another temporary table for your reference and analysis

    SELECT name, email INTO #duplicate FROM tUSER_NEW WHERE email IN (SELECT email FROM #email)

    Now Insert the data where Emails are not matching:

    INSERT INTO tUser (Name,Email)

    SELECT Name, Email

    FROM tUSER_NEW t

    WHERE t.email NOT IN (SELECT email from #email) AND t.email NOT IN (SELECT EMAIL FROM tUser)

    Hope this will work.

  • adonetok wrote:

    I set up primary key on email column.

    I want to void inserting any duplicate record in email column.

    That creates an index.  If you modify that index to include the IGNORE_DUP_KEY  option, that should solve your problem.  Warning:  I've not tested it for performance but I can't seen that it should be.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I tried Phil Parkin's way, it is working now.

    Thank all of you.

  • adonetok wrote:

    I tried Phil Parkin's way, it is working now.

    Thank all of you.

    Cool.  Thanks for the feedback.  To be sure, though, do you understand how and why Phil's good code works so you can do it the next time something like this comes up?  And, to be equally sure, not being snarky here.  We aim to teach when we answer a post to help folks learn things.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply