Deleting duplicate record

  • Eugene Elutin (9/6/2012)


    MVDBA (9/6/2012)


    example 2 wont work -you'll end up with a table called demo1 not demo

    Are you serious? :w00t:

    It is just an EXAMPLE 🙂

    So, I've named the table in a second example as #demo1 so you can run both examples at once!

    calm eugene

    i was also trying to point out that the original table did not have an id column - i couldn't figure out if you meant to join on the original table (and forgot there was no id) or whether you had overlooked this in example 2

    "and d1.id > d2.id"

    MVDBA

  • MVDBA (9/6/2012)


    Eugene Elutin (9/6/2012)


    MVDBA (9/6/2012)


    eugene... just out of interest... why avoid the windowing???

    Because using them may be slower then implementation without them...

    hmmm.... yes, but also alternate solutions might use other resources (such as disk space to create new table 🙂 )

    many ways to skin a cat though.. really depends on what "other data" was in the table that determined which one to keep - if the table was 50 columns wide with varchar(max) columns then i'd use the windowing function every day of the week

    What about if this table would have over 1,000,000,000 records and you would need to delete 95% of them?

    Again, it does all depend...

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • exactly!!!

    what if you only needed to delete 1%... i was just wondering if there was any specific Windowing issue that you know of (like Jeff's stuff about recursive CTE)

    MVDBA

  • Approach can also depend on whether this is a one-off repair or something that needs to happen on a regular basis. It's also wise to avoid the following scenario:

    1. SELECT DISTINCT * FROM MyTable INTO #tmpMyTable

    2. TRUNCATE TABLE MyTable

    3. (Power outage)

    4. Er.....

  • Richard Warr (9/6/2012)


    Approach can also depend on whether this is a one-off repair or something that needs to happen on a regular basis. It's also wise to avoid the following scenario:

    1. SELECT DISTINCT * FROM MyTable INTO #tmpMyTable

    2. TRUNCATE TABLE MyTable

    3. (Power outage)

    4. Er.....

    That was just an example. In production you would do:

    1. SELECT DISTINCT * FROM MyTable INTO staging.tmpMyTable

    2. TRUNCATE TABLE MyTable

    3. (Power outage)

    4. Er.....

    5. Server Restart

    6. INSERT MyTable SELECT * FROM staging.tmpMyTable

    7. All fine here :hehe:

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Eugene Elutin (9/6/2012)


    That was just an example. In production you would do:

    1. SELECT DISTINCT * FROM MyTable INTO staging.tmpMyTable

    2. TRUNCATE TABLE MyTable

    3. (Power outage)

    4. Er.....

    5. Server Restart

    6. INSERT MyTable SELECT * FROM staging.tmpMyTable

    7. All fine here :hehe:

    Yes, I do that now as well 😉

  • Richard Warr (9/6/2012)


    Approach can also depend on whether this is a one-off repair or something that needs to happen on a regular basis. It's also wise to avoid the following scenario:

    1. SELECT DISTINCT * FROM MyTable INTO #tmpMyTable

    2. TRUNCATE TABLE MyTable

    3. (Power outage)

    4. Er.....

    You could surround this in a transaction and upon recovery from the power outage, you would be back at step 1.

Viewing 7 posts - 16 through 21 (of 21 total)

You must be logged in to reply to this topic. Login to reply