Help with Export to Temp Table please

  • Hello!

    I've inherited a SQL Server 2000 database that has a table with 2 columns, both are part of a clustered index that has been incorrectly set as non-unique.  Of the 100 of thousands of rows in the table, about 8 have duplicate values.

    I need to remove the duplicate rows before I can make the index unique.  I understand the best way to remove the duplicate rows is to export the table to a temp table, remove the dup rows, then drop the orig table and rename the temp table as the orig?

    Sound accurate?

    Can anyone please give me help with the syntax..or lead me to a link?  I can't quite seem to find what I'm looking for.

    Thank you

     

     

  • Why not simply delete the duplicates and then alter your index? I don't think that you'll need to export, drop.....

    Have a look at the script section here. There are several scripts on eliminating dups.

     

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • Re: "a clustered index that has been incorrectly set as non-unique."

    I hope that you do not mean that just because its clustered index, it must also be unique.  You certainly can have a clustered non-unique index, and a unique non-clustered index on the table, and still be in good design.



    Mark

  • Why not do something like a


    select distinct *     --often nicer to specify col names

    from MyTable

    into #MyTempTable


    then truncate/delete all data from your original table and insert the now distinct data back in.  You may wish to drop or monitor indices, etc, etc (comments anyone?)

    Cheers 

  • quoteWhy not simply delete the duplicates and then alter your index?

     

    And how do propose to do that Frank if the table has only two columns and the duplicates have exactly the same values?

    Personally I would

    Create a Copy table

    Insert all the data into that table

    Truncate the original table (quicker but no logging)

    Alter the Index

    Insert non duplicate data (as per ianyates example)

    Far away is close at hand in the images of elsewhere.
    Anon.

  • How about -- do a self-join on the table, with count(*), GROUP BY and HAVING count(*) > 1.  This will give you the rows that are non-unique.

    Delete those rows (WHERE xyx), and then reinsert them.  There are only two columns, 8 rows, should be an easy process. 

    Then drop and recreate the index.

    A temp table is not needed.  If you want a backup, bcp out.

    Regards, Melissa

  • Here is a nice script I found a long time ago and used it many times:

    delete from users_email where exists (

      select u.colid, u.email from users_email u where

      u.email = users_email.email

    group by u.colid, u.email

    having users_email.colid < max(u.colid))

    order by email

    Replace sample talbe/column names with yours, replace max() with min(), etc. based on your requirements, do the select before deleting to make sure you are about to delete what you need to delete and run it.

    Julia

  • USE YourDB
    GO
    SELECT DISTINCT * INTO TempDB..YourUniqueTable FROM YourTable
    GO
    DROP INDEX YourTable.BadIndex
    GO
    TRUNCATE TABLE YourTable
    GO
    INSERT INTO YourTable SELECT * FROM TempDB.YourUniqueTable
    GO
    CREATE UNIQUE CLUSTERED INDEX IndexName ON YourTable(Column1Name, Column2Name)
    GO
    DROP TABLE Tempdb..YourUniqueTable
     
    This should take only a few seconds to do.
  • And how do propose to do that Frank if the table has only two columns and the duplicates have exactly the same values?

    Ok, Dave, your solution is easier, quicker to implement....

     

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • quoteOk, Dave, your solution is easier, quicker to implement

    No not really. After re reading the original post 8 duplicates out of 100,000 then you answer is half right but better than mine.

    Copying the duplicates to a different table

    Deleting the duplicates

    Altering the index

    Reinsert (removing duplicates)

    would be the quickest

     

    Far away is close at hand in the images of elsewhere.
    Anon.

  • This is what i would use in such a case.

    select col1,col2  into #abc from tbl

    group by col1, col2

    having count(*) > 1

    delete tbl

    from #abc

    where tbl.col1 = #abc.col1 and tbl.col2 = #abc.col2

    insert tbl

    select * from #abc

     

    -- Amit


    -- Amit



    "There is no 'patch' for stupidity."



    Download the Updated SQL Server 2005 Books Online.

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply