February 7, 2004 at 6:37 pm
Hello!
I've inherited a SQL Server 2000 database that has a table with 2 columns, both are part of a clustered index that has been incorrectly set as non-unique. Of the 100 of thousands of rows in the table, about 8 have duplicate values.
I need to remove the duplicate rows before I can make the index unique. I understand the best way to remove the duplicate rows is to export the table to a temp table, remove the dup rows, then drop the orig table and rename the temp table as the orig?
Sound accurate?
Can anyone please give me help with the syntax..or lead me to a link? I can't quite seem to find what I'm looking for.
Thank you
February 8, 2004 at 1:30 pm
Why not simply delete the duplicates and then alter your index? I don't think that you'll need to export, drop.....
Have a look at the script section here. There are several scripts on eliminating dups.
--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]
February 8, 2004 at 6:20 pm
Re: "a clustered index that has been incorrectly set as non-unique."
I hope that you do not mean that just because its clustered index, it must also be unique. You certainly can have a clustered non-unique index, and a unique non-clustered index on the table, and still be in good design.
Mark
February 9, 2004 at 6:12 am
Why not do something like a
select distinct * --often nicer to specify col names
from MyTable
into #MyTempTable
then truncate/delete all data from your original table and insert the now distinct data back in. You may wish to drop or monitor indices, etc, etc (comments anyone?)
Cheers
February 9, 2004 at 6:38 am
Why not simply delete the duplicates and then alter your index? |
And how do propose to do that Frank if the table has only two columns and the duplicates have exactly the same values?
Personally I would
Create a Copy table
Insert all the data into that table
Truncate the original table (quicker but no logging)
Alter the Index
Insert non duplicate data (as per ianyates example)
Far away is close at hand in the images of elsewhere.
Anon.
February 9, 2004 at 7:59 am
How about -- do a self-join on the table, with count(*), GROUP BY and HAVING count(*) > 1. This will give you the rows that are non-unique.
Delete those rows (WHERE xyx), and then reinsert them. There are only two columns, 8 rows, should be an easy process.
Then drop and recreate the index.
A temp table is not needed. If you want a backup, bcp out.
Regards, Melissa
February 9, 2004 at 1:06 pm
Here is a nice script I found a long time ago and used it many times:
delete from users_email where exists (
select u.colid, u.email from users_email u where
u.email = users_email.email
group by u.colid, u.email
having users_email.colid < max(u.colid))
order by email
Replace sample talbe/column names with yours, replace max() with min(), etc. based on your requirements, do the select before deleting to make sure you are about to delete what you need to delete and run it.
Julia
February 10, 2004 at 8:15 am
USE YourDB
GO
SELECT DISTINCT * INTO TempDB..YourUniqueTable FROM YourTable
GO
DROP INDEX YourTable.BadIndex
GO
TRUNCATE TABLE YourTable
GO
INSERT INTO YourTable SELECT * FROM TempDB.YourUniqueTable
GO
CREATE UNIQUE CLUSTERED INDEX IndexName ON YourTable(Column1Name, Column2Name)
GO
DROP TABLE Tempdb..YourUniqueTable
This should take only a few seconds to do.
February 15, 2004 at 1:54 pm
And how do propose to do that Frank if the table has only two columns and the duplicates have exactly the same values?
Ok, Dave, your solution is easier, quicker to implement....
--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]
February 16, 2004 at 2:08 am
Ok, Dave, your solution is easier, quicker to implement |
No not really. After re reading the original post 8 duplicates out of 100,000 then you answer is half right but better than mine.
Copying the duplicates to a different table
Deleting the duplicates
Altering the index
Reinsert (removing duplicates)
would be the quickest
Far away is close at hand in the images of elsewhere.
Anon.
February 16, 2004 at 3:21 am
This is what i would use in such a case.
select col1,col2 into #abc from tbl
group by col1, col2
having count(*) > 1
delete tbl
from #abc
where tbl.col1 = #abc.col1 and tbl.col2 = #abc.col2
insert tbl
select * from #abc
-- Amit
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply