September 10, 2016 at 6:56 pm
I have inherited a SQL 2000 database in which the previous DBA archived data after removing the compound primary key used in the production table; unfortunately, there are now thousands of duplicate records in the archive table. I'm trying to clean up the archived data, so I've created a new table and added the compound primary key. I tried copying the DISTINCT records into the new table, but I get a "violation of primary key constraint - cannot insert duplicate key in object" error. I also tried doing a GROUP BY on all fields and still receive the PK error.
I thought that by using DISTINCT I would get only unique records to insert into the new table, but I'm unable to get past the PK constraint error. Any help you can provide would be greatly appreciated. I'm working in SQL Server 2000 Query Analyzer.
Thanks!!
September 14, 2016 at 4:16 pm
Shaira (9/10/2016)
I have inherited a SQL 2000 database in which the previous DBA archived data after removing the compound primary key used in the production table; unfortunately, there are now thousands of duplicate records in the archive table. I'm trying to clean up the archived data, so I've created a new table and added the compound primary key. I tried copying the DISTINCT records into the new table, but I get a "violation of primary key constraint - cannot insert duplicate key in object" error. I also tried doing a GROUP BY on all fields and still receive the PK error. I thought that by using DISTINCT I would get only unique records to insert into the new table, but I'm unable to get past the PK constraint error. Any help you can provide would be greatly appreciated. I'm working in SQL Server 2000 Query Analyzer.
Thanks!!
So the rows really aren't duplicates when you look at all columns, but they are duplicates based on the compound key. That's why DISTINCT won't work. At this point I would question which of the duplicates (based on the compound key) are the ones you need to keep and which should be deleted.
Anyway, when using GROUP BY to find duplicates, you would SELECT COUNT(*), KeyColumn1, KeyColumn2, etc. for just those columns that are part of the key and then GROUP BY those columns. Add HAVING COUNT(*) > 1 to find your duplicates.
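A minimal sketch of that query, runnable in SQL 2000 (the table and key column names here are placeholders for your actual ones):

-- List each compound-key value that appears more than once,
-- along with how many copies exist.
-- dbo.ArchiveTable, KeyColumn1 and KeyColumn2 are placeholder names.
SELECT KeyColumn1, KeyColumn2, COUNT(*) AS DupCount
FROM dbo.ArchiveTable
GROUP BY KeyColumn1, KeyColumn2
HAVING COUNT(*) > 1
ORDER BY DupCount DESC

That gives you the key values to investigate, so you can then look at the full rows for each key and decide which copy to keep.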
You can find a bunch of examples in this one post - a couple of them are useless to you because they use functions added in 2005, but most would work:
http://stackoverflow.com/questions/2112618/finding-duplicate-rows-in-sql-server
Sue
September 20, 2016 at 8:15 am
Thanks for the response, Sue! You are correct - the entire record was not a duplicate, just a dup on the primary key.
I actually found an explanation of IGNORE_DUP_KEY (which I was amazed to see I could actually use), which led me to figure out how to find the duplicate records - much like you explained. With that I was able to delete the records (he had added his name as CreatedBy instead of the system name).
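For anyone finding this thread later, the cleanup was roughly along these lines; the table, key columns, and CreatedBy value below are illustrative, not the actual names:

-- Remove the manually re-inserted copies, but only where a
-- system-created row with the same compound key still exists.
-- dbo.ArchiveTable, KeyColumn1/KeyColumn2 and 'PreviousDBAName' are placeholders.
DELETE a
FROM dbo.ArchiveTable AS a
WHERE a.CreatedBy = 'PreviousDBAName'
  AND EXISTS (SELECT 1
              FROM dbo.ArchiveTable AS s
              WHERE s.KeyColumn1 = a.KeyColumn1
                AND s.KeyColumn2 = a.KeyColumn2
                AND s.CreatedBy <> 'PreviousDBAName')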
Thanks for the help ~ much appreciated!