Duplicate key issue when Insert/Select

Question

Duplicate key issue when Insert/Select

tshad

SSCertifiable

Points: 5905
More actions
August 4, 2014 at 1:11 am

#297225

I have an issue where I am getting an error on an unique index.
I know why I am getting the error but not sure how to get around it.
The query does a check on whether a unique value exists in the Insert/Select. If I run it one record at a time (SELECT TOP 1...) it works fine and just won't update it if the record exists.
But if I do it in a batch, I get the error. I assume this is because it does the checking on the file before records are written out and then writes out the records one at a time from a temporary table.
It thinks all the records are unique because it compares the records one at a time to the original table (where there would be no duplicates). But it doesn't check the records against each other. Then when it actually writes out the record, the duplicate is there.
How do I do a batch where the Insert/Select would write out the records without the duplicates as it does when I do it one record at a time.
CREATE TABLE #TestTable
(
Name varchar(50),
Email varchar (40)
)
Insert #TestTable (Name,Email) Values('Tom', 'tom@aol.com')
Insert #TestTable (Name,Email) Values('Larry', 'larry@aol.com')
Insert #TestTable (Name,Email) Values('Mary', 'mary@aol.com')
Insert #TestTable (Name,Email) Values('Mary2', 'mary@aol.com')
CREATE TABLE #TestTable2
(
Name varchar(50),
Email varchar (40)
)
CREATE UNIQUE INDEX IX_TestTable2
ON #TestTable2(Email);
GO
Truncate table #TestTable2
-- This one gets an Error:
-- Cannot insert duplicate key row in object 'dbo.#TestTable2' with unique index 'IX_TestTable2'.
Insert #TestTable2 (Name, Email)
Select Name, Email
FROM #TestTable
WHERE Email not in (Select Email FROM #TestTable2)
-- Running these gets no error and only 3 get written to the file (as expected).
Truncate table #TestTable2
Insert #TestTable2 (Name, Email)
Select Top 1 Name, Email
FROM #TestTable
WHERE Email not in (Select Email FROM #TestTable2)
Insert #TestTable2 (Name, Email)
Select Top 1 Name, Email
FROM #TestTable
WHERE Email not in (Select Email FROM #TestTable2)
Insert #TestTable2 (Name, Email)
Select Top 1 Name, Email
FROM #TestTable
WHERE Email not in (Select Email FROM #TestTable2)
Insert #TestTable2 (Name, Email)
Select Top 1 Name, Email
FROM #TestTable
WHERE Email not in (Select Email FROM #TestTable2)
SELECT * from #TestTable
SELECT * FROM #TestTAble2
Thanks,
Tom

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply

Koen Verbeeck SSC Guru Points: 259105 More actions · Answer 1

You have to filter out "duplicate values" in your data set before you try to insert them into the destination table.

You can do this by assigning a row number using the ROW_NUMBER() function. You can assign row numbers inside a partition, which is in this case the Email column. Then you keep only rows where the row number is 1.

SELECT Name, Email

FROM

(

SELECT Name, Email, rid = ROW_NUMBER() OVER (PARTITION BY Email ORDER BY Name)

FROM #TestTable

) tmp

WHERE rid = 1;

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP