October 4, 2006 at 8:15 am
Hello all,
Quick question regarding creating a statement.
We import a great number of records into a main table, and due to business logic we have to import duplicates. At the end of the process I run a query to find any records that are duplicates:
/* The following query finds all the records brought through for a particular week where there
are duplicates. This could be because we have received the same file numerous times from them. */
SELECT Surname, Forename, NameOrderIndicator, MiddleNames, DateOfBirth,
       Gender, FormerSurname, PreferredForname,
       Ethnicity, DateSent, COUNT(*)
FROM dbo.Import_Changes_FULL
WHERE DateSent = '2006-09-29 00:00:00.000'
GROUP BY Surname, Forename, NameOrderIndicator, MiddleNames, DateOfBirth,
         Gender, FormerSurname, PreferredForname, Ethnicity, DateSent
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC
This shows how many duplicates of each record we have. Instead of going into each record manually and deleting, is there any way we can do something like
DELETE FROM dbo.Import_Changes_FULL HAVING COUNT(*) > 1
leaving one unique record? I don't think this is possible, but I thought I would check first.
October 4, 2006 at 8:55 am
If all the columns are duplicated, then you can do this:
SELECT Surname, Forename, NameOrderIndicator, MiddleNames, DateOfBirth,
       Gender, FormerSurname, PreferredForname,
       Ethnicity, DateSent, COUNT(*) AS Cts
INTO #tmp
FROM dbo.Import_Changes_FULL
WHERE DateSent = '2006-09-29 00:00:00.000'
GROUP BY Surname, Forename, NameOrderIndicator, MiddleNames, DateOfBirth,
         Gender, FormerSurname, PreferredForname, Ethnicity, DateSent
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC
-- Remove every row that belongs to a duplicated group
DELETE FROM dbo.Import_Changes_FULL WHERE EXISTS
    (SELECT 1 FROM #tmp
     WHERE LTRIM(RTRIM(ISNULL(Import_Changes_FULL.Surname, ''))) = LTRIM(RTRIM(ISNULL(#tmp.Surname, '')))
     AND LTRIM(RTRIM(ISNULL(Import_Changes_FULL.Forename, ''))) = LTRIM(RTRIM(ISNULL(#tmp.Forename, '')))
     AND LTRIM(RTRIM(ISNULL(Import_Changes_FULL.NameOrderIndicator, ''))) = LTRIM(RTRIM(ISNULL(#tmp.NameOrderIndicator, '')))
     AND LTRIM(RTRIM(ISNULL(Import_Changes_FULL.MiddleNames, ''))) = LTRIM(RTRIM(ISNULL(#tmp.MiddleNames, '')))
     AND LTRIM(RTRIM(ISNULL(Import_Changes_FULL.DateOfBirth, '1/1/1900'))) = LTRIM(RTRIM(ISNULL(#tmp.DateOfBirth, '1/1/1900')))
     AND LTRIM(RTRIM(ISNULL(Import_Changes_FULL.Gender, ''))) = LTRIM(RTRIM(ISNULL(#tmp.Gender, '')))
     AND LTRIM(RTRIM(ISNULL(Import_Changes_FULL.FormerSurname, ''))) = LTRIM(RTRIM(ISNULL(#tmp.FormerSurname, '')))
     ....
     AND LTRIM(RTRIM(ISNULL(Import_Changes_FULL.DateSent, '1/1/1900'))) = LTRIM(RTRIM(ISNULL(#tmp.DateSent, '1/1/1900')))
    )
 
-- Then put exactly one copy of each group back (insert the data columns only, not Cts)
INSERT INTO Import_Changes_FULL (.....)
SELECT .... FROM #tmp
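Filled out with the columns from the GROUP BY above, the whole thing would look roughly like this (just a sketch - it assumes those are the only columns on the table, keeps ISNULL handling for the nullable columns, and drops #tmp at the end):

SELECT Surname, Forename, NameOrderIndicator, MiddleNames, DateOfBirth,
       Gender, FormerSurname, PreferredForname, Ethnicity, DateSent,
       COUNT(*) AS Cts
INTO #tmp
FROM dbo.Import_Changes_FULL
WHERE DateSent = '2006-09-29 00:00:00.000'
GROUP BY Surname, Forename, NameOrderIndicator, MiddleNames, DateOfBirth,
         Gender, FormerSurname, PreferredForname, Ethnicity, DateSent
HAVING COUNT(*) > 1

-- Remove every copy of each duplicated group from the main table
DELETE i
FROM dbo.Import_Changes_FULL AS i
WHERE EXISTS
    (SELECT 1 FROM #tmp AS t
     WHERE ISNULL(i.Surname, '')             = ISNULL(t.Surname, '')
     AND   ISNULL(i.Forename, '')            = ISNULL(t.Forename, '')
     AND   ISNULL(i.NameOrderIndicator, '')  = ISNULL(t.NameOrderIndicator, '')
     AND   ISNULL(i.MiddleNames, '')         = ISNULL(t.MiddleNames, '')
     AND   ISNULL(i.DateOfBirth, '19000101') = ISNULL(t.DateOfBirth, '19000101')
     AND   ISNULL(i.Gender, '')              = ISNULL(t.Gender, '')
     AND   ISNULL(i.FormerSurname, '')       = ISNULL(t.FormerSurname, '')
     AND   ISNULL(i.PreferredForname, '')    = ISNULL(t.PreferredForname, '')
     AND   ISNULL(i.Ethnicity, '')           = ISNULL(t.Ethnicity, '')
     AND   i.DateSent = t.DateSent)   -- DateSent was filtered above, so it is never NULL here

-- Put exactly one copy of each group back
INSERT INTO dbo.Import_Changes_FULL
       (Surname, Forename, NameOrderIndicator, MiddleNames, DateOfBirth,
        Gender, FormerSurname, PreferredForname, Ethnicity, DateSent)
SELECT Surname, Forename, NameOrderIndicator, MiddleNames, DateOfBirth,
       Gender, FormerSurname, PreferredForname, Ethnicity, DateSent
FROM #tmp

DROP TABLE #tmp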
If there is a primary key on the "source" table, then your process can be modified to "Import New" and "Update Existing".
Thanks
Sreejith
October 4, 2006 at 9:58 am
Rather than inserting all the records, including the duplicates, straight into the main table, might I suggest inserting them into a staging table first, then using this process:
1. Add a unique index to the "real" table so that dupes are not allowed in
2. Use this pattern to add the new, incoming data from the staging table to the real one:
Update realTable ... where exists ( select rows that match the staging table )
Insert into realTable select distinct ... where not exists ( select rows that match the staging table )
This will update existing records and insert new ones, with no opportunity for duplicates to be created in the first place.
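In T-SQL the pattern might look something like this (a sketch only - dbo.Staging_Changes is a made-up name for the staging table, and I'm assuming Surname + Forename + DateOfBirth + DateSent is the business key; substitute whatever uniquely identifies a record for you):

-- 1. One-off: a unique index on the real table so dupes physically cannot get in
CREATE UNIQUE INDEX UX_Import_Changes_FULL_BusinessKey
    ON dbo.Import_Changes_FULL (Surname, Forename, DateOfBirth, DateSent)

-- 2. Update the rows that already exist in the real table
UPDATE r
SET    r.MiddleNames = s.MiddleNames,
       r.Gender      = s.Gender,
       r.Ethnicity   = s.Ethnicity      -- ...plus any other columns you want refreshed
FROM   dbo.Import_Changes_FULL AS r
JOIN   dbo.Staging_Changes     AS s
  ON   r.Surname     = s.Surname
 AND   r.Forename    = s.Forename
 AND   r.DateOfBirth = s.DateOfBirth
 AND   r.DateSent    = s.DateSent

-- 3. Insert only the genuinely new rows; DISTINCT collapses any duplicates
--    that arrive inside the staging data itself
INSERT INTO dbo.Import_Changes_FULL
       (Surname, Forename, NameOrderIndicator, MiddleNames, DateOfBirth,
        Gender, FormerSurname, PreferredForname, Ethnicity, DateSent)
SELECT DISTINCT
       Surname, Forename, NameOrderIndicator, MiddleNames, DateOfBirth,
       Gender, FormerSurname, PreferredForname, Ethnicity, DateSent
FROM   dbo.Staging_Changes AS s
WHERE  NOT EXISTS (SELECT 1
                   FROM dbo.Import_Changes_FULL AS r
                   WHERE r.Surname     = s.Surname
                   AND   r.Forename    = s.Forename
                   AND   r.DateOfBirth = s.DateOfBirth
                   AND   r.DateSent    = s.DateSent)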
October 5, 2006 at 6:03 am
Brilliant. It's worked. Thanks loads for that. I honestly didn't think it could be done.
October 5, 2006 at 10:12 am
Which response worked? I have a similar situation, that I would like to test this on.
October 6, 2006 at 2:14 am
I went for the first one. I used the SELECT statement to select all the duplicated records into a temporary table, and then I used the DELETE FROM statement.
What I found was that after I ran the SELECT INTO, I did a SUM on the COUNT(*) column, which gave me the total number of duplicate rows. Then, when I ran the DELETE statement, it said that exactly that number of records was affected.
I grabbed a few names to check, and after I ran the delete statement there was only one record in the file.
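In case it helps anyone else, the checks I ran were along these lines (a sketch, using the #tmp table from the first reply):

-- Total rows sitting in duplicate groups; this matched the "rows affected"
-- count that the DELETE reported
SELECT SUM(Cts) AS TotalDuplicateRows
FROM #tmp

-- After the delete and re-insert, the original duplicate check should come back empty
SELECT Surname, Forename, NameOrderIndicator, MiddleNames, DateOfBirth,
       Gender, FormerSurname, PreferredForname, Ethnicity, DateSent, COUNT(*)
FROM dbo.Import_Changes_FULL
WHERE DateSent = '2006-09-29 00:00:00.000'
GROUP BY Surname, Forename, NameOrderIndicator, MiddleNames, DateOfBirth,
         Gender, FormerSurname, PreferredForname, Ethnicity, DateSent
HAVING COUNT(*) > 1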
Hope it works for you!!!
Debbie