Where does the duplication come from?

Question

Post reply

Where does the duplication come from?

sqllearner-339367

Hall of Fame

Points: 3004
More actions
November 9, 2009 at 5:21 am

#213755

Hello Everyone,
I have a huge table for which im doing some transformation. To perform transformation, I consolidate some of the fields of source table and populate rest of the field as is (with just data type convertion) on the temporary table.
structure of temporary table
CREATE TABLE #tempworkingaddress
(
ugly_ID varchar(max) , -- consolidated field ,
email_address varchar(60),
first_line varchar(60),
second_line varchar(60),
third_line varchar(60),
fourth_line varchar(60),
city varchar(20),
zipcode varchar(10),
telephone varchar(15),
country varchar(25),
state varchar(25),
user_ID int,
[name] varchar(255)
)
SELECT UPPER(replace(LTRIM(RTRIM( ISNULL(address, ' ')+ ISNULL(address2, ' ') + ISNULL(town,' ') + ISNULL(region,' ') + ISNULL(postcode,' ') + ISNULL(country,' '))),' ','')) as ugly_ID,CONVERT(varchar(60),email),
CONVERT(varchar(60),address),CONVERT(varchar(60),address2),'','',CONVERT(varchar(20),town),CONVERT(VARCHAR(10),postcode),CONVERT (varchar(15),REPLACE(LTRIM(RTRIM(Phone)),' ','')),CONVERT(VARCHAR(25),LTRIM(RTRIM(country))),CONVERT(varchar(25),region)
,userID,CONVERT(VARCHAR(255),[Name]) FROM user
Above code snippet shows that im populating the temporary table with all conversion and UGLY_ID fiels is consolidation of some of the fields.
Now Im using the data populated on the temporary table to do actual table population.
INSERT INTO address(ugly_ID,email_address,
first_line,second_line,third_line,fourth_line,
city,zipcode,telephone,country,state,user_ID,[Name],address_type_id)
SELECT distinct ugly_ID,email_address,first_line,second_line,third_line,fourth_line,
city,zipcode,telephone,country,state,user_ID,[Name],1 from #tempworkingaddress
where ugly_id <> ''and ugly_id is not null
After running these t-sql, I expect the Address table should hold distinct row based on Ugly_ID column. Unfortunately there is a duplication on my destination table based on ugly_id.
To track this , I have already listed the rows on #temporary tbale for those problematic id's . There is no duplication found. Any suggestion welcome?
ta

Viewing 4 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply

sqllearner-339367 Hall of Fame Points: 3004 More actions · Answer 1

I have double checked that temporary table doesnt have any duplicated data by running this query

select user_id, count(ugly_id) from #tempworkingaddress

group by user_id

having count(ugly_id) > 1

and everytime when address table is populated different record are duplicated.

ie

select user_id, count(ugly_id) from address

group by user_id

having count(ugly_id) &gt ; 1

produces different set of rows....

Al-279884 Old Hand Points: 383 More actions · Answer 2

This may not be the problem with your query, but it is an incorrect assumption from your part if I understood it correctly. DISTINCT applies to all the columns in the SELECT statement, not just to the Ugly_ID column. Hence it would be possible to have two records with the same Ugly_ID if at least one of the other columns is different.

sqllearner-339367 Hall of Fame Points: 3004 More actions · Answer 3

Sorry for not making it clear at first attempt. Im aware that disticnt applies to all the columns. I have checked that ugly_ID and other columns are equal for one or more rows that giving raise to duplication.

ta