Where does the duplication come from?

  • Hello Everyone,

    I have a huge table for which im doing some transformation. To perform transformation, I consolidate some of the fields of source table and populate rest of the field as is (with just data type convertion) on the temporary table.

    structure of temporary table

    CREATE TABLE #tempworkingaddress

    (

    ugly_ID varchar(max) , -- consolidated field ,

    email_address varchar(60),

    first_line varchar(60),

    second_line varchar(60),

    third_line varchar(60),

    fourth_line varchar(60),

    city varchar(20),

    zipcode varchar(10),

    telephone varchar(15),

    country varchar(25),

    state varchar(25),

    user_ID int,

    [name] varchar(255)

    )

    SELECT UPPER(replace(LTRIM(RTRIM( ISNULL(address, ' ')+ ISNULL(address2, ' ') + ISNULL(town,' ') + ISNULL(region,' ') + ISNULL(postcode,' ') + ISNULL(country,' '))),' ','')) as ugly_ID,CONVERT(varchar(60),email),

    CONVERT(varchar(60),address),CONVERT(varchar(60),address2),'','',CONVERT(varchar(20),town),CONVERT(VARCHAR(10),postcode),CONVERT (varchar(15),REPLACE(LTRIM(RTRIM(Phone)),' ','')),CONVERT(VARCHAR(25),LTRIM(RTRIM(country))),CONVERT(varchar(25),region)

    ,userID,CONVERT(VARCHAR(255),[Name]) FROM user

    Above code snippet shows that im populating the temporary table with all conversion and UGLY_ID fiels is consolidation of some of the fields.

    Now Im using the data populated on the temporary table to do actual table population.

    INSERT INTO address(ugly_ID,email_address,

    first_line,second_line,third_line,fourth_line,

    city,zipcode,telephone,country,state,user_ID,[Name],address_type_id)

    SELECT distinct ugly_ID,email_address,first_line,second_line,third_line,fourth_line,

    city,zipcode,telephone,country,state,user_ID,[Name],1 from #tempworkingaddress

    where ugly_id <> ''and ugly_id is not null

    After running these t-sql, I expect the Address table should hold distinct row based on Ugly_ID column. Unfortunately there is a duplication on my destination table based on ugly_id.

    To track this , I have already listed the rows on #temporary tbale for those problematic id's . There is no duplication found. Any suggestion welcome?

    ta

  • I have double checked that temporary table doesnt have any duplicated data by running this query

    select user_id, count(ugly_id) from #tempworkingaddress

    group by user_id

    having count(ugly_id) > 1

    and everytime when address table is populated different record are duplicated.

    ie

    select user_id, count(ugly_id) from address

    group by user_id

    having count(ugly_id) &gt ; 1

    produces different set of rows....

  • This may not be the problem with your query, but it is an incorrect assumption from your part if I understood it correctly. DISTINCT applies to all the columns in the SELECT statement, not just to the Ugly_ID column. Hence it would be possible to have two records with the same Ugly_ID if at least one of the other columns is different.

  • Sorry for not making it clear at first attempt. Im aware that disticnt applies to all the columns. I have checked that ugly_ID and other columns are equal for one or more rows that giving raise to duplication.

    ta

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply