Merge statement using SCD2 failing

  • Hi,

    I have created mergse statement using SCD2.

    where I am inserting the data if my BBxkey is not matching with target and updating the rows if the bbxkey is matching and rowchecksum is different.

    Working of Store procedure

    There are 2 scenario covered in this procedure on the basis of that ETL happening.

    There are 2 columns deriving from source table at run time, one is BBxkey which is nothing but a combination of one or more column or part of column and another column is a Rowchecksum column which is nothing but a Hashvalue of all the column of the tables for a row.

    Merge case 1:-WHEN NOT MATCH THEN INSERT

    If source BBxkey is not there in Archive table that means if BBxKey is null then those records are new and it will directly inserted into Archive table.

    Merge case 2:-WHEN MATCH THEN UPDATE

    If Source.BBxkey=Target.BBxkey && Source.Rowchecksum<>Target.Rowchecksum then this means source records are available in Archive table but data has been changed, in this case it will update the old record with latestversion 0 and insert the new record with latestversion 1.

    my sp failing when source having more than 1 same bbxkey.

    error

    [Execute SQL Task] Error: Executing the query "EXEC dbo.ETL_STAGE_ARCHIVE ?" failed with the following error: "The MERGE statement attempted to UPDATE or DELETE the same row more than once. This happens when a target row matches more than one source row. A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times. Refine the ON clause to ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows.".

    Sample store procedure

    DECLARE @Merge_Out TABLE (Action_Taken varchar(8),

    TimeIn datetime,

    BBXKey varchar(100),

    RowChecksum nvarchar(4000),Col001 varchar(8000),Col002 varchar(8000),

    Col003 varchar(8000),Col004 varchar(8000),Col005 varchar(8000),

    Col006 varchar(8000),Col007 varchar(8000),Col008 varchar(8000),

    Col009 varchar(8000),Col010 varchar(8000), LatestVersion bit)

    MERGE [Archive].[DBO].[ArchiveBBXFBFLG] T

    USING (

    SELECT Col001,Col002,Col003,Col004,Col005,Col006,Col007,Col008,Col009,Col010,

    CAST(N'' AS XML).value('xs:base64Binary(xs:hexBinary(sql:column(''md5hash'')))', 'VARCHAR(MAX)') RowChecksum,

    SUBSTRING(Col001,1,27) bbxkey

    FROM (

    SELECT Col001,Col002,Col003,Col004,Col005,Col006,Col007,Col008,Col009,Col010,

    HASHBYTES('MD5',CAST(CONCAT(Col001,Col002,Col003,Col004,Col005,Col006,Col007,Col008,Col009,Col010,'') as nvarchar(max))) md5hash

    FROM [DataStaging].[DBO].[ImportBBxFBFLG] ) as tab ) S

    ON (S.BBxKey = T.BBxKey AND (T.LatestVersion = 1 OR T.LatestVersion IS NULL))

    [highlight=#ffff11]WHEN NOT MATCHED BY TARGET THEN [/highlight]

    INSERT (Col001,Col002,Col003,Col004,Col005,Col006,Col007,Col008,Col009,Col010,TimeIn ,BBXKey,RowChecksum,LatestVersion)

    VALUES (Col001,Col002,Col003,Col004,Col005,Col006,Col007,Col008,Col009,Col010,getdate(),BBXKey,RowChecksum,1)

    [highlight=#ffff11]WHEN MATCHED AND T.latestversion = 1 AND s.RowChecksum <> T.RowChecksum[/highlight] THEN

    UPDATE SET T.LatestVersion = 0

    OUTPUT $ACTION Action_Taken, GETDATE() TimeIn,S.BBXKey,S.RowChecksum,S.Col001,S.Col002,S.Col003,S.Col004,S.Col005,S.Col006,S.Col007,S.Col008,S.Col009,S.Col010,1 AS LatestVersion

    INTO @MERGE_OUT ;INSERT INTO [Archive].[DBO].ArchiveBBXFBFLG(TimeIn ,BBXKey,RowChecksum,LatestVersion,Col001,Col002,Col003,Col004,Col005,Col006,Col007,Col008,Col009,Col010)

    SELECT TimeIn ,BBXKey,RowChecksum,LatestVersion,Col001,Col002,Col003,Col004,Col005,Col006,Col007,Col008,Col009,Col010 from @merge_out

    where Action_Taken = 'UPDATE';

    How Can I avoid such failure of my sp.

    I WANT TO HANDLE THOSE CASE WHERE S.bbxkey=T.bbxkey && s.rowchecksum=t.rowchecksum

    please suggest me .

    Regards,

    Vipin jha

  • It seems that you have a problem with your source data set, so start the Investigation there as the Merge key should be unique.

    The script below should show you the rows that you have with duplicated keys (you might need to jig it about a bit).

    WITH CTE

    AS

    (

    SELECT Col001,Col002,Col003,Col004,Col005,Col006,Col007,Col008,Col009,Col010,

    CAST(N'' AS XML).value('xs:base64Binary(xs:hexBinary(sql:column(''md5hash'')))', 'VARCHAR(MAX)') RowChecksum,

    SUBSTRING(Col001,1,27) bbxkey

    )

    ,CTE_Dupes

    AS

    (

    SELECT

    bbxKey

    ,Count(*)

    From

    CTE

    Group by bbxkey

    Having count(*)>1

    )

    SELECT

    CTE.*

    FROM

    CTE

    JOIN CTE_Dupes

    ON CTE_Dupes.bbxKey=CTE.BBxKey

    ORDER BY CTE.BbxKey

    However, my main concern is that you are using SUBSTRING on COL001 to create the 'Natural key' and from a DW perspective that rings a lot of alarm bells for me in regards to its suitability as a Natural Key.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • I agree with Jason. This sounds like a data quality problem, not a MERGE problem. The error is quite explicit - you can't (shouldn't) have duplicate keys. Pretty simple actually.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply