Merge statement not working as expected for bulk load.

  • Hi,

    Am using merge sp to insert or update records  in target table.Below is the sp

    MERGE Products AS TARGET

    USING UpdatedProducts AS SOURCE

    ON (TARGET.ProductName= SOURCE.ProductName)

    --When records are matched, update the records if there is any change

    WHEN MATCHED

    THEN UPDATE SET TARGET.Rate = SOURCE.Rate

    --When no records are matched, insert the incoming records from source table to target table

    WHEN NOT MATCHED BY TARGET

    THEN INSERT (ProductID, ProductName, Rate) VALUES (SOURCE.ProductID, SOURCE.ProductName, SOURCE.Rate)

    In my source table we have two records

    Source table :

    ProductID  ProductName Rate

    101               TEA                 10.00

    102             TEA              25.00

     

    Target table: empty table

    ProductID  ProductName Rate

    try to move source records to target with merge sp. but two records are inserted.

    ProductID  ProductName Rate

    101               TEA                 10.00

    102             TEA              25.00

    It should be one record in target table. because first record will match with second record.

    Expected result.

    ProductID  ProductName Rate

    101                        TEA          25.00

     

    Can you please help,How merge statement pass records to  condition(ON (TARGET.ProductName= SOURCE.ProductName) one by one or bulk?If one by one, Why second record also inserted in target table.

  • I would guess that it's because ProductName is not unique in the table

    Try changing your match criteria from this

    (TARGET.ProductName= SOURCE.ProductName)

    to this

    (TARGET.ProductId= SOURCE.ProductId)

    • This reply was modified 4 years, 11 months ago by  Phil Parkin. Reason: Fix typo

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • You will find you get much better performance if you split the merge into two separate statements (an update followed by an insert).

    Performance Tip: The conditional behavior described for the MERGE statement works best when the two tables have a complex mixture of matching characteristics. For example, inserting a row if it doesn't exist, or updating a row if it matches. When simply updating one table based on the rows of another table, improve the performance and scalability with basic INSERT, UPDATE, and DELETE statements. For example:

    INSERT tbl_A (col, col2)  
    SELECT col, col2
    FROM tbl_B
    WHERE NOT EXISTS (SELECT col FROM tbl_A A2 WHERE A2.col = tbl_B.col);

    https://docs.microsoft.com/en-us/sql/t-sql/statements/merge-transact-sql?view=sql-server-ver15

  • I tried with (TARGET.ProductId= SOURCE.ProductId) . But same behavior two records are inserted. if we re-execute the sp. that time it will updating. Please hep on this if any other way to pass one by one records.

  • Is it really necessary to use a merge statement?  Why not something like this?

    DROP TABLE IF EXISTS Test_Products;
    DROP TABLE IF EXISTS Test_UpdatedProducts;
    GO
    CREATE TABLE Test_Products(ProductIDint, ProductName varchar(10), Rate int);
    CREATE TABLE Test_UpdatedProducts(ProductIDint, ProductName varchar(10), Rate int);
    GO

    INSERT Test_UpdatedProducts VALUES
    (101, 'TEA', 10.00),
    (102, 'TEA', 25.00);
    GO

    WITH UP_CTE AS(
    SELECT
    ProductID, ProductName, Rate,
    ROW_NUMBER() OVER (PARTITION BY ProductName ORDER BY ProductID DESC) AS RowNum
    FROM Test_UpdatedProducts UP)
    INSERT Test_Products(ProductID, ProductName, Rate)
    SELECT
    ProductID, ProductName, Rate
    FROM UP_CTE
    WHERE
    RowNum=1
    EXCEPT
    SELECT * FROM Test_Products;

    SELECT * FROM Test_Products;

     

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • it is working as expected - you are matching on product name - "TEA" - no record exists so both are inserted.

    Note that the error would happen even with a straight update as source would have 2 rows to update a single destination row so regardless of the method used you need to change how you process the source.

    sample to reproduce the error with merge

    if object_id('tempdb..#Products ') is not null
    drop table #Products;

    create table #Products
    ( ProductID int
    , productname varchar(10)
    , rate decimal(10, 2)
    )
    ;

    if object_id('tempdb..#UpdatedProducts ') is not null
    drop table #UpdatedProducts;

    create table #UpdatedProducts
    ( ProductID int
    , productname varchar(10)
    , rate decimal(10, 2)
    )
    ;

    insert into #UpdatedProducts
    select 101
    , 'tea'
    , 10;

    insert into #UpdatedProducts
    select 102
    , 'tea'
    , 25;

    merge #Products as target using #UpdatedProducts as source
    on (target.productname = source.productname)
    --When records are matched, update the records if there is any change
    when matched
    then update
    set target.rate = source.rate
    --When no records are matched, insert the incoming records from source table to target table
    when not matched by target
    then insert
    (ProductID
    , productname
    , rate
    )
    values (source.ProductID, source.productname, source.rate
    );

    /*
    do same merge again to prove that doing a merge like this, where there are 2 records on the source and we try and update the "same" record on the target
    */

    merge #Products as target using #UpdatedProducts as source
    on (target.productname = source.productname)
    --When records are matched, update the records if there is any change
    when matched
    then update
    set target.rate = source.rate

    --When no records are matched, insert the incoming records from source table to target table
    when not matched by target
    then insert
    (ProductID
    , productname
    , rate
    )
    values (source.ProductID, source.productname, source.rate
    );
    go
    /*
    Above fail with error as expected as it is trying to update the same record twice

    Msg 8672, Level 16, State 1, Line 55
    The MERGE statement attempted to UPDATE or DELETE the same row more than once. This happens when a target row matches more than one source row. A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times. Refine the ON clause to ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows.

    */

    /*
    if only one record for a single productname is supposed to exist on the target table then we need to filter the source
    how that is filtered depends on business rules - for this example I assumed that the higher product id is to be used
    */

    truncate table #Products

    merge #Products as target using
    (
    select t.ProductID
    , t.productname
    , t.rate
    from (
    select ProductID
    , productname
    , rate
    , row_number() over (partition by productname
    order by ProductID desc
    ) rownum
    from #UpdatedProducts
    ) t
    where t.rownum = 1

    ) as source
    on (target.productname = source.productname)
    --When records are matched, update the records if there is any change
    when matched
    then update
    set target.rate = source.rate

    --When no records are matched, insert the incoming records from source table to target table
    when not matched by target
    then insert
    (ProductID
    , productname
    , rate
    )
    values (source.ProductID, source.productname, source.rate
    );
    select *
    from #Products

    -- insert new record so we know update worked

    insert into #UpdatedProducts
    select 103
    , 'tea'
    , 225;

    merge #Products as target using
    (
    select t.ProductID
    , t.productname
    , t.rate
    from (
    select ProductID
    , productname
    , rate
    , row_number() over (partition by productname
    order by ProductID desc
    ) rownum
    from #UpdatedProducts
    ) t
    where t.rownum = 1

    ) as source
    on (target.productname = source.productname)
    --When records are matched, update the records if there is any change
    when matched
    then update
    set target.rate = source.rate

    --When no records are matched, insert the incoming records from source table to target table
    when not matched by target
    then insert
    (ProductID
    , productname
    , rate
    )
    values (source.ProductID, source.productname, source.rate
    );
    select *
    from #Products
  • For your example you would be much better off, in terms of performance, by having an update followed by an insert:

    UPDATE p 
    SET p.rate = up.rate
    FROM #Products p
    INNER JOIN #UpdatedProducts up
    ON up.productname = p.productname

    INSERT INTO #Products(ProductID, productname, rate)
    SELECT up.ProductID,
    up.productname,
    up.rate
    FROM #UpdatedProducts up
    WHERE NOT EXISTS(SELECT * FROM #Products p WHERE p.productname = up.productname)
  • Jonathan AC Roberts wrote:

    For your example you would be much better off, in terms of performance, by having an update followed by an insert:

    UPDATE p 
    SET p.rate = up.rate
    FROM #Products p
    INNER JOIN #UpdatedProducts up
    ON up.productname = p.productname

    INSERT INTO #Products(ProductID, productname, rate)
    SELECT up.ProductID,
    up.productname,
    up.rate
    FROM #UpdatedProducts up
    WHERE NOT EXISTS(SELECT * FROM #Products p WHERE p.productname = up.productname)

    I wholeheartedly agree with that.  I'll also suggest that since they're changing "rates" on products, they really should look into Type 6 SCDs, as well.  Temporal Tables make some pretty short work of that.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    Jonathan AC Roberts wrote:

    For your example you would be much better off, in terms of performance, by having an update followed by an insert:

    UPDATE p 
    SET p.rate = up.rate
    FROM #Products p
    INNER JOIN #UpdatedProducts up
    ON up.productname = p.productname

    INSERT INTO #Products(ProductID, productname, rate)
    SELECT up.ProductID,
    up.productname,
    up.rate
    FROM #UpdatedProducts up
    WHERE NOT EXISTS(SELECT * FROM #Products p WHERE p.productname = up.productname)

    I wholeheartedly agree with that.  I'll also suggest that since they're changing "rates" on products, they really should look into Type 6 SCDs, as well.  Temporal Tables make some pretty short work of that.

     

    update/insert does not error - but insert still adds the same 2 rows - and the update will set them both to the same value - not necessarily the correct one so the base logic error still remains.

  • frederico_fonseca wrote:

    Jeff Moden wrote:

    Jonathan AC Roberts wrote:

    For your example you would be much better off, in terms of performance, by having an update followed by an insert:

    UPDATE p 
    SET p.rate = up.rate
    FROM #Products p
    INNER JOIN #UpdatedProducts up
    ON up.productname = p.productname

    INSERT INTO #Products(ProductID, productname, rate)
    SELECT up.ProductID,
    up.productname,
    up.rate
    FROM #UpdatedProducts up
    WHERE NOT EXISTS(SELECT * FROM #Products p WHERE p.productname = up.productname)

    I wholeheartedly agree with that.  I'll also suggest that since they're changing "rates" on products, they really should look into Type 6 SCDs, as well.  Temporal Tables make some pretty short work of that.

    update/insert does not error - but insert still adds the same 2 rows - and the update will set them both to the same value - not necessarily the correct one so the base logic error still remains.

    I don't think the OP has explained very precisely what the problem is. Why do they only want ProductId 101? Why does this take presidency over ProductId 102? Why does the OP want 20.0 not 10.0 in the rate?

  • Jonathan AC Roberts wrote:

    I don't think the OP has explained very precisely what the problem is. Why do they only want ProductId 101? Why does this take presidency over ProductId 102? Why does the OP want 20.0 not 10.0 in the rate?

    Ah yes the OP wanted the ProductId 101 with the 102 Rate.  My code was not at all correct.  Clarification would help.  It all seems backwards really, why is there an update table and why is the Products table not updated directly?

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply