Is there something like an UPSERT or a MERGE command in T-SQL?

  • My problem is that, we're loading a data from a staging table into a production table via cursors (Don't ask it was already developed before i got there) but anyways the problem is this cursor goes through the whole table (2 mil rows) and then checks to see if the table in prod needs to be updated if not then it inserts the record.  It goes through millions of table scans.  I have heard of Merge in oracle and Upsert in Db2 (i think) which basically looks at a record updates the record if needs to be updated else it just inserts the record.  Can this be achieved in T-SQL??? PLease help.  Thanks.

  • Looks like you want to think in terms of sets here.

    You can probably indentify the rows that need to be updated with a view (where they exist on bot hh sources, but at least one column has changed), and rows that need to be inserted with a view as well (using a not exists clause in the view...

     

     

  • No, there is no "UPDATE or INSERT if not exists" command in T-SQL.

  • --SET NOCOUNT ON

    --THIS PROCESS TOOK 1m47s--2m:05s(varies if you switch on traces)

    --MACHINE:LAPTOP

    --OS:WINDOWS XP INTEL(R) PENTIUM 4

    --CPU 1.60 GHZ 897 MHZ,

    --256 MB Of RAM(typically lower ram since i using other devices)

    --DROP TABLE #STAGING

    --DROP TABLE #PRODUCTION

    --TAKE NOTE IF YOU ARE DOING THIS IN A REGULAR BASES, LOOK-UP USING OLAP

    --QUERY IT'S REALLY ZOOOOMING....

    use northwind

    go

    --benchmarking performance

    DECLARE @COUNT INT

    DECLARE @MILLION INT

    DECLARE @RECORDS INT

    CREATE TABLE #STAGING

    (

      ProductID varchar(100) NULL,

      Qty      INT NULL

    )

    CREATE TABLE #PRODUCTION

    (

      ProductID varchar(100) NULL,

      Qty      INT NULL

    )

    SET @COUNT = 1

    SET @MILLION = 1000 --

    SET @RECORDS = 0

    --lets just fill #STAGING with nonsense data

    --a good around 1 million records plus or more

    --this select statement from northwind should return around 2155 record

    --we loop this around 1000 to get us around 2,155,000 million data

    PRINT 'START TIME: ' + CONVERT(VARCHAR(30), GETDATE(), 109)

    SET NOCOUNT ON

    WHILE @COUNT <= @MILLION

    BEGIN

     INSERT INTO #STAGING

     SELECT @COUNT + CAST(ORDERID AS VARCHAR(10)) + CAST(PRODUCTID AS VARCHAR(10)) ProductID,

       QUANTITY  

     FROM [ORDER DETAILS]

     SET @RECORDS = @RECORDS + @@ROWCOUNT

    SET @COUNT = @COUNT + 1

    END

    SET NOCOUNT OFF

    PRINT 'END TIME: ' + CONVERT(VARCHAR(30), GETDATE(), 109)

    PRINT 'TOTAL RECORDS INSERTED: ' + CONVERT(VARCHAR(30), @RECORDS)

    PRINT ' '

     

    --BTW:I ASSUMED THAT YOU ARE NOT PERFORMING

    --DATA CHECKING/VALIDATION ANYMORE AND JUST WANT

    --TO PERFORM A INSERT INTO THE PRODUCTION TABLE

    PRINT 'START TIME(INSERT -->PRODUCTION): ' + CONVERT(VARCHAR(30), GETDATE(), 109)

    INSERT INTO #PRODUCTION

    SELECT s.PRODUCTID, NULL

    FROM #STAGING s

    WHERE s.PRODUCTID NOT IN (SELECT p.PRODUCTID FROM #PRODUCTION p)

    PRINT 'END TIME(INSERT -->PRODUCTION): ' + CONVERT(VARCHAR(30), GETDATE(), 109)

    PRINT ' '

    --THIS PART ASSUMES YOU WANT TO UPDATE #PRODUCTION DATA

    --WITH THE #STAGING DATA. THIS HAS BEEN SIMPLIFIED

    --JUST TO SHOW THIS T-SQL

    PRINT 'START TIME(UPDATE -->PRODUCTION): ' + CONVERT(VARCHAR(30), GETDATE(), 109)

    --do updates in groups and not in bulk

    --this is more faster

    UPDATE #PRODUCTION

    SET #PRODUCTION.Qty = s.QTY

    FROM #STAGING s, #PRODUCTION p

    WHERE s.PRODUCTID = p.PRODUCTID AND s.PRODUCTID like '10309%'

    PRINT 'END TIME(UPDATE -->PRODUCTION): ' + CONVERT(VARCHAR(30), GETDATE(), 109)

    PRINT ' '

    --

    --************************************************************

    --TEST BY PERFORMING INSERT AGAIN

    --THERE SHOULD BE NO INSERT MADE SINCE RECORDS ALREADY EXISTS

    --IN #PRODUCTION

    --REMOVE THE /* */ TO TEST FOR CONDITIONS

    --*************************************************************

    /*

    --TEST CONDITION Q.A 1

    PRINT 'START TIMENO RECORDS INSERTED)' + CONVERT(VARCHAR(30), GETDATE(), 109)

    INSERT INTO #PRODUCTION

    SELECT s.PRODUCTID, NULL

    FROM #STAGING s

    WHERE s.PRODUCTID NOT IN (SELECT p.PRODUCTID FROM #PRODUCTION p)

    PRINT 'END TIMENO RECORDS INSERTED)' + CONVERT(VARCHAR(30), GETDATE(), 109)

    PRINT ' '

    */

    /*

    --THIS WILL PERFORM NEW INSERTS INTO STAGING AND ONLY

    --THESE NEW INSERTS WILL BE INSERTED INTO THE PRODUCTION

    --TEST CONDITION Q.A 2

    PRINT 'START TIME: ' + CONVERT(VARCHAR(30), GETDATE(), 109)

    SET NOCOUNT ON

    WHILE @MILLION > (@MILLION + 5)

    BEGIN

     INSERT INTO #STAGING

     SELECT @COUNT + CAST(ORDERID AS VARCHAR(10)) + CAST(PRODUCTID AS VARCHAR(10)) ProductID,

       QUANTITY  

     FROM [ORDER DETAILS]

     SET @RECORDS = @RECORDS + @@ROWCOUNT

    SET @COUNT = @COUNT + 1

    END

    SET NOCOUNT OFF

    PRINT 'END TIME: ' + CONVERT(VARCHAR(30), GETDATE(), 109)

    PRINT 'TOTAL RECORDS INSERTED: ' + CONVERT(VARCHAR(30), @RECORDS)

    PRINT 'START TIMEINSERT NEW RECORDS) ' + CONVERT(VARCHAR(30), GETDATE(), 109)

    INSERT INTO #PRODUCTION

    SELECT s.PRODUCTID, NULL

    FROM #STAGING s

    WHERE s.PRODUCTID NOT IN (SELECT p.PRODUCTID FROM #PRODUCTION p)

    PRINT 'END TIMEINSERT NEW RECORDS) ' + CONVERT(VARCHAR(30), GETDATE(), 109)

    PRINT ' '

    */

    --**************************************************************

    --SET NOCOUNT OFF

    SELECT COUNT(*) FROM #STAGING where productid like '10309%'

    SELECT top 10 * FROM #PRODUCTION where productid like '10309%'

    --btw i give no warranty against performance. just play around with it

    this query ran against my machine about 1m.47s. pretty bad i know. i should be 30 seconds or less on a production machine, with loads of RAM

  • There is nothing like the UPSERT or MERGE command in SQL Server 2000 T-SQL.  You might like to know that it is in SQL Server 2005.

  • " You might like to know that it is in SQL Server 2005. "

    Do you have an example of the syntax?

  • Why not create your own UPSERT T-SQL stored procedure for your data?  It's simply the combination of an update and insert in a single stored procedure.

    CREATE PROCEDURE spUpsertWuc

    AS

     INSERT INTO tblWUC (WUC)

     SELECT DISTINCT new.WUC

     FROM tblWUC_New new

     LEFT OUTER JOIN tblWUC wuc

     ON new.WUC=wuc.WUC

     WHERE wuc.WUC IS NULL

     UPDATE wuc SET

      wuc.WUC = new.WUC

     FROM tblWUC wuc

     INNER JOIN tblWUC_New new

     ON wuc.ID=new.ID

     

     

  • I was just hoping to do a single check against the db -- inserting/updating thousands of rows at a time.

  • BUT, if you don't use a cursor and use the INSERT and UPDATE commands it will process hundreds of times faster.  The T-SQL shown doesn't and processes extremely fast.

  • Right approach.

    You only need to change the order of statements:

    UPDATE first, INSERT second

    Otherwise you update just inserted rows.

    _____________
    Code for TallyGenerator

  • thank you

  • This is specfic for 2005:

    It is IMPORTANT to perform the update first:

    -- UPDATE Destination FROM (Source INTERSECT Destination)

    -- INSERT INTO Destination FROM (Source EXCEPT Destination)

     

    -- INTERSECT (select the key value)

    SELECT

     Key(s) FROM ETL.dbo.MyCurr

    INTERSECT

    SELECT

    Key(s) FROM ETL.dbo.MyPrevious

     

     

     

    -- EXCEPT

    SELECT

    Key(s) FROM ETL.dbo.MyCurr

    EXCEPT

    SELECT

    Key(s) FROM ETL.dbo.MyPrevious

     

    HTH's

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply