"Inserting Using INSERT...SELECT" And Identities

  • Recently I have been writing queries using OPENXML to do inserts, updates, and deletes, as it lets me write extremely efficient queries that handle multiple row processing. (AFAIK, One of the drawbacks with updategrams is multi row processing, if incorrect someone please let me know) However, I have encountered several drawbacks to this, one of them being a t-sql limitation.

    Drawbacks / Limitations :

    A) Memory intensive

    B) Resource hungry, finite amount of document handles available for SQL server.

    C) Incapability to edit the xml from the OPENXML statement. Although OPENXML provides a rowset view, read this from BOL about the UPDATE statement :

    "rowset_function_limited - Is either the OPENQUERY or OPENROWSET function, subject to provider capabilities."

    D) Using INSERT...SELECT FROM OPENXML you can insert multiple rows, but if using an identity, t-sql does not provide a way to retrieve the many identities that were just entered. @@IDENTITY, SCOPE_IDENTITY, and IDENT_CURRENT all return the last identity entered.

    (D) is actually the one this post is about. Another developer and I have been discussing alternatives to get these Id's back, and we have come up with two possible solutions.

    A) With an insert trigger, create a temporary table named using the table name and spid, and insert the ids into it. When the INSERT...SELECT completes these ids can be pulled from the temp table, and the temp table dropped.

    Drawbacks : using temp tables, insert trigger ( slows down insert), memory intensive, dynamic sql to create the temp tables

    B) Add a uniqueidentifier column to the table. Lets call it [InsertGuid]. In the INSERT...SELECT you provide in the column list a uniqueidentifier ( value : NEWID ( ) ), that you can then return as an out variable of the stored proc, enabling the client to reselect that data back out, including the identities, or it could be selected back out for them.

    Drawbacks : Add a new column to each table where needed.

    Out of the two options we have come up with, (B) is by far the most attractive to us, but I would appreciate any alternatives / comments / suggestions. I would prefer to update the xml myself in the procedure, and return it back with the ids in it, but I am not sure that is even possible.

    Tim C.

    //Will write code for food


    Tim C //Will code for food

  • Wellnot to blow my own trumpet I did the OPENXML chapter for the book below. And I believe it is the only bokk that covers the issues of actually using OPENXML. The fact it is a ROWSET and so joining to it can be very bad. Memory is very interesting, to maximise it stop your server, stop everything else runnning on your server, I mean everything possible. Start SQL Server and the start everything else. SQL Server seems to base the maximum memory on that available at start up.

    On to your problem I cover this as well, my preferred solution is to use a table variable with an identity column.

    Populate the table then set isolation level to serializable, do a select max on you table to get the next identity.

    insert the data using SET IDENTITY_INSERT

    using the generater identity value + the MAX value.

    If you need to insert a second level related to the first, store the meta property @mp:id in the parent table and @mp:parentid in the child table. Using these you can then join the two table variables together on these columns to get the parent PK value.

    This works very well.

    Further more if you are using OPENXML you may be interested in

    http://www.sqlservercentral.com/scripts/scriptdetails.asp?scriptid=589

    Simon Sabin

    Co-author of SQL Server 2000 XML Distilled

    http://www.amazon.co.uk/exec/obidos/ASIN/1904347088


    Simon Sabin
    SQL Server MVP

    http://sqlblogcasts.com/blogs/simons

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply