Maintaining Sequence of OUTPUT rows during INSERT

Question

Maintaining Sequence of OUTPUT rows during INSERT

The Dixie Flatline

SSC Guru

Points: 53354
More actions
July 6, 2016 at 3:41 pm

#327596

We are writing a stored procedure that will be doing an insert to one table followed by an update to another table. (Edited for clarification.)
The update will set the value of a target column to the newly created identity value(s) from the insert.
The key to the table to be updated is a GUID which does not exist in the table to be inserted into.
(Please do not lecture me about GUIDs being keys at this point in time. I understand and agree.)
Therefore, we have to be able to join the results of the OUTPUT back to the original table-valued parameter to get the GUID so we can subsequently do the update.
I have created the following solution using "RowIDs" to preserve the sequence of rows in the table. It is working in early testing but I can't find documentation that the OUTPUT rows will always be in the same sequence as the ORDER BY clause of the insert. This is critical or else the RowIDs of the output will not match the RowIDs of the original table-valued parameter.
Can anyone either break this, or point me to documentation which establishes this is reliable? Or, alternatively, suggest another technique?
Many thanks in advance.
- Bob
CREATE TABLE #COLORS (ColorID int identity(1,1) primary key, Color varchar(20))
insert into #colors values ('Black'),('White')
DECLARE @ColorsToAdd Table (RowID int identity(1,1) primary key, RandomGUID uniqueIdentifier, ColorToAdd varchar(20))
DECLARE @AddedColors Table (RowID int identity(1,1) primary key, AddedColorID int, AddedColor varchar(20))
insert into @ColorsToAdd(RandomGUID, ColorToAdd)
select NewID(),'Pink' union all
select NewID(),'Aqua' union all
select NewID(),'Teal' union all
select NewID(),'Gold'
select '@ColorsToAdd' as [Table],* from @ColorsToAdd
insert into #colors (color)
output inserted.ColorID, inserted.color
into @AddedColors(AddedColorID, AddedColor)
select ColorToAdd
from @colorsToAdd
order by RowID-- DOES THIS CONTRL THE SEQUENCE OF THE OUTPUT CLAUSE ???
select '@AddedColors' as [Table],* from @AddedColors
-- confirm id/color didn't change
if exists (select 1
from @ColorsToAdd ca
join @AddedColors ac on ac.RowID = ca.rowid
where AddedColor <> ColorToAdd )
begin
select 'Sequence has been lost' as [Error]
end
else
begin
select top (100) ca.*, ac.AddedColorID, ac.AddedColor
from @ColorsToAdd ca
join @AddedColors ac on ca.RowID = ac.RowID
end
-- From here, another table will be updated by matching on RandomGuid
-- and updating the target column with the value of AddedColorID
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1004409 More actions · Answer 1

This is the ol' batch Invoice/Invoice detail problem. I've tried many different renditions, including the very one that you've just tried and I'm here to tell you that it will eventually fail, like it did for me.

The only way to do this in a guaranteed fashion is to add the GUIDs as a part of the insert to the IDENTITY table. Of course, that also means adding a column to hold the GUID but, if you need to do such things in a batch fashion other than RBAR, then adding the GUID column is absolutely the only safe way to do it.

If you can't add the column, then you're going to have to resort to RBAR.

I'm not the only one that thinks this. Conor Cunningham (the T_SQL "god" from Microsoft) has made himself very clear on this subject more than once on MVP email... it is NOT guaranteed that the order from OUTPUT means anything.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 2

Found a compromise that looks to be bulletproof. The target table for the inserts is going to get huge, so we didn't want to store GUIDs in it for a one-time use, but we can probably live with adding two bytes for a smallint. (The insert batches will never be over 20,000 rows at a time, so a smallint will serve.)

Once that column is added, we just insert the original RowID, so it can now be included in the OUTPUT clause and used to join back to the original insert data.

Another potential solution would be to use a Sequence object instead of identity column in the target insert table. We rejected that because a lot of developers are using Entity Framework and our ability to enforce their use of stored procedures (or the proper sequence object) was questionable.

Thanks for the timely feedback, Jeff.

CREATE TABLE #COLORS (ColorID int IDENTITY(1,1) PRIMARY key, Color varchar(20), NefariousID SMALLINT)

-- entity framework inserts

INSERT INTO #COLORS( Color )

SELECT 'Black' UNION ALL

SELECT 'White'

DECLARE @ColorsToAdd Table (RowID smallint identity(1,1) primary key, RandomGUID uniqueIdentifier, ColorToAdd varchar(20))

DECLARE @AddedColors Table (AddedColorID int, AddedColor varchar(20), NefariousID SMALLINT PRIMARY KEY)

insert into @ColorsToAdd(RandomGUID, ColorToAdd)

select NewID(),'Pink' union all

select NewID(),'Aqua' union all

select NewID(),'Teal' union all

select NewID(),'Gold'

select '#Colors' as [Table],*

FROM #COLORS

select '@ColorsToAdd' as [Table],*

FROM @ColorsToAdd

insert into #colors (Color, NefariousID)

output inserted.ColorID, inserted.Color, inserted.NefariousID

into @AddedColors(AddedColorID, AddedColor, NefariousID)

select ColorToAdd, RowID

from @colorsToAdd

order by RowID-- DOES THIS CONTRL THE SEQUENCE OF THE OUTPUT CLAUSE ???

select '@AddedColors' as [Table],* from @AddedColors

-- confirm id/color didn't change

if exists (select 1

from @ColorsToAdd ca

join @AddedColors ac on ac.NefariousID = ca.rowid

where AddedColor <> ColorToAdd )

begin

select 'Sequence has been lost' as [Error]

end

else

begin

select top (100) ca.*, ac.AddedColorID, ac.AddedColor

from @ColorsToAdd ca

join @AddedColors ac on ca.RowID = ac.NefariousID

end

-- From here, another table will be updated by matching on RandomGuid

-- and updating the target column with the value of AddedColorID

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 3

1) you asked to not be lectured on GUIDs, so I won't do that.

2) I WILL lecture you on the design choice of doing a partial insert and then updating the same rows after the fact. Horribly bad choice because it immediately makes every row larger than it was (and also has a NULL column with it's own overhead), leading to page splits galore. I tell devs to jump through hoops to avoid doing this.

3) Can't this be done by simply selecting the just-added rows from the colors table? Note that depending on your concurrency needs you may need to be very rigorous with your isolation level/locking hints to avoid collisions.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 4

I am insulted that you would dare accuse me of such a travesty!! :w00t:

Just kidding.

We aren't updating the same table after an insert. We are inserting into table A, and then updating table B.

Sorry if my explanation of the problem wasn't clear enough. I went back and changed the original post to try to clarify it.

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 5

The Dixie Flatline (7/7/2016)
I am insulted that you would dare accuse me of such a travesty!! :w00t:
Just kidding.
We aren't updating the same table after an insert. We are inserting into table A, and then updating table B.
Sorry if my explanation of the problem wasn't clear enough. I went back and changed the original post to try to clarify it.

HAH - good one!

So then the question becomes can you hold off creating the B-table rows until you have acquired all that you need to populate them without doing a post-insert update?

And what about getting the data you need from the inserted-to table after the insert? Assuming you can't create duplicates then this would be an easy (efficient index seek??) thing to do. If you can create dupes you need some form of "get the latest of each value" predicate added into the mix.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

drew.allen SSC Guru Points: 76997 More actions · Answer 6

If you change your INSERT to a MERGE, then you can just OUTPUT the original GUID. Of course, the MERGE introduces it's own problems.

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 7

drew.allen (7/7/2016)
If you change your INSERT to a MERGE, then you can just OUTPUT the original GUID. Of course, the MERGE introduces it's own problems.
Drew

I'm not following you, Drew. How is the output clause different with a MERGE?

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 8

TheSQLGuru (7/7/2016)
So then the question becomes can you hold off creating the B-table rows until you have acquired all that you need to populate them without doing a post-insert update?
And what about getting the data you need from the inserted-to table after the insert? Assuming you can't create duplicates then this would be an easy (efficient index seek??) thing to do. If you can create dupes you need some form of "get the latest of each value" predicate added into the mix.

Let me try this again:

The procedure is being passed a table-valued parameter. In my example above, this corresponds to @ColorsToAdd.

The procedure will insert some data from this tvparm into Table A (#Colors in my example).

-- The inserts can create duplicate values for all columns except the identity/primary key column.

Subsequently, a column in Table B needs to be updated with the new identity value(s) from Table A (#colors.[ColorID]).

The join to Table B must based on the GUID value from @ColorsToAdd. We do NOT want to store that GUID in #Colors.

The problem is to get the new identity values from the #Colors inserts associated back with the GUIDs from @ColorsToAdd.

Once that is accomplished we can proceed to the update query.

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Stephanie Giovannini SSCertifiable Points: 7422 More actions · Answer 9

I've updated your most recent test script. The MERGE statement allows more flexibility in the OUTPUT clause than other DML statements. This allows the RowID to be passed from one table variable to another, while being associated with the identity value, and not having to be stored in a permanent table.

CREATE TABLE #COLORS (ColorID int IDENTITY(1,1) PRIMARY key, Color varchar(20))

-- entity framework inserts

INSERT INTO #COLORS( Color )

SELECT 'Black' UNION ALL

SELECT 'White'

DECLARE @ColorsToAdd Table (RowID smallint identity(1,1) primary key, RandomGUID uniqueIdentifier, ColorToAdd varchar(20))

DECLARE @AddedColors Table (AddedColorID int, AddedColor varchar(20), NefariousID SMALLINT PRIMARY KEY)

insert into @ColorsToAdd(RandomGUID, ColorToAdd)

select NewID(),'Pink' union all

select NewID(),'Aqua' union all

select NewID(),'Teal' union all

select NewID(),'Gold'

select '#Colors' as [Table],*

FROM #COLORS

select '@ColorsToAdd' as [Table],*

FROM @ColorsToAdd

MERGE #colors as tgt

USING (select ColorToAdd, RowID from @colorsToAdd) AS src

ON 1 = 0

WHEN NOT MATCHED THEN

INSERT (Color) VALUES (src.ColorToAdd)

output inserted.ColorID, inserted.Color, src.RowID

into @AddedColors(AddedColorID, AddedColor, NefariousID);

select '@AddedColors' as [Table],* from @AddedColors

-- confirm id/color didn't change

if exists (select 1

from @ColorsToAdd ca

join @AddedColors ac on ac.NefariousID = ca.rowid

where AddedColor <> ColorToAdd )

begin

select 'Sequence has been lost' as [Error]

end

else

begin

select top (100) ca.*, ac.AddedColorID, ac.AddedColor

from @ColorsToAdd ca

join @AddedColors ac on ca.RowID = ac.NefariousID

end

-- From here, another table will be updated by matching on RandomGuid

-- and updating the target column with the value of AddedColorID

drew.allen SSC Guru Points: 76997 More actions · Answer 10

The Dixie Flatline (7/7/2016)
drew.allen (7/7/2016)
If you change your INSERT to a MERGE, then you can just OUTPUT the original GUID. Of course, the MERGE introduces it's own problems.
Drew
I'm not following you, Drew. How is the output clause different with a MERGE?

With an INSERT, you can only reference the pseudo-tables INSERTED and DELETED, but with MERGE you can also access any of the other tables involved in the query.

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 11

Drew, that sounds like the answer to a maiden's prayer, and I'd like to remember it for future reference, but after reading your note I looked at the MSDN pages for both OUTPUT and MERGE and didn't find an example.

Could you post an example or point me to a URL online?

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 12

Found one: http://sqlblog.com/blogs/rob_farley/archive/2012/06/12/merge-gives-better-output-options.aspx

Stefanie, not sure why I missed reading your message with the sample code I wanted. But thanks to you as well.

Thanks to you all.

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

drew.allen SSC Guru Points: 76997 More actions · Answer 13

The Dixie Flatline (7/7/2016)
Drew, that sounds like the answer to a maiden's prayer, and I'd like to remember it for future reference, but after reading your note I looked at the MSDN pages for both OUTPUT and MERGE and didn't find an example.
Could you post an example or point me to a URL online?

Yes, the MSDN documentation on it isn't very clear, but you can find it on the OUTPUT Clause page in the section on Arguments.

INSERTED
Is a column prefix that specifies the value added by the insert or update operation. Columns prefixed with INSERTED reflect the value after the UPDATE, INSERT, or MERGE statement is completed but before triggers are executed.

from_table_name
Is a column prefix that specifies a table included in the FROM clause of a DELETE, UPDATE, or MERGE statement that is used to specify the rows to update or delete.

Also the examples include using from_table_name in an UPDATE (E) and DELETE (F). Since MERGE is a combination of INSERT/UPDATE/DELETE, it has characteristics of all of them, which is probably why a MERGE-specific example wasn't included.

Drew

Edited to include information on examples.

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 14

Yes, the MSDN documentation on it isn't very clear, but you can find it on the OUTPUT Clause page in the section on Arguments.

I see it in the syntax section.

from_table_name
Is a column prefix that specifies a table included in the FROM clause of a DELETE, UPDATE, or MERGE statement that is used to specify the rows to update or delete.

But not an INSERT (sigh).

Also the examples include using from_table_name in an UPDATE (E) and DELETE (F). Since MERGE is a combination of INSERT/UPDATE/DELETE, it has characteristics of all of them, which is probably why a MERGE-specific example wasn't included.

Makes sense. Using a fromtable qualifier didn't work when we originally wrote the INSERT. I'm curious what is so different about that, but we have our answer now.

Thanks again, Drew.

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills