April 13, 2020 at 9:33 pm
Greetings All;
Per MS KB273586 (which is no longer available), it used to be the case that one could execute an "INSERT INTO... SELECT... ORDER BY" into a table with an IDENTITY column, and the IDENTITY column would reflect the ordering as specified in the SELECT portion. Repro:
USE msdb;
DROP TABLE IF EXISTS #MyColumns;
CREATE TABLE #MyColumns(
GeneratedSortKey int IDENTITY(1,1) NOT NULL,
ColumnName sysname
);
INSERT INTO #MyColumns(ColumnName)
SELECT c.name
FROM sys.columns c
WHERE OBJECT_NAME(c.object_id) = 'sysjobs'
ORDER BY c.column_id -- note: not represented in the destination table
;
-- Should be in original "column_id" order...
SELECT *
FROM #MyColumns
ORDER BY GeneratedSortKey;
-- Verification of ordering...
SELECT
mc.*,
c.column_id,
c.name,
[Diff] =
CASE WHEN mc.GeneratedSortKey != c.column_id
THEN '*****'
ELSE '-'
END
FROM #MyColumns mc
INNER JOIN sys.columns c
ON OBJECT_NAME(c.object_id) = 'sysjobs'
AND mc.ColumnName = c.name
ORDER BY mc.GeneratedSortKey
;
DROP TABLE IF EXISTS #MyColumns;
This technique was even described in some of Itzik's books from the 2010 era.
Now that the KB article is no longer available, can we assume that this behavior is now undefined?
Thanks.
April 14, 2020 at 2:45 pm
Quick google brought me to this page:
https://docs.microsoft.com/en-us/archive/blogs/sqltips/ordering-guarantees-in-sql-server
which states in point number 4:
INSERT queries that use SELECT with ORDER BY to populate rows guarantees how identity values are computed but not the order in which the rows are inserted
So the order for how the identity values are calculated are going to be in the order specified by the ORDER BY, but the order in which they are actually inserted may be different.
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
April 14, 2020 at 3:52 pm
Hmmm... an even older reference than the 2010-era stuff I found, and on an unmaintained page. Not giving me much confidence that the behavior has been preserved and is still supported.
April 14, 2020 at 6:18 pm
It is still listed in the SQL Server 2019 documentation: https://docs.microsoft.com/en-us/sql/t-sql/statements/insert-transact-sql?view=sql-server-ver15
The point is that the IDENTITY values will be computed based on the order - but how that data is actually inserted into the table cannot be guaranteed. Even if the clustering key for that table is the IDENTITY column you could still end up with a fragmented table after the insert has completed. Besides, the order the data exists in the table is meaningless...
I would say that functionality has not changed.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
April 15, 2020 at 6:15 pm
Thanks for the help, Jeffrey and Brian. This is the quote I was after:
"
INSERT queries that use SELECT with ORDER BY to populate rows guarantees how identity values are computed but not the order in which the rows are inserted.
"
If I'm going to leverage the behavior in a stored proc, I'll want to include a comment with the link to this MS document page so future developers understand the rationale and there's concrete documentation of the behavior.
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply