November 15, 2011 at 8:16 am
Dear damirmi
My simple question is how do you identify which are the number that have already used of. If i am not wrong all the usernumber that are in table are free to use...
then you can just use MIN() function to find the least unused number
Select Min(UserNumber) From Table
It find the least number from the table if there is anything that will let you know which are the number that are in use then keep that cateriea into where clause and use it.
Thanks & Regards
Syed Sami Ur Rehman
SQL-Server (Developer)
Hyderabad
Email-sami.sqldba@gmail.com
November 16, 2011 at 4:21 am
To bring this story to a temporarily end. I tried to verify the previous shown behaviour
with up to 500.000.000 Rows in #testEnvironment. (timeconsuming business !!, had to move transaction log
because its blowing up to about 100 GB)
But to my opinion, if you come to the situation to find unused Numbers out of 10E09 used ID's
there is something basically wrong with the concept. (well formed sequence Tables might be a better approach)
I intentionally only post the IO/CPU time results. The scrips are already well known here.
(Sorry for german output, but I think you can read it)
-- Temp Table with one missing Number: Total elapsed Time about 30 min
(100.000.000 Zeile(n) betroffen)
(1 Zeile(n) betroffen)
========== BASELINE ==========
(1 Zeile(n) betroffen)
SQL Server-Ausführungszeiten:
, CPU-Zeit = 0 ms, verstrichene Zeit = 0 ms.
========== CELKO's first Solution ==========
(1 Zeile(n) betroffen)
#testEnvironment00000000000B-Tabelle. Scananzahl 6, logische Lesevorgänge 321546, physische Lesevorgänge 0, Read-Ahead-Lesevorgänge 0, logische LOB-Lesevorgänge 0, physische LOB-Lesevorgänge 0, Read-Ahead-LOB-Lesevorgänge 0.
Worktable-Tabelle. Scananzahl 0, logische Lesevorgänge 0, physische Lesevorgänge 0, Read-Ahead-Lesevorgänge 0, logische LOB-Lesevorgänge 0, physische LOB-Lesevorgänge 0, Read-Ahead-LOB-Lesevorgänge 0.
SQL Server-Ausführungszeiten:
, CPU-Zeit = 192848 ms, verstrichene Zeit = 444526 ms.
========== TALLY TABLE Solution ==========
(1 Zeile(n) betroffen)
#testEnvironment00000000000B-Tabelle. Scananzahl 3, logische Lesevorgänge 160773, physische Lesevorgänge 0, Read-Ahead-Lesevorgänge 0, logische LOB-Lesevorgänge 0, physische LOB-Lesevorgänge 0, Read-Ahead-LOB-Lesevorgänge 0.
Worktable-Tabelle. Scananzahl 0, logische Lesevorgänge 0, physische Lesevorgänge 0, Read-Ahead-Lesevorgänge 0, logische LOB-Lesevorgänge 0, physische LOB-Lesevorgänge 0, Read-Ahead-LOB-Lesevorgänge 0.
SQL Server-Ausführungszeiten:
, CPU-Zeit = 16785 ms, verstrichene Zeit = 8810 ms.
========== ROW_NUMBER Solution ==========
(1 Zeile(n) betroffen)
#testEnvironment00000000000B-Tabelle. Scananzahl 4, logische Lesevorgänge 160773, physische Lesevorgänge 0, Read-Ahead-Lesevorgänge 0, logische LOB-Lesevorgänge 0, physische LOB-Lesevorgänge 0, Read-Ahead-LOB-Lesevorgänge 0.
SQL Server-Ausführungszeiten:
, CPU-Zeit = 50826 ms, verstrichene Zeit = 30194 ms.
========== CELKO's second Solution ==========
(1 Zeile(n) betroffen)
#testEnvironment00000000000B-Tabelle. Scananzahl 12, logische Lesevorgänge 643092, physische Lesevorgänge 0, Read-Ahead-Lesevorgänge 0, logische LOB-Lesevorgänge 0, physische LOB-Lesevorgänge 0, Read-Ahead-LOB-Lesevorgänge 0.
Worktable-Tabelle. Scananzahl 0, logische Lesevorgänge 0, physische Lesevorgänge 0, Read-Ahead-Lesevorgänge 0, logische LOB-Lesevorgänge 0, physische LOB-Lesevorgänge 0, Read-Ahead-LOB-Lesevorgänge 0.
SQL Server-Ausführungszeiten:
, CPU-Zeit = 423636 ms, verstrichene Zeit = 888386 ms.
-- Temp Table with Index: Total elapsed Time about 5 min
(46.666.668 Zeile(n) betroffen)
========== BASELINE ==========
(1 Zeile(n) betroffen)
#testEnvironment00000000000C-Tabelle. Scananzahl 1, logische Lesevorgänge 3, physische Lesevorgänge 0, Read-Ahead-Lesevorgänge 0, logische LOB-Lesevorgänge 0, physische LOB-Lesevorgänge 0, Read-Ahead-LOB-Lesevorgänge 0.
SQL Server-Ausführungszeiten:
, CPU-Zeit = 0 ms, verstrichene Zeit = 0 ms.
========== CELKO's first Solution ==========
(1 Zeile(n) betroffen)
#testEnvironment00000000000C-Tabelle. Scananzahl 2, logische Lesevorgänge 150614, physische Lesevorgänge 0, Read-Ahead-Lesevorgänge 0, logische LOB-Lesevorgänge 0, physische LOB-Lesevorgänge 0, Read-Ahead-LOB-Lesevorgänge 0.
SQL Server-Ausführungszeiten:
, CPU-Zeit = 20483 ms, verstrichene Zeit = 20747 ms.
========== TALLY TABLE Solution ==========
(1 Zeile(n) betroffen)
#testEnvironment00000000000C-Tabelle. Scananzahl 0, logische Lesevorgänge 200722, physische Lesevorgänge 0, Read-Ahead-Lesevorgänge 0, logische LOB-Lesevorgänge 0, physische LOB-Lesevorgänge 0, Read-Ahead-LOB-Lesevorgänge 0.
Worktable-Tabelle. Scananzahl 0, logische Lesevorgänge 0, physische Lesevorgänge 0, Read-Ahead-Lesevorgänge 0, logische LOB-Lesevorgänge 0, physische LOB-Lesevorgänge 0, Read-Ahead-LOB-Lesevorgänge 0.
SQL Server-Ausführungszeiten:
, CPU-Zeit = 187 ms, verstrichene Zeit = 113 ms.
========== ROW_NUMBER Solution ==========
(1 Zeile(n) betroffen)
#testEnvironment00000000000C-Tabelle. Scananzahl 2, logische Lesevorgänge 6, physische Lesevorgänge 0, Read-Ahead-Lesevorgänge 0, logische LOB-Lesevorgänge 0, physische LOB-Lesevorgänge 0, Read-Ahead-LOB-Lesevorgänge 0.
SQL Server-Ausführungszeiten:
, CPU-Zeit = 0 ms, verstrichene Zeit = 0 ms.
========== CELKO's second Solution ==========
(1 Zeile(n) betroffen)
#testEnvironment00000000000C-Tabelle. Scananzahl 4, logische Lesevorgänge 301228, physische Lesevorgänge 0, Read-Ahead-Lesevorgänge 0, logische LOB-Lesevorgänge 0, physische LOB-Lesevorgänge 0, Read-Ahead-LOB-Lesevorgänge 0.
SQL Server-Ausführungszeiten:
, CPU-Zeit = 40248 ms, verstrichene Zeit = 40310 ms.
-- Temp Table with Index Total elapsed Time about 1h
(466.666.668 Zeile(n) betroffen)
========== BASELINE ==========
(1 Zeile(n) betroffen)
#testEnvironment00000000000D-Tabelle. Scananzahl 1, logische Lesevorgänge 4, physische Lesevorgänge 1, Read-Ahead-Lesevorgänge 0, logische LOB-Lesevorgänge 0, physische LOB-Lesevorgänge 0, Read-Ahead-LOB-Lesevorgänge 0.
SQL Server-Ausführungszeiten:
, CPU-Zeit = 0 ms, verstrichene Zeit = 12 ms.
========== CELKO's first Solution ==========
(1 Zeile(n) betroffen)
#testEnvironment00000000000D-Tabelle. Scananzahl 2, logische Lesevorgänge 1506118, physische Lesevorgänge 877, Read-Ahead-Lesevorgänge 487104, logische LOB-Lesevorgänge 0, physische LOB-Lesevorgänge 0, Read-Ahead-LOB-Lesevorgänge 0.
SQL Server-Ausführungszeiten:
, CPU-Zeit = 207996 ms, verstrichene Zeit = 258317 ms.
========== TALLY TABLE Solution ==========
(1 Zeile(n) betroffen)
#testEnvironment00000000000D-Tabelle. Scananzahl 0, logische Lesevorgänge 266804, physische Lesevorgänge 68, Read-Ahead-Lesevorgänge 5, logische LOB-Lesevorgänge 0, physische LOB-Lesevorgänge 0, Read-Ahead-LOB-Lesevorgänge 0.
Worktable-Tabelle. Scananzahl 0, logische Lesevorgänge 0, physische Lesevorgänge 0, Read-Ahead-Lesevorgänge 0, logische LOB-Lesevorgänge 0, physische LOB-Lesevorgänge 0, Read-Ahead-LOB-Lesevorgänge 0.
SQL Server-Ausführungszeiten:
, CPU-Zeit = 264 ms, verstrichene Zeit = 476 ms.
========== ROW_NUMBER Solution ==========
(1 Zeile(n) betroffen)
#testEnvironment00000000000D-Tabelle. Scananzahl 2, logische Lesevorgänge 8, physische Lesevorgänge 0, Read-Ahead-Lesevorgänge 0, logische LOB-Lesevorgänge 0, physische LOB-Lesevorgänge 0, Read-Ahead-LOB-Lesevorgänge 0.
SQL Server-Ausführungszeiten:
, CPU-Zeit = 0 ms, verstrichene Zeit = 4 ms.
========== CELKO's second Solution ==========
(1 Zeile(n) betroffen)
#testEnvironment00000000000D-Tabelle. Scananzahl 4, logische Lesevorgänge 3012236, physische Lesevorgänge 604, Read-Ahead-Lesevorgänge 1479661, logische LOB-Lesevorgänge 0, physische LOB-Lesevorgänge 0, Read-Ahead-LOB-Lesevorgänge 0.
SQL Server-Ausführungszeiten:
, CPU-Zeit = 432388 ms, verstrichene Zeit = 1726745 ms.
Conclusion:
Celko's solutions both are beyond all affairs, not only presenting wrong results, but also excruciatingly slow.
#testEnvironment without Index: Tally beats my RowNumber
#testEnvironment with Index: my RowNumber beats Tally
Never give up 😉
November 16, 2011 at 6:19 am
rahu21 (11/16/2011)
Conclusion:Celko's solutions both are beyond all affairs, not only presenting wrong results, but also excruciatingly slow.
#testEnvironment without Index: Tally beats my RowNumber
#testEnvironment with Index: my RowNumber beats Tally
Never give up 😉
Agreed. Although in my tests the TALLY table fairs better and the ROW_NUMBER fairs worse, I did find that generally if there was an index that the ROW_NUMBER solution was faster (the TALLY table that I posted that beat the ROW_NUMBER on an indexed table turned out to have a bug in it).
With that in mind, is there any reason to keep the ORDER BY that you've used?
SELECT TOP 1 RowNumber as FirstUnusedID
FROM (SELECT ROW_NUMBER() OVER(ORDER BY ID) AS RowNumber,ID
FROM #testEnvironment )T
WHERE RowNumber <> ID
ORDER BY ID
The above appears to get the same results but has less IO.
IF object_id('tempdb..#testEnvironment') IS NOT NULL
BEGIN
DROP TABLE #testEnvironment
END
DECLARE @numberToBeDeleted INT
SELECT @numberToBeDeleted = (ABS(CHECKSUM(NEWID())) % 60000) + 1
SELECT TOP 1000000 IDENTITY(INT,1,1) AS ID
INTO #testEnvironment
FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3
CREATE NONCLUSTERED INDEX [test_index]
ON #testEnvironment (ID)
DELETE FROM #testEnvironment
WHERE ID = @numberToBeDeleted
PRINT '========== BASELINE =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT MIN(ID)
FROM #testEnvironment
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT '========== CELKO''s first Solution =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT MIN (ID +1) AS unused_user_id_min
FROM #testEnvironment
WHERE (ID + 1) NOT IN (SELECT ID FROM #testEnvironment);
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT '========== TALLY TABLE Solution =========='
--Use a real tally table for performance, here's one on the fly for testing.
--If you're unsure about what a tally table is, or what it's for, please read
--this article --> http://www.sqlservercentral.com/articles/T-SQL/62867/
SET STATISTICS IO ON
SET STATISTICS TIME ON
;WITH t1 AS (SELECT 1 N UNION ALL SELECT 1 N),
t2 AS (SELECT 1 N FROM t1 x, t1 y),
t3 AS (SELECT 1 N FROM t2 x, t2 y),
t4 AS (SELECT 1 N FROM t3 x, t3 y),
tally AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS num
FROM t4 x, t4 y)
SELECT MIN(num) AS nextID
FROM #testEnvironment a
RIGHT OUTER JOIN tally b ON a.ID = b.num
WHERE ID IS NULL
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT '========== ROW_NUMBER Solution =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT TOP 1 RowNumber as FirstUnusedID FROM (SELECT ROW_NUMBER() OVER(ORDER BY ID)
AS RowNumber,ID FROM #testEnvironment)T
WHERE RowNumber <> ID
ORDER BY 1,((SELECT MAX(ID) + 1 FROM #testEnvironment))
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT '========== ROW_NUMBER v2 Solution =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT TOP 1 RowNumber as FirstUnusedID
FROM (SELECT ROW_NUMBER() OVER(ORDER BY ID) AS RowNumber,ID
FROM #testEnvironment )T
WHERE RowNumber <> ID
ORDER BY ID
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT '========== CELKO''s second Solution =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
;WITH Available (ID) AS (
SELECT ID -1
FROM #testEnvironment
WHERE (ID -1) NOT IN (SELECT ID FROM #testEnvironment)
UNION ALL
SELECT ID +1
FROM #testEnvironment
WHERE (ID +1) NOT IN (SELECT ID FROM #testEnvironment))
SELECT MIN (ID)
FROM Available
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
--CleanUp
IF object_id('tempdb..#testEnvironment') IS NOT NULL
BEGIN
DROP TABLE #testEnvironment
END
========== BASELINE ==========
(1 row(s) affected)
Table '#testEnvironment____________________________________________________________________________________________________00000000008C'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
========== CELKO's first Solution ==========
(1 row(s) affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#testEnvironment____________________________________________________________________________________________________00000000008C'. Scan count 2, logical reads 3734, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 1060 ms, elapsed time = 1113 ms.
========== TALLY TABLE Solution ==========
(1 row(s) affected)
Table '#testEnvironment____________________________________________________________________________________________________00000000008C'. Scan count 1, logical reads 132, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 31 ms, elapsed time = 37 ms.
========== ROW_NUMBER Solution ==========
(1 row(s) affected)
Table '#testEnvironment____________________________________________________________________________________________________00000000008C'. Scan count 2, logical reads 65, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 15 ms, elapsed time = 8 ms.
========== ROW_NUMBER v2 Solution ==========
(1 row(s) affected)
Table '#testEnvironment____________________________________________________________________________________________________00000000008C'. Scan count 1, logical reads 62, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 8 ms.
========== CELKO's second Solution ==========
(1 row(s) affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#testEnvironment____________________________________________________________________________________________________00000000008C'. Scan count 4, logical reads 7468, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 1981 ms, elapsed time = 1996 ms.
November 16, 2011 at 7:13 am
Can someone please do a test where there are 1M rows (or 10M or more) and the MIN missing value is one from the highest number in the range (i.e. 999,999 if you do a million row test)? I want to see if row_number/tally table solutions are affected by having a very high min number available. I would test but my laptop has a pair of SSDs and 16GB of RAM so everything is fast. Well, except for Joe Celko's solutions! :w00t:
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
November 16, 2011 at 8:39 am
TheSQLGuru (11/16/2011)
Can someone please do a test where there are 1M rows (or 10M or more) and the MIN missing value is one from the highest number in the range (i.e. 999,999 if you do a million row test)? I want to see if row_number/tally table solutions are affected by having a very high min number available. I would test but my laptop has a pair of SSDs and 16GB of RAM so everything is fast. Well, except for Joe Celko's solutions! :w00t:
Sorry, forgot about this. I've set it running on a virtual machine with a 600GB hdd, but only had 4GB of RAM to give it. Will let you know 🙂
USE [Test]
SET NOCOUNT ON
PRINT '========== Ensure Table is ready =========='
IF object_id('[Test].dbo.testEnvironment') IS NOT NULL
BEGIN
DROP TABLE [Test].dbo.testEnvironment
END
PRINT REPLICATE('=',80)
PRINT '========== Insert 500,000,000(!!) rows =========='
SELECT TOP 500000000 IDENTITY(INT,1,1) AS ID
INTO [Test].dbo.testEnvironment
FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3,
master.dbo.syscolumns sc4, master.dbo.syscolumns sc5, master.dbo.syscolumns sc6,
master.dbo.syscolumns sc7, master.dbo.syscolumns sc8, master.dbo.syscolumns sc9,
master.dbo.syscolumns sc10, master.dbo.syscolumns sc11, master.dbo.syscolumns sc12
PRINT REPLICATE('=',80)
PRINT '========== Add Index =========='
CREATE NONCLUSTERED INDEX [test_index]
ON [Test].dbo.testEnvironment (ID)
PRINT REPLICATE('=',80)
PRINT '========== Delete 1 ID - 485,001,270 =========='
DELETE FROM [Test].dbo.testEnvironment
WHERE ID = 485001270
PRINT REPLICATE('=',80)
PRINT '========== Queries!! =========='
PRINT REPLICATE('=',80)
PRINT '========== BASELINE =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT MIN(ID)
FROM [Test].dbo.testEnvironment
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',80)
PRINT '========== ROW_NUMBER Solution =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT TOP 1 RowNumber as FirstUnusedID FROM (SELECT ROW_NUMBER() OVER(ORDER BY ID)
AS RowNumber,ID FROM [Test].dbo.testEnvironment)T
WHERE RowNumber <> ID
ORDER BY 1,((SELECT MAX(ID) + 1 FROM [Test].dbo.testEnvironment))
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',80)
PRINT '========== ROW_NUMBER v2 Solution =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT TOP 1 RowNumber as FirstUnusedID
FROM (SELECT ROW_NUMBER() OVER(ORDER BY ID) AS RowNumber,ID
FROM [Test].dbo.testEnvironment )T
WHERE RowNumber <> ID
ORDER BY ID
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',80)
November 16, 2011 at 9:36 am
47 minutes 31 seconds later. . . 🙂
========== Ensure Table is ready ==========
================================================================================
========== Insert 500,000,000(!!) rows ==========
================================================================================
========== Add Index ==========
================================================================================
========== Delete 1 ID - 485,001,270 ==========
================================================================================
========== Queries!! ==========
================================================================================
========== BASELINE ==========
Table 'testEnvironment'. Scan count 1, logical reads 4, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 29 ms.
================================================================================
========== ROW_NUMBER Solution ==========
Table 'testEnvironment'. Scan count 2, logical reads 902183, physical reads 571, read-ahead reads 877616, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 162662 ms, elapsed time = 173297 ms.
================================================================================
========== ROW_NUMBER v2 Solution ==========
Table 'testEnvironment'. Scan count 1, logical reads 902179, physical reads 712, read-ahead reads 908889, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 163832 ms, elapsed time = 176550 ms.
================================================================================
So both of the ROW_NUMBER solutions are looking at around 170 seconds on 500m rows. At some point I'll set-up a tally table with 500m rows that can attempt to compete.
November 17, 2011 at 2:19 am
TheSQLGuru (11/16/2011)
Can someone please do a test where there are 1M rows (or 10M or more) and the MIN missing value is one from the highest number in the range (i.e. 999,999 if you do a million row test)? I want to see if row_number/tally table solutions are affected by having a very high min number available. I would test but my laptop has a pair of SSDs and 16GB of RAM so everything is fast. Well, except for Joe Celko's solutions! :w00t:
And here's 1m rows where the earliest missing number is 999,999.
IF object_id('tempdb..#testEnvironment') IS NOT NULL
BEGIN
DROP TABLE #testEnvironment
END
SELECT TOP 1000000 IDENTITY(INT,1,1) AS ID
INTO #testEnvironment
FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3
CREATE NONCLUSTERED INDEX [test_index]
ON #testEnvironment (ID)
DELETE FROM #testEnvironment
WHERE ID = 999999
PRINT '========== BASELINE =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT MIN(ID)
FROM #testEnvironment
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT '========== CELKO''s first Solution =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT MIN (ID +1) AS unused_user_id_min
FROM #testEnvironment
WHERE (ID + 1) NOT IN (SELECT ID FROM #testEnvironment);
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT '========== TALLY TABLE Solution =========='
--If you're unsure about what a tally table is, or what it's for, please read
--this article --> http://www.sqlservercentral.com/articles/T-SQL/62867/
--===== Create and populate the Tally table on the fly
IF object_id('dbo.Tally') IS NOT NULL
BEGIN
DROP TABLE dbo.Tally
END
SELECT TOP 1000000
IDENTITY(INT,1,1) AS num
INTO dbo.Tally
FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3
--===== Add a Primary Key to maximize performance
ALTER TABLE dbo.Tally
ADD CONSTRAINT PK_Tally_num
PRIMARY KEY CLUSTERED (num) WITH FILLFACTOR = 100
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT MIN(num) AS nextID
FROM #testEnvironment a
RIGHT OUTER JOIN dbo.Tally b ON a.ID = b.num
WHERE ID IS NULL
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT '========== ROW_NUMBER Solution =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT TOP 1 RowNumber as FirstUnusedID FROM (SELECT ROW_NUMBER() OVER(ORDER BY ID)
AS RowNumber,ID FROM #testEnvironment)T
WHERE RowNumber <> ID
ORDER BY 1,((SELECT MAX(ID) + 1 FROM #testEnvironment))
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT '========== ROW_NUMBER v2 Solution =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT TOP 1 RowNumber as FirstUnusedID
FROM (SELECT ROW_NUMBER() OVER(ORDER BY ID) AS RowNumber,ID
FROM #testEnvironment )T
WHERE RowNumber <> ID
ORDER BY ID
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT '========== CELKO''s second Solution =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
;WITH Available (ID) AS (
SELECT ID -1
FROM #testEnvironment
WHERE (ID -1) NOT IN (SELECT ID FROM #testEnvironment)
UNION ALL
SELECT ID +1
FROM #testEnvironment
WHERE (ID +1) NOT IN (SELECT ID FROM #testEnvironment))
SELECT MIN (ID)
FROM Available
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
========== BASELINE ==========
(1 row(s) affected)
Table '#testEnvironment____________________________________________________________________________________________________0000000000C3'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
========== CELKO's first Solution ==========
(1 row(s) affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#testEnvironment____________________________________________________________________________________________________0000000000C3'. Scan count 2, logical reads 3734, physical reads 0, read-ahead reads 15, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 967 ms, elapsed time = 974 ms.
========== TALLY TABLE Solution ==========
(1000000 row(s) affected)
(1 row(s) affected)
Table '#testEnvironment____________________________________________________________________________________________________0000000000C3'. Scan count 1, logical reads 1867, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Tally'. Scan count 1, logical reads 1616, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 405 ms, elapsed time = 405 ms.
========== ROW_NUMBER Solution ==========
(1 row(s) affected)
Table '#testEnvironment____________________________________________________________________________________________________0000000000C3'. Scan count 2, logical reads 1870, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 281 ms, elapsed time = 287 ms.
========== ROW_NUMBER v2 Solution ==========
(1 row(s) affected)
Table '#testEnvironment____________________________________________________________________________________________________0000000000C3'. Scan count 1, logical reads 1867, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 328 ms, elapsed time = 326 ms.
========== CELKO's second Solution ==========
(1 row(s) affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#testEnvironment____________________________________________________________________________________________________0000000000C3'. Scan count 4, logical reads 7468, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 2106 ms, elapsed time = 2126 ms.
November 17, 2011 at 4:36 am
Last one big test.
USE [Test]
SET NOCOUNT ON
PRINT '========== Ensure Table is ready =========='
IF object_id('[Test].dbo.testEnvironment') IS NOT NULL
BEGIN
DROP TABLE [Test].dbo.testEnvironment
END
PRINT REPLICATE('=',80)
PRINT '========== Insert 500,000,000(!!) rows =========='
SELECT TOP 500000000 IDENTITY(INT,1,1) AS ID
INTO [Test].dbo.testEnvironment
FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3,
master.dbo.syscolumns sc4, master.dbo.syscolumns sc5, master.dbo.syscolumns sc6,
master.dbo.syscolumns sc7, master.dbo.syscolumns sc8, master.dbo.syscolumns sc9,
master.dbo.syscolumns sc10, master.dbo.syscolumns sc11, master.dbo.syscolumns sc12
PRINT REPLICATE('=',80)
PRINT '========== Add Index =========='
CREATE NONCLUSTERED INDEX [test_index]
ON [Test].dbo.testEnvironment (ID)
PRINT REPLICATE('=',80)
PRINT '========== Delete 1 ID - 485,001,270 =========='
DELETE FROM [Test].dbo.testEnvironment
WHERE ID = 485001270
PRINT REPLICATE('=',80)
PRINT '========== Queries!! =========='
PRINT REPLICATE('=',80)
PRINT '========== BASELINE =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT MIN(ID)
FROM [Test].dbo.testEnvironment
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',80)
PRINT '========== ROW_NUMBER Solution =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT TOP 1 RowNumber as FirstUnusedID FROM (SELECT ROW_NUMBER() OVER(ORDER BY ID)
AS RowNumber,ID FROM [Test].dbo.testEnvironment)T
WHERE RowNumber <> ID
ORDER BY 1,((SELECT MAX(ID) + 1 FROM [Test].dbo.testEnvironment))
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',80)
PRINT '========== ROW_NUMBER v2 Solution =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT TOP 1 RowNumber as FirstUnusedID
FROM (SELECT ROW_NUMBER() OVER(ORDER BY ID) AS RowNumber,ID
FROM [Test].dbo.testEnvironment )T
WHERE RowNumber <> ID
ORDER BY ID
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',80)
PRINT '========== TALLY TABLE Solution =========='
--If you're unsure about what a tally table is, or what it's for, please read
--this article --> http://www.sqlservercentral.com/articles/T-SQL/62867/
--===== Create and populate the Tally table on the fly
IF object_id('[Test].dbo.Tally') IS NOT NULL
BEGIN
DROP TABLE [Test].dbo.Tally
END
SELECT TOP 500000000
IDENTITY(INT,1,1) AS num
INTO dbo.Tally
FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3,
master.dbo.syscolumns sc4, master.dbo.syscolumns sc5, master.dbo.syscolumns sc6,
master.dbo.syscolumns sc7, master.dbo.syscolumns sc8, master.dbo.syscolumns sc9,
master.dbo.syscolumns sc10, master.dbo.syscolumns sc11, master.dbo.syscolumns sc12
--===== Add a Primary Key to maximize performance
ALTER TABLE [Test].dbo.Tally
ADD CONSTRAINT PK_Tally_num
PRIMARY KEY CLUSTERED (num) WITH FILLFACTOR = 100
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT MIN(num) AS nextID
FROM [Test].dbo.testEnvironment a
RIGHT OUTER JOIN [Test].dbo.Tally b ON a.ID = b.num
WHERE ID IS NULL
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',80)
PRINT '========== CELKO''s first Solution =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT MIN (ID +1) AS unused_user_id_min
FROM [Test].dbo.testEnvironment
WHERE (ID + 1) NOT IN (SELECT ID FROM [Test].dbo.testEnvironment);
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',80)
PRINT '========== Clean Up =========='
IF object_id('[Test].dbo.testEnvironment') IS NOT NULL
BEGIN
DROP TABLE [Test].dbo.testEnvironment
END
IF object_id('[Test].dbo.Tally') IS NOT NULL
BEGIN
DROP TABLE [Test].dbo.Tally
END
PRINT REPLICATE('=',80)
========== Ensure Table is ready ==========
================================================================================
========== Insert 500,000,000(!!) rows ==========
================================================================================
========== Add Index ==========
================================================================================
========== Delete 1 ID - 485,001,270 ==========
================================================================================
========== Queries!! ==========
================================================================================
========== BASELINE ==========
Table 'testEnvironment'. Scan count 1, logical reads 4, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 13 ms.
================================================================================
========== ROW_NUMBER Solution ==========
Table 'testEnvironment'. Scan count 2, logical reads 902182, physical reads 1434, read-ahead reads 471399, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 152304 ms, elapsed time = 157702 ms.
================================================================================
========== ROW_NUMBER v2 Solution ==========
Table 'testEnvironment'. Scan count 1, logical reads 902178, physical reads 499, read-ahead reads 476329, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 162007 ms, elapsed time = 186641 ms.
================================================================================
========== TALLY TABLE Solution ==========
Table 'testEnvironment'. Scan count 1, logical reads 930060, physical reads 768, read-ahead reads 930045, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Tally'. Scan count 1, logical reads 805158, physical reads 1102, read-ahead reads 491970, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 218323 ms, elapsed time = 263700 ms.
================================================================================
========== CELKO's first Solution ==========
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'testEnvironment'. Scan count 2, logical reads 1860120, physical reads 560, read-ahead reads 930049, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 513556 ms, elapsed time = 519127 ms.
================================================================================
========== Clean Up ==========
================================================================================
November 17, 2011 at 10:52 am
Thanks for the awesome testing! Seems that the rownumber solution is a fair bit quicker than the tally table solution for larger numbers where the missing value is high up the value range.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
Viewing 9 posts - 31 through 38 (of 38 total)
You must be logged in to reply to this topic. Login to reply