January 9, 2009 at 8:52 pm
foxjazz (1/9/2009)
I realize you guys aren't mind readers; maybe I should restate the requirements to say "lowest possible positive unused value". I found that I already had an identity set, so I didn't use the IDENTITY_INSERT feature and just let SQL Server do what it does best: count.
Ah, but we are... that's why I gave you not 1, but 2 correct and performant answers to meet your requirements. And... I did it without asking any "stupid" questions. Ya just gotta be happy with that. 😉
Anyway, and as you can tell, we all think it's a good move on your part that you're using Identity and not worrying about filling in the gaps. That would make for performance problems, possibly deadlocks, a whole bunch of unnecessary code, and possible data integrity problems to boot.
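Just to show what you'd be signing up for, here's a minimal sketch of a "lowest positive unused value" lookup... the table and column (dbo.SomeTable, SomeID) are made up for illustration. Notice that it has to lock the whole table for the life of the transaction to be safe under concurrency, and that lock is exactly where the blocking and deadlock risk comes from:
BEGIN TRANSACTION
--===== Lowest positive unused value: candidates are 1 and every
     -- "used value + 1"... keep the ones not already in the table.
     -- TABLOCKX/HOLDLOCK serialize access so two sessions can't grab
     -- the same number, which is also exactly why this doesn't scale.
SELECT MIN(d.Candidate) AS LowestUnused
  FROM (SELECT 1 AS Candidate
        UNION ALL
        SELECT SomeID + 1
          FROM dbo.SomeTable WITH (TABLOCKX, HOLDLOCK)) d
 WHERE d.Candidate > 0
   AND NOT EXISTS (SELECT 1
                     FROM dbo.SomeTable t
                    WHERE t.SomeID = d.Candidate)
--...INSERT the new row using that value here, in the same transaction...
COMMIT TRANSACTION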
--Jeff Moden
Change is inevitable... Change for the better is not.
January 9, 2009 at 9:07 pm
Grant Fritchey (1/9/2009)
There's a whole section on identifying gaps in Itzik Ben-Gan's book T-SQL Querying. Here's a way to list all the gaps, their start points and stop points:
SELECT cur+1 AS start_range, nxt-1 AS end_range
FROM (SELECT col1 AS cur,
             (SELECT MIN(col1) FROM dbo.T1 AS B
               WHERE B.col1 > A.col1) AS nxt
        FROM dbo.T1 AS A) AS D
WHERE nxt - cur > 1;
Muuuust... reeeesist... temptaaation... tooooo... raaaace.... GAAAHHH!!!!
"Holy shades of Pavlov's dog, Batman... are those pork chops in your utility belt?" 😛 Ok, it's off to the races we go again.
First... we need 6 million rows of test data... as always, comments are in the code... it takes about 57 seconds on my humble, 6-year-old, P4 1.8 GHz desktop box to complete...
--===== Conditionally drop the temp table so reruns don't error out
IF OBJECT_ID('tempdb..#JbmTest') IS NOT NULL
   DROP TABLE #JbmTest
--GO
--===== Setup for speed and to prevent blocking
SET NOCOUNT ON
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
--=============================================================================
-- Create an experimental table to simulate the table being examined
-- Again... 6 million rows...
--=============================================================================
--===== Create the experimental temp table and populate with Serial #'s on the fly
-- This works because SysColumns always has at least 4000 entries
-- even in a new database and 4000*4000 = 16,000,000
SELECT TOP 6000000 SerialNumber = IDENTITY(INT, 1, 1)
INTO #JbmTest
FROM Master.dbo.SysColumns sc1,
Master.dbo.SysColumns sc2
--===== Like any good table, our experimental table needs a Primary Key
ALTER TABLE #JbmTest
ADD PRIMARY KEY CLUSTERED (SerialNumber)
-- This deletes a couple of "monster" ranges just to see how it's handled.
DELETE #JbmTest
WHERE (SerialNumber BETWEEN 10000 AND 20000)
OR (SerialNumber BETWEEN 1000000 AND 2000000)
OR (SerialNumber BETWEEN 5000000 AND 5500000)
-- This deletes every third row in the first 1000 rows
DELETE #JbmTest
WHERE SerialNumber % 3 = 0
AND SerialNumber <= 1000
PRINT REPLICATE('=',100)
... and now, the race code... put your result pane in grid mode and let 'er rip. The performance measurements will be in the "Messages" tab...
--=============================================================================
-- Test the code
--=============================================================================
SET NOCOUNT OFF
PRINT REPLICATE('=',78)
PRINT '========== Calculated Gaps by Jeff Moden =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT GapStart = (SELECT ISNULL(MAX(b.SerialNumber),0)+1
FROM #JbmTest b
WHERE b.SerialNumber < a.SerialNumber),
GapEnd = SerialNumber - 1
FROM #JbmTest a
WHERE a.SerialNumber - 1 NOT IN (SELECT SerialNumber FROM #JbmTest)
AND a.SerialNumber - 1 > 0
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',78)
-------------------------------------------------------------------------------
PRINT '========== Itzik''s rendition of the same thing =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT cur+1 AS start_range,
nxt-1 AS end_range
FROM (SELECT SerialNumber AS cur, (SELECT MIN(SerialNumber)
FROM #JbmTest AS B
WHERE b.SerialNumber > A.SerialNumber) AS nxt
FROM #JbmTest as A) AS D
WHERE nxt - cur > 1;
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',78)
Here's the results on my machine...
==============================================================================
========== Calculated Gaps by Jeff Moden ==========
(336 row(s) affected)
Table '#JbmTest____________________________________________________________________________________________________________0000000000D9'.
Scan count 338, logical reads 15495, physical reads 1, read-ahead reads 5002, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 8250 ms, elapsed time = 9173 ms.
==============================================================================
========== Itzik's rendition of the same thing ==========
(336 row(s) affected)
Table '#JbmTest____________________________________________________________________________________________________________0000000000D9'.
Scan count 4489665, logical reads 14325293, physical reads 3, read-ahead reads 4889, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 64390 ms, elapsed time = 67479 ms.
==============================================================================
Ya just can't use a triangular join in a correlated subquery in the FROM clause and expect to get any performance out of it. If you don't know what a "triangular join" is or why they're bad, the article on Triangular Joins is going to be reprinted on SSC on Friday, Jan 16, 2009.
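If you've never seen one, here's a triangular join in miniature (using a hypothetical dbo.T1 with an integer col1)... the inequality in the correlated subquery means each row gets compared against roughly half of the other rows, so the work grows with the square of the row count instead of the row count:
--===== A triangular join in miniature: for N rows, the inequality below
     -- can touch up to N*(N+1)/2 row combinations... at 6 million rows,
     -- that's on the order of 18 trillion comparisons unless an index
     -- seek bails you out (and the scan counts above show how hard the
     -- table gets hit either way).
SELECT a.col1,
       (SELECT COUNT(*)            -- correlated subquery...
          FROM dbo.T1 b
         WHERE b.col1 <= a.col1)   -- ...joined on an inequality = the "triangle"
  FROM dbo.T1 a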
--Jeff Moden
Change is inevitable... Change for the better is not.
January 9, 2009 at 10:06 pm
Okay Jeff, I ran your code on my system here at home, and here are the results:
====================================================================================================
==============================================================================
========== Calculated Gaps by Jeff Moden ==========
(334 row(s) affected)
Table '#JbmTest____________________________________________________________________________________________________________00000000007C'.
Scan count 336, logical reads 4207, physical reads 0, read-ahead reads 5, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 1266 ms, elapsed time = 1391 ms.
==============================================================================
========== Itzik's rendition of the same thing ==========
(334 row(s) affected)
Table '#JbmTest____________________________________________________________________________________________________________00000000007C'.
Scan count 989668, logical reads 3157802, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 16532 ms, elapsed time = 8748 ms.
==============================================================================
Total time for ALL processing, including building the test table, was about 31 seconds.
Curious how long it would run on my development server at work.
January 9, 2009 at 10:21 pm
This you will find interesting. My dev box at work is a dual quad-core processor system with 8 GB RAM and direct-attached SCSI disks:
====================================================================================================
==============================================================================
========== Calculated Gaps by Jeff Moden ==========
(336 row(s) affected)
Table '#JbmTest____________________________________________________________________________________________________________000000000790'.
Scan count 354, logical reads 15505, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 28892 ms, elapsed time = 3784 ms.
==============================================================================
========== Itzik's rendition of the same thing ==========
(336 row(s) affected)
Table '#JbmTest____________________________________________________________________________________________________________000000000790'.
Scan count 4489673, logical reads 14325482, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 86545 ms, elapsed time = 11164 ms.
==============================================================================
==============================================================================
========== Calculated Gaps by Jeff Moden With MAXDOP 1==========
(336 row(s) affected)
Table '#JbmTest____________________________________________________________________________________________________________000000000790'.
Scan count 338, logical reads 15507, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 5547 ms, elapsed time = 5542 ms.
==============================================================================
========== Itzik's rendition of the same thing With MAXDOP 1 ==========
(336 row(s) affected)
Table '#JbmTest____________________________________________________________________________________________________________000000000790'.
Scan count 4489665, logical reads 14325302, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 44859 ms, elapsed time = 44855 ms.
==============================================================================
January 9, 2009 at 11:46 pm
Wow... I'm thinking the dev box has a major problem.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 10, 2009 at 1:17 am
Sorry, I was watching the last 2 episodes of StarGate Atlantis (had them on DVR).
Or it could be a result of the RAID 5 configuration of the hard disks as well.
January 10, 2009 at 1:51 am
Lynn Pettis (1/10/2009)
Sorry, I was watching the last 2 episodes of StarGate Atlantis (had them on DVR).
Any good?
Or it could be a result of the RAID 5 configuration of the hard disks as well.
It looks like the exec plan's different. The logical IOs are waaaay different. 14 million logical IOs on the server and only 3 million on your desktop.
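If you can spare a rerun on both boxes, capturing the actual plan as text makes the comparison easy. Just a sketch, using Jeff's test table and Itzik's version of the query:
--===== Show the actual plan operators (and real row counts) as text
     -- right alongside the results. In SSMS, Ctrl+M ("Include Actual
     -- Execution Plan") does the same thing graphically.
SET STATISTICS PROFILE ON

SELECT cur+1 AS start_range, nxt-1 AS end_range
  FROM (SELECT SerialNumber AS cur,
               (SELECT MIN(SerialNumber)
                  FROM #JbmTest b
                 WHERE b.SerialNumber > a.SerialNumber) AS nxt
          FROM #JbmTest a) d
 WHERE nxt - cur > 1

SET STATISTICS PROFILE OFF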
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 10, 2009 at 8:38 am
The 14 million logical reads for Itzik's code are pretty much what I'd expect out of the triangular join he built into his code. Why that dropped to only 3 million reads on Lynn's machine is a mystery.
What's really a mystery to me is Lynn's dev box when parallelism kicks in... the times for both queries absolutely suck on that box until he adds the MAXDOP 1 option and then things go back down to more "normal" times.
I do 99.9% of my development on single-processor boxes and the code gets moved to multi-processor boxes. I've not had any complaints about speed or scalability (quite the opposite), and I usually expect that parallelism kicking in will only make things faster, but after seeing what parallel processing did to this rather simple lookup, I may have to do my final performance checking on the targeted production servers themselves.
This really sucks... it's like saying an 8 cylinder drag racer can be beat by a lawn tractor. Heh... of course, the drag racer pretty much sucks at cutting grass on any day. 😛
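For anyone playing along at home, I'm assuming Lynn's MAXDOP 1 runs just tacked the query hint onto the end of the same gap code... something like this:
SELECT GapStart = (SELECT ISNULL(MAX(b.SerialNumber),0)+1
                     FROM #JbmTest b
                    WHERE b.SerialNumber < a.SerialNumber),
       GapEnd = SerialNumber - 1
  FROM #JbmTest a
 WHERE a.SerialNumber - 1 NOT IN (SELECT SerialNumber FROM #JbmTest)
   AND a.SerialNumber - 1 > 0
OPTION (MAXDOP 1) -- pin just this query to one scheduler without
                  -- touching the server-wide setting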
--Jeff Moden
Change is inevitable... Change for the better is not.
January 10, 2009 at 8:42 am
Lynn, your home system absolutely smokes on this code... what does your home system consist of... I'm thinking that brand names and model numbers would also weigh in heavily here (so I can get one)... I'd really like to know what's under the hood of that bad boy. Thanks.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 10, 2009 at 10:51 am
Jeff Moden (1/10/2009)
Lynn, your home system absolutely smokes on this code... what does your home system consist of... I'm thinking that brand names and model numbers would also weigh in heavily here (so I can get one)... I'd really like to know what's under the hood of that bad boy. Thanks.
I like to call it a Nadar box. I bought it from a company in Denver called BCI. They used to do a LOT of work for my previous employer, Information Handling Services, until IHS went name-brand for their computers. Plus, since I had worked at IHS, Nadar gave me a real nice discount on the machine. I paid about $1,200 for it at the time, and if I remember right, a similar name-brand system would have cost me between $1,600 and $2,000.
To figure out what is under the hood exactly, I'll have to open her up. I'll do that later today.
Two of my kids have swim lessons at 11:30 and I need to get them there, plus I need to work on my resume. I need to get it to the Dean of Electrical Engineering and Computer Science at Colorado Tech here in Colorado Springs. I am trying to get a part-time position as an adjunct professor and this is critical. I already talked to him yesterday for a few minutes and I really want to do this.
January 10, 2009 at 10:57 am
GilaMonster (1/10/2009)
Lynn Pettis (1/10/2009)
Sorry, I was watching the last 2 episodes of StarGate Atlantis (had them on DVR).
Any good?
Yes, but it still left some threads from the series undone; then again, they are coming out with another Stargate series, so we'll see.
I am really looking forward to the final episodes of Battlestar Galactica starting next week.
Or it could be a result of the RAID 5 configuration of the hard disks as well.
It looks like the exec plan's different. The logical IOs are waaaay different. 14 million logical IOs on the server and only 3 million on your desktop.
It also could be due to disk fragmentation. It is a dev box, and we do restore and delete some databases on the system quite regularly. Plus, the database I used on the dev system is a sandbox db that could be heavily fragmented due to growths and shrinks (I know, bad, but it is only a sandbox used to test certain queries in a "safe" environment).
January 10, 2009 at 12:23 pm
Jeff Moden (1/10/2009)
What's really a mystery to me is Lynn's dev box when parallelism kicks in... the times for both queries absolutely suck on that box until he adds the MAXDOP 1 option and then things go back down to more "normal" times.
I've seen that several times on my desktop (Intel Core 2 Quad). SQL's allocated 2 cores out of the four. I've often found that a query run with MAXDOP 1 is waaay faster (2, 3, even 5 times) than the same query when it goes parallel.
I haven't looked into it too much. It could be because there's other stuff running (as a desktop, this box is not exactly dedicated to SQL) and hence the parallelism skew is far higher than SQL expected, or because I'm working with data at the lower end of the parallelism threshold and the overhead outweighs the benefit.
I certainly did not see this when I had massive multi-processor servers (16, 32 processors) to play with.
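If it is the threshold, the knob to experiment with (on a test box, and this is only a sketch, not a recommendation) is the server-wide cost threshold for parallelism, which defaults to a plan cost of 5:
--===== Raise the estimated plan cost a query must exceed before the
     -- optimizer even considers a parallel plan. The default is 5;
     -- 25 here is purely an experimental value... test, don't deploy.
EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'cost threshold for parallelism', 25
RECONFIGURE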
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 10, 2009 at 12:39 pm
Lynn Pettis (1/10/2009)
I need to get it to the Dean of Electrical Engineering and Computer Science at Colorado Tech here in Colorado Springs. I am trying to get a part-time position as an adjunct professor and this is critical. I already talked to him yesterday for a few minutes and I really want to do this.
Good luck. If you get it, will you be able to use the Professor title?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 10, 2009 at 12:47 pm
GilaMonster (1/10/2009)
Lynn Pettis (1/10/2009)
I need to get it to the Dean of Electrical Engineering and Computer Science at Colorado Tech here in Colorado Springs. I am trying to get a part-time position as an adjunct professor and this is critical. I already talked to him yesterday for a few minutes and I really want to do this.
Good luck. If you get it, will you be able to use the Professor title?
You know, I don't know? Maybe.
January 10, 2009 at 1:20 pm
Lynn Pettis (1/10/2009)
You know, I don't know? Maybe.
Now that will look impressive in your sig...
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability