June 18, 2006 at 2:15 pm
June 18, 2006 at 10:10 pm
yep... that'll work Tim... most folks don't come close to that level of correct simplicity when using a sequence table. In light of IDENTITY's, though, I still can't see anyone intentionally doing it that way (sequence table) unless they have some ill-conceived notion about code portability.
So far as NOLOCKs go, you are correct... if you rely on Rollbacks instead of checking data before it goes in or you haven't separated transactional tables from historical tables, then, Yes, you need to be careful. So far as Isolationtion Levels go, only report runs should use them.
--Jeff Moden
Change is inevitable... Change for the better is not.
June 18, 2006 at 11:04 pm
Pardon my ignorance, but why should only reports use isolation levels?
Is it because the queries for reports are long running queries and dont need to keep switching the isolation levels on and off all the time, which could result in isolation level conflicts.
June 19, 2006 at 5:16 am
nileshsane: Every SQL statement uses an isolation level. Maybe Jeff means: only reports should use READ UNCOMMITTED since they are generally running in a read-only environment?
BTW, using NOLOCK means you could see all sorts of crazy stuff. If spid1 ignores locks taken out by spid2, that's not the same thing as spid1 treating spid2's queries as though they weren't in a transaction. Spid1 will also ignore the transactional properties of individual SQL statements, so you could retrieve a half-updated row, or n records of an m-row insert, where 0<n<m.
Tim Wilkinson
"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant
June 19, 2006 at 6:40 am
Tim,
I don't understand why the use of an IDENTITY column would require 100 singleton inserts in the scenario you've given... also, why would you build a system that requires blocks of keys to be contiguous? There are certainly ways to do that including a sequence table or not, but why make such a requirement? If you ever need to add an item to the "group", seems like you're kinda out of luck after the fact.
So far as returning all the keys to the client, if the data is formed correctly, there will be an order number or something in the row to identify the transaction... or, at least, seems like there should be... producing the list of disparate keys would be simple. You're right, the sequence table would do a lot of this quite nicely, I'm still wondering why anyone would build a system that required entries to be in contiguous blocks of ID's.
nileshsane,
Yep, Tim is correct... I meant the SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED set should only be used in reports. If you use it views and, in some circumstances (usually multi-call RBAR), you will evenutally run into an ISOLATION LEVEL CONFLICT error and it's usually when it's most inconvenient for it to happen.
I said it would be ok to use them in reports (generally, reporting stored procs) because the data they produce is not generally used as a source of info for other procs...
--Jeff Moden
Change is inevitable... Change for the better is not.
June 19, 2006 at 8:16 am
>why would you build a system that requires blocks of keys to be contiguous?
this did actually happen. I rebuilt it becuase it was my job. I didn't specify it, needless to say. I think they wanted to send out a contiguous block of numbers to customers for initial bookings, after which it was acceptable (and as you point out, unavoidable) to have subsequent additional blocks not contiguous with the previous blocks. The numbers within a single insert were also supposed to be ascending under a certain alphabetical ordering of the new records. That also couldn't be guaranteed with the identity property. At least the sharing of these IDs with customers meant that for once our key was a real semantic PK and not a surrogate!
There were in any case critical concurrency problems at the (necessarily very high and narrow) seasonal peak, which meant that even the extra operation to retrieve the new ids from the base table was unacceptable, because unnecessary.
>There are certainly ways to do that including a sequence table or not
I'd be genuinely interested (and somewhat embarrassed!) to know how you could efficiently do it with an identity column.
>it would be ok to use them in reports (generally, reporting stored procs) because the data they produce is not generally used as a source of info for other procs...
No, just people!
Tim Wilkinson
"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant
June 19, 2006 at 7:16 pm
Man, I know what you mean with picking up other people's mess Guess that's why this forum is so popular I had to ask because I got stuck with a similar legacy requirement.
Here's a way to do what you asked without a sequence table... to review, here are the specs you mentioned before...
Here's my "alternate" method... don't balk at the exclusive table lock until you try it... it inserts 100 records in less than a millisecond and does 1,000 records in ~21 milliseconds on a million row table...
--===== Declare the ID range variables (you probably have something like this, already) DECLARE @StartRange INT DECLARE @EndRange INT
--===== Do a table-locked insert... takes 16 to 60 mils (avg of 21) for 1000 records INSERT INTO sometable WITH (TABLOCKX) (some-column-list) SELECT some-column-list FROM some-working-table WITH (NOLOCK) ORDER BY some-column-list
--===== Capture the range of identities SELECT @StartRange = SCOPE_IDENTITY() - @@ROWCOUNT + 1, @EndRange = SCOPE_IDENTITY()
Of course, you'd need one of these for each table you want to insert to and, while that seems complicated, it's only one Select longer than each insert (and, it's on system variables, not tables) and you don't have to mess around with a sequence table or resynch it if it get's out of synch when someone updates it incorrectly. A couple of hidden advantages are that you don't need to know the count of records ahead of time (to reserve the contiguous block) and you're guaranteed no deadlocks because the TABLOCKX forces a "wait" instead of a deadlock.
In the testing I did, way back when, I had 14 simultaneous identical sets of code with the same WAITFOR TIME... They all lit off at the same time, they all patiently waited their turn, all of them completed in 296 milliseconds, and they all successfully inserted 1000 rows with contiguous numbers (in the IDENTITY column) on a million row table with no deadlocks.
Who said an exclusive table lock is always bad ?
If you want to make it a tiny bit more "Gumby proof", define a variable to capture the rowcount so no-one put's anything between the INSERT and the capture of the rowcount.
Anyway, hope this helps... it's worked worked very well for me in the past when I got stuck with a legacy requirement like you did... especially since I said "No Way" to that God-awful sequence table.
p.s. Look Ma! No transaction!!
{EDIT} forgot the ORDER BY in the code...
--Jeff Moden
Change is inevitable... Change for the better is not.
June 19, 2009 at 8:32 am
Hey Jeff,
For added safety on 2K5 and 2K8 (pre-SP1) you need a MAXDOP (1) before the SCOPE_IDENTITY to avoid the bug described here: http://connect.microsoft.com/SQL/feedback/ViewFeedback.aspx?FeedbackID=328811
In the interests of fair play and to restore the balance of the universe, I feel I should defend the sequence table a bit. The sequence table is not always a horrible choice. Well implemented, it can have advantages, as I hope I can show. As always, there are very few absolutes with SQL Server: even with sequence tables...it depends!
A requirement I have come across more than once in real life, is for an application to request a block of IDs before doing any INSERTs. An example would be for a middle-tier which processes jobs on behalf of many clients. In that case, the application needs to have the IDs beforehand so it can use the correct values in all related insert operations (e.g. foreign keys). I am not aware of a way to achieve this with IDENTITY.
To cut down on network round-trips, such applications often pre-allocate a range of IDs, construct a sequence of transactions and batches, and then fire them at the SQL Server. I have worked on at least one large (third-party!) application that worked in this way. The 'sequence table' can be made to work well in that scenario.
My other concern, predictably I guess, is with the TABLOCKX. If we have a system with an appreciable number of tables using your technique (say a hundred for argument's sake) and several thousands of insert transactions per second, I think that serializing access to all tables in this way could well be problematic. (For clarity, one connection adding rows to one table would prevent concurrent inserts to all other tables in the database using this scheme.)
Also, there is always a transaction when a change is made to the database - even if it's not an explicit one. The script I am including in this post contains a demonstration of this by showing the log records written in each case.
Last, before the script, I should point out that IDENTITY can be a pig to work with when replication is used. Updatable subscriptions can be very difficult in this respect.
Anyway, onto the script. This doesn't directly cover the case under discussion, but I am hoping you will allow some leeway. The idea is to demonstrate how to use a sequence table to maximize concurrency, achieve excellent performance, and provide flexibility for both database code and external applications. The operations are also entirely atomic - without explicit transactions.
I should confess that I have never yet chosen to implement a sequence table in my own designs - IDENTITY is just so easy and convenient. Nevertheless, I believe it has its place, especially in multi-tier OLTP applications.
Script attached, because it's quite long, and the formatting...well I don't need to tell you.
Cheers,
Paul
Viewing 8 posts - 16 through 22 (of 22 total)
You must be logged in to reply to this topic. Login to reply