January 22, 2014 at 11:03 am
OK - maybe my Google-fu is off today but I'm trying to find out if using a sequence table will provide values that are as unique as the GUID is supposed to be.
The reason is that I'm building a DW from the ground up (they won't let me expense my booze which I think is grossly unfair 😉 ) and rather than using GUID it ~seems~ like a sequence table would give better results. My concern is that the values won't be unique across tables.
Has anyone tested this or have links to explain whether or not this is good practice? TIA
January 22, 2014 at 11:08 am
Can you provide more detail on exactly what it is you are trying to accomplish and what your problem(s) actually is/are? Also please provide 3 or 4 tables and some values to demonstrate your 'uniqueness' issues.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
January 22, 2014 at 11:42 am
I'm doing design work so there's no objects at this point. From my reading it seems that the values are unique per database (which makes sense) so if this was in MSDB the values would be unique across the server or at least that's how I'm interpreting it.
Besides uniqueness my other concern would be running out of values even with a BIGINT when dealing with a data warehouse.
Is this any help in pointing me towards answers?
January 22, 2014 at 12:03 pm
The problem with a sequence table isn't the uniqueness (it works), it's the contention you get when everything needs to access the single point. Because of that, GUID is usually an easier choice overall.
It's a bit wide to work with, but it's functional.
If you manage to overwhelm a BIGINT, btw, I'll be pretty impressed. The range of it is: (-9,223,372,036,854,775,808) to (9,223,372,036,854,775,807). That's some pretty significant big data you've got there.
What are you trying to accomplish with either of these methods though? The purpose of sequence/guid/identity is to allow for a surrogate key at the table level. It's rare you'd care about overlapping identifiers in different tables unless you've split an entity, at which point you should have a central table for that entity and if necessary secondary attribute tables with information posted for those entities that would need it.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
January 22, 2014 at 12:38 pm
Thanks Kraig. That's exactly the kind of thing I needed to know to help make the decision. I was looking at uniqueness and contention didn't come to mind. I got too excited by the thought of being able to cluster on a surrogate key based on a sequence table as well.
I would be impressed if we overran a BIGINT myself but with dropped values from the sequence table and some of the code that would use it I could see blocks of millions of numbers being skipped. Those add up.
The main problem this won't solve is the need for uniqueness across servers so this wasn't being considered for that one. GUID is the way on those.
January 22, 2014 at 12:51 pm
JustMarie (1/22/2014)
The main problem this won't solve is the need for uniqueness across servers so this wasn't being considered for that one. GUID is the way on those.
Ah hah! You're dealing with merge replication? You can seed the identities differently if you want to split the servers and keep it as integers.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
January 22, 2014 at 1:23 pm
JustMarie (1/22/2014)
Thanks Kraig. That's exactly the kind of thing I needed to know to help make the decision. I was looking at uniqueness and contention didn't come to mind. I got too excited by the thought of being able to cluster on a surrogate key based on a sequence table as well.
Are you talking about a custom sequence table or the SQL 2012 object type Sequence (as in CREATE SEQUENCE)?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 22, 2014 at 1:49 pm
For uniqueness across servers, you can also use a second value, which can often be just a smallint, that identifies the original server that assigned the id. That way, the combination of ( originating_server, sequence_number ) is always unique.
Edit:
GUIDs are almost as annoying as NULLs. It will lessen your alcohol intake if you avoid GUIDs whenever possible :).
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
January 22, 2014 at 2:24 pm
I STRONGLY recommend a 2-number scheme over GUIDs to ensure uniqueness for numbering across multiple servers. Bigint plus smallint is just 10 bytes vice the 16 you need for GUID. And if you only need <256 servers you can use a tinyint and save another value.
Another benefit to this is you can spread out your insert latch contention if you lead off with the serverid column in your clustered index. This assumes you even HAVE a latch contention issue. I think you won't. You are building a DW, which USUALLY means minimally logged BULK inserts of data from other machines, which should have very little problem with getting data in fast. I will assume you are going to be partitioned too, providing additional benefits for fast loading and ETL operations.
Oh, and losing a million numbers from a Sequence object isn't even a rounding error for a bigint. I think you can lose about 18 trillion million-row sets before you run out of numbers. 😀
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
January 22, 2014 at 2:53 pm
GilaMonster - I'm talking about the 2012 SEQUENCE object type. It's there, I wanna use it when practical. Sadly most of the existing systems can't/won't be changed so I can't get rid of the GUID values. I'm keeping an eye on new development tho.
I'll be sneaking sequence and tally tables onto the various servers as I get the chance. Then it's a matter of teaching people to use them when appropriate.
I like the server ID - identity value for the cross server uniqueness but it's too late to get that in place too. Another suggestion as we move forward.
As you can probably tell I'm here to create order out of chaos. Or clusters out of heaps. However you want to look at it.
Once again my profound thanks to everyone contributing to the discussion. It's great to see all the different ideas.
FYI - my boss said I could expense my booze and he would approve it but accounting wouldn't pay it. I try. 😛
January 23, 2014 at 2:46 am
JustMarie (1/22/2014)
GilaMonster - I'm talking about the 2012 SEQUENCE object type.
In that case, Kraig's comments on contention don't apply, he was talking about the practice of creating a table usually with an identity column which is then used to get sequential values.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 23, 2014 at 10:15 am
GilaMonster (1/23/2014)
JustMarie (1/22/2014)
GilaMonster - I'm talking about the 2012 SEQUENCE object type.In that case, Kraig's comments on contention don't apply, he was talking about the practice of creating a table usually with an identity column which is then used to get sequential values.
Aye, sorry about that. I'm still getting up to speed on 2012 and I missed which forum we were in. Apologies.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
January 23, 2014 at 10:27 am
Evil Kraig F (1/23/2014)
GilaMonster (1/23/2014)
JustMarie (1/22/2014)
GilaMonster - I'm talking about the 2012 SEQUENCE object type.In that case, Kraig's comments on contention don't apply, he was talking about the practice of creating a table usually with an identity column which is then used to get sequential values.
Aye, sorry about that. I'm still getting up to speed on 2012 and I missed which forum we were in. Apologies.
Actually I think you can still have hot-page-latch-contention if you are using 2012 Sequences to generate IDs that are inserted into a table where the ID is the Clustered Index. Everything will still go into the last page due to the (typical) monotonically-increasing-value of the sequence. This only occurs on very-high-frequency inserts though and as I noted before the OP stated this is a Data Warehouse application, for which very-high-frequency inserts are extremely unlikely unless it is a real-time-loaded DW.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply