October 14, 2010 at 6:44 am
- Defined as IDENTITY because the column is guaranteed to be unique within the table
That's not even correct.
October 14, 2010 at 6:52 am
hallidayd (10/14/2010)
- Defined as IDENTITY because the column is guaranteed to be unique within the table
That's not even correct.
:blink: please do elaborate.
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.October 14, 2010 at 6:57 am
Sorry - don't want to take this off tack, but a column defined as IDENTITY is not "guaranteed to be unique within the table". Only a unique index\ constraint (or some hacky trigger\ UDF) can guarantee that.
October 14, 2010 at 7:02 am
Actually, I'm not taking it entirely off tack. My correction is born of a similar observation\ pet peeve as yours. Far too often I see primary key\ IDENTITY\ Autonumber (in the case of Access) used as though interchangeable. They are, of course, not.
October 14, 2010 at 7:05 am
hallidayd (10/14/2010)
- Defined as IDENTITY because the column is guaranteed to be unique within the table
That's not even correct.
Ouch, yes that is a bit of a glaring error. I've seen production databases with duplicate values in IDENTITY columns, which had nasty consequences. Feedback sent.
October 14, 2010 at 10:58 am
Craig Farrell (10/13/2010)
However, I don't work in the SQL standard. I work in MS SQL Server and T-SQL. Because of that, and because an index can give me performance where a constraint will not, but still takes up the same system space and engine mechanics as a declared index under the hood, I would prefer the index.
Good heavens, has someone introduced a major new performance reducing feature into the MS SQL Server optimiser so that it no longer uses the index that supports a UNIQUE constraint in the way it has used it for at least the last 10 years? That would be horrifying if true :sick: ! But I don't believe it 😀 .
Tom
October 14, 2010 at 11:43 am
PaulB-TheOneAndOnly (10/14/2010)
Still amazes me - yet it doesn't surprise me anymore - how this "surrogate Vs natural key debate" goes on and on.
It neither surprises nor amazes me - there are so many people talking nonsense about it that lots of other people will be confused. Then you get the relational "fundamentalists" versus the relational "purists" versus the ISO SQL Standard worshippers versus the real SQL in (pick the dbms of your choice) brigade each chipping in with their take on the issue - and given that these groups can't even agree a story on something as simple as NULLs it's to be expected they will not agree on something more complex like surrogate keys.
Please let me start by stating my position on the issue which is: "Use Natural keys whenever possible, use Surrogate keys whenever necessary"
That's something that should be dinned into every relational database developer and administrator and architect and designer before they are allowed to practise the DB trade - provided "possible" is interpreted as meaning "both possible and reasonable". It has strong support in all the camps I mentioned above (even a relational "fundamentalist" like Fabian Pascal takes pretty much that position, which is what makes it easy for a pragmatist like me to agree with him).
Having said that I think I've figured out why this "debate" gets particularly intense in the SQL Server universe - it has to be with clustered indexes on identity columns, please allow me to explain further.
Let me quote Microsoft recommendations on "Clustered Index Design Guidelines" where it reads...
With few exceptions, every table should have a clustered index defined on the column, or columns, that offer the following:
- Can be used for frequently used queries.
- Provide a high degree of uniqueness.
- Can be used in range queries.
. . .
Generally, you should define the clustered index key with as few columns as possible. Consider columns that have one or more of the following attributes:
- Are unique or contain many distinct values
- Are accessed sequentially
- Used frequently to sort the data retrieved from a table
- Defined as IDENTITY because the column is guaranteed to be unique within the table
I personally find these guidelines sound but I also think some DBA construct them as "you must have a clustered index on an identity column" missing a couple of key words in Microsoft's documentation like "with few exceptions" and "columns that have one or more of the following attributes".
My interpretation of the guidelines goes like: a clustered index is helpfull if chosen column is used in certain ways and is already defined in certain ways - which may (or may not) include the fact of already being of the identity datatype.
The guidelines aren't really sound; for example "can be used on range queries" really means "make things which are used close together live close together in the index" which, in an OLTP system, may have nothing to do with range queries (for example if I have a small numer of interactive users and a system where menus are constructed from text in a database, it's a good idea to cluster on language - if someone's using French menus he or she will probably continue to do so, and having the same text in 104 languages in cache just because it's been pulled in in French isn't useful: there's no range query involved there); the same text example blows away the "high degree of uniqueness" guideline (if it said "sets of columns" which had that property it would be sound, but it says "coumns" not "sets of columns"). The statement about identity columns is just plain false (duplicates can and do occur if uniqueness is not enforced by a constraint).
Hope this brings fresh air to the so called debate - I know it's almost impossible to end it 😀
I think it's about as likely to end as our longest thread.
Tom
October 14, 2010 at 1:22 pm
Tom.Thomson (10/14/2010)
"can be used on range queries" really means "make things which are used close together live close together in the index" which, in an OLTP system, may have nothing to do with range queries (for example if I have a small numer of interactive users and a system where menus are constructed from text in a database, it's a good idea to cluster on language - if someone's using French menus he or she will probably continue to do so, and having the same text in 104 languages in cache just because it's been pulled in in French isn't useful: there's no range query involved there); the same text example blows away the "high degree of uniqueness" guideline (if it said "sets of columns" which had that property it would be sound, but it says "coumns" not "sets of columns"). The statement about identity columns is just plain false (duplicates can and do occur if uniqueness is not enforced by a constraint).
I have made this point (far less well) several times on another forum. I have promised several times to write some proof but never have. I think this is the first time I have read someone express the same opinion\truth\ignorant nonsense**
** delete according to your brigade allegiance.
October 14, 2010 at 1:36 pm
Tom.Thomson (10/14/2010)
The guidelines aren't really sound; for example "can be used on range queries" really means "make things which are used close together live close together in the index" which, in an OLTP system, may have nothing to do with range queries
I have a part-written blog post on why clustered indexes are NOT the best indexes for range queries.
Must finish. Some day.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
October 14, 2010 at 1:39 pm
GilaMonster (10/14/2010)
I have a part-written blog post on why clustered indexes are NOT the best indexes for range queries.
Must finish. Some day.
Soon please!
Tom
October 14, 2010 at 1:42 pm
This post has gotten very long but a very informative exchange of information. 😎
For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
October 14, 2010 at 1:55 pm
Really? Three pages barely counts as a warm up for a surrogate Vs natural key thread 😉
October 14, 2010 at 2:56 pm
David Portas (10/14/2010)
I don't understand your claim that an index alone provides better performance than a uniqueness constraint. Can you back that up with an example? I'm pretty sure you'll find they are always exactly the same, in which case the constraint still has all the advantages I mentioned before.
And:
Tom.Thomson
Good heavens, has someone introduced a major new performance reducing feature into the MS SQL Server optimiser so that it no longer uses the index that supports a UNIQUE constraint in the way it has used it for at least the last 10 years?
I'm going to duck out the discussion in general, but since I got called on this directly... You're absolutely correct. I've been carrying a really bad myth around with me for years and never got called on it because the conversation never came up before on that specific aspect.
Thank you both.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
October 14, 2010 at 3:02 pm
Craig Farrell (10/14/2010)
I've been carrying a really bad myth around with me for years and never got called on it because the conversation never came up before on that specific aspect.
Queue up another blog post for me to write sometime....
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
October 14, 2010 at 3:04 pm
GilaMonster (10/14/2010)
Craig Farrell (10/14/2010)
I've been carrying a really bad myth around with me for years and never got called on it because the conversation never came up before on that specific aspect.Queue up another blog post for me to write sometime....
Heh, please do. I think I have about 20 or so older clients that I need to send some corrected code... :facepalm:
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
Viewing 15 posts - 16 through 30 (of 38 total)
You must be logged in to reply to this topic. Login to reply