January 5, 2006 at 12:55 pm
In my opinion CELKO's approach is a purist view of how the database should be build and has nothing to do with real life.
For example you have a database of clients and their employees.
Q: What whould be the natural index for an employee table.
A: There is none because i real life people change jobs (so any demographics will not work), in many cases the info you could use for PK is missing (SSN for example, because this is not you business what my SSN is - so if you want to do business with me forget about my SSN) and if you try to pull multiple columns together to create a natural index - GOOD LUCK.
So if you want to know my opinion - I always use IDENTITY as a primary key for linking tables internally because it is faster then any 'natural keys' and gives you a way out if your natural primary key becames 'naturally' non-unique.
PS. As in life there are exceptions to always and never.
---------------------------------------------
[font="Verdana"]Nothing is impossible.
It is just a matter of time and money.[/font]
January 5, 2006 at 1:10 pm
Another consideration is the type of database application. Is it OLTP ? Is it a data warehouse ? Dimensionally modeled ?
Consider dimensionally modeled warehouses or marts - you have a fact table with 10's of millions of rows, linked to many dimension tables. Each Dimension most likely has a 'natural key' (eg product number), however, you need to track slowly changing Dimension attributes over time so a 'natural key' will repeat - so it can't be the primary key. So do you make the primary key the natural key plus the effective date ? Now what size is your primary key ? And you need to reference that from your huge fact table. Try doing that with 10 dimensions in a 250 million row fact table and what size are the collective foreign keys in your fact table ? How is your query performance ? I'll take 4 or 8 byte identity as a surrogate key for each dimension, please.
January 5, 2006 at 1:39 pm
There's also a maximum "throughput" on INSERTs, but I forget what that number is. Far greater than what most of us see, though.
K. Brian Kelley
@kbriankelley
January 6, 2006 at 9:33 am
I posted this on another thread. I am just copying it in here because it is also relevant to this thread.
An identity can make a great primary key for multiple reasons
1. It is great for joining. I would hate to have a 3 column composite key that was migrating all over my db. The SQL would be horrible to write and it would be slow.
2. It is better for indexing than a string or composite key.
3. Cascading the update across the whole db is slow if part of the natural key changes.
If there is a good natural key, by all means use it. If not, use an identity. It does not void referential integrity because you will not end up with orphaned data.
One more point is that let's say you could use two of the existing columns as a PK. They are both strings, so it would be slow to join them and a waste of space to migrate them to all of your child tables. Create a PK as an identity then create an index or constraint on those two columns that enforces uniqueness as an alternate key.
This is obviously a point many people argue on both sides of and I just wanted to post so people could see valid cases and uses of an identity as a pk.
Here is another article about this: http://www.aspfaq.com/show.asp?id=2504
January 6, 2006 at 11:03 am
Joe Celko is a purist and while technically correct (probably) he has a "forceful" way of expressing his views.
Viewing 5 posts - 16 through 19 (of 19 total)
You must be logged in to reply to this topic. Login to reply