September 14, 2012 at 4:26 pm
This is an old programming trick. When New York State used Social Security Numbers for student ids, they forced students to get an SSN or they assigned a dummy number with 9's or 6'as in impossible places in the string.
That's a horrible idea. SSNs can change, so I suggest avoiding them for keys.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
September 14, 2012 at 5:41 pm
Joe,
Can we reset the argument here about what we mean about a primary key and foreign key. I have read the intro sections to a couple of your books on Amazon to understand where you are coming from with some of this.
A Primary key is a field or combination of fields that will uniquely identify a row in a table.
A Foreign key is those same fields in another table that allow me to get back to the unque record.
granted that from a pure RDBMS design, if you have a natural key, you should use it in the related tables, but to be honest, I can't be bothered to carry a seven part key between tables, when it is easier to program, read and maintain if I stick an idientity column in the main table and use that as the FK. yes, I have to be careful if I need to rebuild the tables that I don't lose the referential integrity.
Often pure design does not stand the business demands of a changing application.
Example.
CREATE TABLE ProductGroup
(
GroupCode nvarchar(5),
Description nvarchar(25),
)
CREATE TABLE Product
(
ProductCode nvarchar(10)
ProductGroupCode nvarchar(5)
Description(25)
)
INSERT INTO ProductGroup (GroupCode,Description) VALUES ('PAIN', 'Pain Relief')
INSERT INTO ProductGroup (GroupCode,Description) VALUES ('BREAD', 'Bread')
INSERT INTO Product (ProductCode,ProductGroupCode,Description) VALUES('ABC123','PAIN','Paracetemol' )
INSERT INTO Product (ProductCode,ProductGroupCode,Description) VALUES('XYZ987','BREAD','Bagel' )
Suddently we open a store in Paris, selling the same products. I now need to internationalise the database so the new structure looks like this
CREATE table Country
(
ISOCode nvarchar (3),
CountryName nvarchar (20)
)
Create Table ProductGroupCountry
(
ProductGroupCode nvarchar(5),
CountryCode nvarchar(3),
Description nvarchar(20)
)
INSERT INTO ProductGroupCountry (ProductGroupCode,CountryCode,Description) ('PAIN','UK','Pain Relief')
INSERT INTO ProductGroupCountry (ProductGroupCode,CountryCode,Description) ('PAIN','FR','Peine')
INSERT INTO ProductGroupCountry (ProductGroupCode,CountryCode,Description) ('Bread','UK','Bread Products')
INSERT INTO ProductGroupCountry (ProductGroupCode,CountryCode,Description) (Bread,'FR','Pain')
Now all the french reports show the baguettes as being in the Pain product group, but in the database the productgroup is Bread and PAIN shows something completely different.
The maintenance and understanding of the data would have been easier if the baguettes had been in product group 2 - an agnostic foreign key with no direct meaning which is translated for the user through the relational tables.
I get the point about identity chaining, but will you concede that theoretical design sometimes brings with it complexities that can be avoided by taking a pragmatic approach.
September 14, 2012 at 7:28 pm
About DUNS...
Having no DUNS means for you "doing business with criminal organizations"? Do you really think that local Fish-&-Chips or Pizza shop and local ice-cream man on a bus with bells all have DUNS? No they not. Why? Because they are all criminals!
"create a dummy place holder with alpha characters, and give them X days to get a DUNS."
I did read it as suggestion to update PK, but then found even brighter idea - you drop these "temporary" records and create them with right Number (same you do with all related records) ...
CELKO (9/14/2012)
Really great! DNA sequence for PK!
Yes, it is coming in about 5 years. ....
And for now stick with emails?
It is highly unlikely that Amazon, using Oracle for their core RDBMS will use email as PK. They may and they do validate it to be unique but not a PK of an account. Would you recommend changing valid PK? Not a temporary one as you have suggested later for DUNS, but absolutely a valid one? I hope not. But Amazon allows you to change it! Just go to "Your Account", "Settings", "Change Name, E-mail Address, or Password"... And that's it. Don't try to tell us that they delete everything related to the old entry and recreate it all again with new "valid" PK
However, Amazon is really bad example here anyway.
At the end, we are talking about Person entity, which can be used anywhere, eg. in database which have noting to do with e-commerce. For example in Hospital, will you create fake email for a new-born baby and use it as PK, then wait for few years when baby will grow up enough to have real one? Or you will wait 5 years to get DNA technology? How will it work if the doctor (from "without boarders") goes to some Africa or Amazon jungles to help some remote villagers? Will you expect them to have DNA on a memory stick too?
You can comfort yourself with decades of royalties and sales figures, BUT
you are almost making yourself the laughing stock here.
September 14, 2012 at 8:48 pm
CELKO (9/14/2012)
Thank you for proving (once more) that you can't take the theory to the practice in a correct way. How can you suggest to use a random number controlled by someone else that you can't validate as a valid key for companies?
You can call D&B to validate it directly, but since it is a universal standard I usually call their banks, creditors and any government agencies that require it. It is like a VIN for a Vehicle or an ISBN for a Book.
That's even worse than an IDENTITY column which, by the way, you can control and predict.
When multiple users hit the same table, how do I know who got there first to get a particular IDENTITY? SQL is a set oriented language, so what happens when I insert a set?"INSERT INTO Foobar VALUES (a),(b),(c);" will be assigned its IDENTITY values depending on the internal state of the machine, the current indexing, etc.
The major point of the IDENTITY prosperity is that it takes care of itself and you do not have control or knowledge.
How are DUNS part of the LOGICAL data model? and how do they differ from another proprietary, non-relational auto-numbering system?
Re-read the parts about a trusted source.
I'm sure you are aware the problems on using emails as a key for people. For instance, at the moment I have 5 different email accounts, all valid. Would I be 5 different people? What about people with no email address? or even better, what about services as 10 minute mail? Your design is a total failure and you have to do it again.
Then how does Amazon, et al stay in business, with this "total failure"? I do not need to track precisely one human body; I need to track the role played by the entity within my data model. Haven't you seen corporate email addresses used for this? "purchasing_department@Umbrella_Corp.com" is often used today, but the message has to have some more validations in the message.
There's no real natural key that can be used internationally for these examples without real problems. You might know as well that Primary Keys are different from Clustered Indexes, the way to chose them will be different even if they tend to be the same.
There is always a natural key, but there is not always a usable key :-). That is why we have ISO.
Oh, yes, VINs, good idea except it is only unique during a 30 year time span. That would be a great choice for a primary key to represent cars from the 1960's to now.
September 14, 2012 at 8:53 pm
CELKO (9/14/2012)
That's a horrible idea. SSNs can change, so I suggest avoiding them for keys.
It was worse than that! First, the damn thing does not have a check digit, so you have honest transpose errors at a 3-5% rate.
Thanks to illegals (removed unnecessary and potentially inaccurate statement). there is a huge duplication rate because everyone bought the same card from their coyote in states like California and Texas.
It is now illegal to use SSN for identifiers in many cases. I do not know what NY state is doing these days; GA Tech used SSN when I went there, but has a "buzz number" now. It makes it a screaming pain to get a transcript because the old system is off-line.
Yes, and the Social Security Act that created the SSN actully stated that this number WAS NOT to be used for identification purposes!
September 15, 2012 at 8:02 am
CELKO (9/15/2012)
Oh, yes, VINs, good idea except it is only unique during a 30 year time span. That would be a great choice for a primary key to represent cars from the 1960's to now.
Wrong; it got fixed in 2010 by expansion. Don't you read the DETROIT NEWS or the American Automotive Engineers newsletters? It is right there next to your copies of TRASH (the trade magazine of recycling) and Concrete Dealer's News ;-). Hey, I scan ISO documents about ear tags for sheep.
They kept the 17 character format to avoid mechanical problems, but the model year is now good until 2039. The plans at this point are to go to 18, 19 or 20 characters before then. The idea is that we will have better tech and will not have to stamp VINs into metal. We laser etch diamonds today, so why not a car? And there is talk of RFID, etc.
The history of the SSN is a warning about bad encodings. Not only is it commonly used as an identifier, they killed validation programs over the years only to replace them with another. The error rate is between 5-8% and no one is sure. My favorite story is about a wallet sold in LIFE magazine in the 1950's that came with dummy inserts. There was a photo of an imaginary family (keep it, they looked better than your real one) and the artist's own SS card. This was one of the numbers they had to retire because of dups.
No, I don't read read the Detroit News, nor do I read American Automotive Engineers newsletters. I have no reason to read these. As for TRASH and Concrete Dealer's News, I don't read those either. I guess you are the only regular reader of those.
September 17, 2012 at 4:57 am
CELKO (9/14/2012)
At the end, we are talking about Person entity, which can be used anywhere you are almost making yourself the laughing stock here.
You might believe in Kabbalah numbers, but as I have said, I like the first law of logic. The Law of Identity: "A is A", wrongly attributed Aristotle or Ayn Rand, depending on your age or education. "To be is to be something in particular; to nothing in particular or to be everything in general is to be nothing at all"
Person is too general a class to have an identity.
You read too many Kabbalah books...
Ok, I guess it has nothing to do with my "age" of education. As none of the above is right. As I remember, the first reference to this "law" was found in Plato's writings about Socrates dialogue with Theaetetus about colour and music or sound?, cannot check, wiki is down. Aristotle, almost 100 years later, in his Metaphysics book did explore this subject, but as I'm aware, he never claimed that "A = A" as it has nothing to do with "fact that a thing is itself," Aristotle never claimed "a thing is a thing", at least explicitly. Really strange application of philosophy classics, where the logic here?
Person is not always too general. It can be very specific. Person can specify an Individual Human. Let's try this model:
Let say CRM system which may have Users (Employees of the company of CRM system) and Client Contacts. The business nature is a such that the same Person in his working life can be active contact of one or multiple client companies (let say chairman of few enterprises), and also, can became (or was) the Employee/User sometimes.
So, the Personal information you would have can be quite minimal: title & name - no private emails, no private phones, addresses and other private details. The person can be from almost any country in the world (so no local SSN numbers may be always available, as many countries have no Social Security at all!). However in CRM system, you would like to track and reflect the history for the Person as a Human, you want to see that the Man used to be Contact with Company A, then he was Employee of the CRM system company, then left and became contact for Company B and Company C.
To do so, you would want to have the Person details as separate entity. So, Contact and Employee entities can refer to the same Person record
So, what to use as Natural Key for Person here? Email? - not possible - he may no t give you the private email and business one can change to often. Phone? - the same as email. And no one, in their right mind, including Amazon, ever use such things for Primary Keys in RDBMS. You will need to come up with something artificial. Whatever you will come up with, alpha-numeric code or random numbers aka DUNE or Credit Card numbers with check digits or without inside, will have no much difference to using simple IDENTITY, as they still will not make Natural Key! But you may wait for 15 years, the Good has a project running, he will uniquely bar-code everyone for J.Celko convenience.
For example in Hospital, will you create fake email for a new-born baby and use it as PK, then wait for few years when baby will grow up enough to have real one?
In the US, the baby gets a Social Security Number at birth now!
Try Google Map, you will find, to your amusement, that there are just few more countries on the planet Earth than US.
Or
Do you suggest designing non-portable system? What about if Mitt wins the next one? He may easily shut SS down... Oops, he may replace it with Offshore-Tax-Residence Number for some and Looser Number for others, so we will just have composite PK then :hehe:
Or you will wait 5 years to get DNA technology? How will it work if the doctor (from "without boarders") goes to some Africa or Amazon jungles to help some remote villagers? Will you expect them to have DNA on a memory stick too?
Google up the ATM machines for illiterates in India that use fingerprint scanners. The DNA chips were being developed in part for "remote villagers" that cannot use keypads. Do you doubt that within my lifetime I will see Smartphone fingerprint apps?
No, I don't doubt that. I also allow myself to think that in your lifetime you may even see iIron, iScissors and artificial iBrains all powered by MS Windows operation system with nice name of Gugle.
Also you may see the Apple changed to Pear and many other wonderful changes...
But I kind of in doubt of your ability to reason your answers, including the above one. I did also really liked the suggestion of Amazon using emails as PK for customers in their RDBMS. :w00t:
You can comfort yourself with decades of royalties and sales figures, BUT you are almost making yourself the laughing stock here.
And you have made yourself into a Troll 😛 That is really hard to do when you attack someone as abrasive as me.
I don't feel that you really is abrasive as such... I do think that there are some other words around which are bit more appropriate and descriptive. :hehe:
I wonder why you didn't explored the discussion about "Amazon using emails as keys" after I did mentioned, that you can easily change it on its website... 😛
Sorry, my language can also be quite stinging, but I don't think it qualifies me to be a Troll more then yours. 😉
September 17, 2012 at 8:08 am
CELKO (9/14/2012)
First, email is so obviously not a natural key for a human being that I'm always shocked when people try to use it that way. It's fine as a login string (username), in many cases, but for anything else it's worse than using names. And the problem with using them as logins is that this helps promote security-reuse, which is one of the biggest keys to identity theft. So it's not just bad database design, it's also a security hole, and assists criminals.
That is why you add security after the email address gets you into the database. Has Amazon failed yet?
Second, DUNS numbers are no more a natural key for a business than an incremented ID is. It's just a random number assigned by someone else instead of an incremental number assigned by your database engine. Arguing that this makes it superior somehow, makes zero sense. That's even ignoring the fact that millions of businesses don't have or need DUNS numbers.
Replace "DUNS" with "VIN" and you will see your error. I can verify the VIN at the DMV, my insurance company, etc. I can verify the DUNS at government agencies and banks. That is the nice part of an industry standard.
Both concepts also assume that all data will be for current-era entities. DUNS numbers only go back to 1963. E-mail for common use, even in tech-heavy countries like the US or Japan, only goes back about 10 years, maybe 15. Human activity that might need to be tracked in a database goes back considerably further than that.
I worked for the Getty Museum; I am more aware of calendars and old stuff than I wanted to be 🙂 Why do you think this is a problem for commercial users?
I could list hundreds of more reasons those are poorly chosen PKs. They have their uses, but claiming universality to them is just ignorant.
Hey, I am the guy who does not believe in Kaballah numbers! Read the parts about particular identifiers for particular entities, trusted sources, etc. I am not claiming universality. These are the ones I use because I am commercial database guy on the Internet.
The IDENTITY, GUID, etc people think their magical numbers are universal. They tag squids, automobiles and Lady Gaga, without looking for an appropriate industry standard, or designing an encoding that will do the job. They are building pointer chain databases in SQL.
Joe, you need to look up "ad hominem" and then have your wife beat you with her teaching stick until you realize that "ad hominem" != "Joe wins". 😛
Yes, DUNS allows for external identity verification. I'm not advocating "don't use DUNS numbers in your database". You're reading it that way, but that's you reading words I never wrote.
The point is, a natural primary key is something inherent to the entity being described. For example, if you want to have a table of colors, your PK could consist of a few physical values that describe the way it reflects/refracts/absorbs light. Physics and photographers have already done that work for you. It's physically inherent to what we mean by the word "color". Even if different people see the color different ways, which is significantly common, the database still has the data in a universal way. Even an alien civilization that uses a whole different visible spectrum or operates on senses other than sight, could use that table. A DUNS number is NOT inherent to the business, any more than a driver's license number is inherent to a person. There is no datum within a human scope that fits the definition of "uniquely identifies a business and is a native descriptor of the business". There is no natural key for human groups, business or otherwise. None.
The advantages to DUNS numbers that you cite, apply equally to passport numbers. They can be verified by government offices, they're internationally recognized, etc. But I bet you wouldn't consider them a good idea for being the primary means of identifying a person.
The primary objection is that not every business has one, nor does every business need one.
What you've done is found a surrogate key that you like better than IDs, and has a few practical advantages (and a few serious disadvantages), you've gotten excited about it, and you've become an advocate of it. It's still a surrogate key. It's still an arbitrarily assigned number. It just feels more exciting to you, that's all.
And you're right that DUNS and VIN are comparable. Highly similar. They have similar drawbacks: DUNS goes back to 1953, VIN to 1954, businesses/vehicles outside that time-dimension will break your data if you need to accommodate them; they're arbitrary in that they don't actually describe the object being identified; they aren't human-friendly and are prone to data transmission errors through voice-over-phone and similar methodologies. DUNS is weaker than VIN because VINs are mandatory (in the US and many other countries), and have been for quite some time (1981 in the US), while DUNS numbers are not legally mandated. DUNS is stronger than VIN in that DUNS is a single international standard, while VINS vary from country to country to a certain extent. DUNS is weaker in that it's dependent on a single company (single point of failure), while VINs are not.
And, no, I probably wouldn't use VIN as a primary key for a cars database. Last Saturday (2 days ago), I was at a classic cars show. There were a significant number of vehicles there that predated the VIN standard. I see that show every month (I volunteer at a booth), so I'm more than usually aware of the fact that there are tens of thousands of vehicles in Florida alone that can't be put in your database. You'd have to put NULL in the primary key for them, based on your design = broken design.
And, since you seem to have a tendency to assume mendacity/criminality on the part of anyone who disagrees with you, no, these are not stolen vehicles.
As for your question about Amazon failing security-wise: Yes. Think "recent, heavily published hacking case involving Amazon, Google, human engineering, and an IT professional having his Apple products bricked". Here's the story: http://www.wired.com/gadgetlab/2012/08/apple-amazon-mat-honan-hacking/
There, I think I've covered every point you raised.
And you didn't respond to the basic assertion about e-mail, which is its horrible weakness as a primary key for human beings. I still maintain that using it that way, in anything but a database of e-mail addresses (would be useful for a spam shop), is just incredibly ignorant.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
September 17, 2012 at 8:39 am
...There were a significant number of vehicles there that predated the VIN standard. I see that show every month (I volunteer at a booth), so I'm more than usually aware of the fact that there are tens of thousands of vehicles in Florida alone that can't be put in your database. You'd have to put NULL in the primary key for them, based on your design = broken design.
...
Joe suggested, in one of his previous posts in this thread, that he will create "temporary" record with a "temporary" key then wait until the real one will be established. As he does hope for a long lifespan, he may not only see ATMs in remote India villages which will authenticate villagers via DNA saved on memory-sticks (or body embedded chips) but also that US authorities will re-issue VINs for every car ever been made in US or ever stepped on American road...:hehe:
...
And you didn't respond to the basic assertion about e-mail, which is its horrible weakness as a primary key for human beings. I still maintain that using it that way, in anything but a database of e-mail addresses (would be useful for a spam shop), is just incredibly ignorant.
I don't think he will respond. Amazon allows email used for login to be changed! So, there are only three choices:
1. Amazon uses Email as PK, but allows updating it, which make them J.Celko certifiable in regards of choosing the key, but crap designers as PK should not be changed
2. Amazon uses Email as PK and J.Celko technique to resolve the problem when it need to change. They treat all data related to it as "temporary"! So when user changes its "temporary" email to the "real" all "temporary" staff is cleaned out and re-established as a proper permanent data. With a little caveat... From this point Amazon starts to treat proper permanent data as a "temporary" one to allow user to chose even more "real" or "proper" email.
3. Amazon doesn't use Email as PK.
Now what is your choice? See, Joe cannot choose #3 so he has nothing to reply about it....
😉
September 17, 2012 at 10:32 am
And you didn't respond to the basic assertion about e-mail, which is its horrible weakness as a primary key for human beings. I still maintain that using it that way, in anything but a database of e-mail addresses (would be useful for a spam shop), is just incredibly ignorant.
But then, why does it work in e-commerce? Got a better one? I seem to be using other credentials (MS, Google, LinkIn, et al) more and more.
You are lying or trolling here!
Not MS, not Google not LinkId and not, your first example, Amazon - no one except may be you and few very inexperienced RDBMS designers/developers would try to treat EMAIL address as Natural Key suitable for PK purpose.
As I have pointed at least couple times here: You can go onto Amazon and change your login Email easily - it will not create new account for you and all your other details eg. order history will be intact.
September 17, 2012 at 10:45 am
I still doubt that ecommerce sites use your email address as a primary key. This is a volitale attribute subject to change, and isn't one I would use. I would us a surrogate id as a primary key as the user (customer) would not see it or be able to change it. This would allow them to change their email address(es) whenever they want without losing any information.
September 17, 2012 at 10:58 am
Lynn Pettis (9/17/2012)
I still doubt that ecommerce sites use your email address as a primary key. This is a volitale attribute subject to change, and isn't one I would use. I would us a surrogate id as a primary key as the user (customer) would not see it or be able to change it. This would allow them to change their email address(es) whenever they want without losing any information.
Agree. Amazon (rightfully) hides it from their users, but I'm certain they have a customer number they assign in their db tables.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
September 17, 2012 at 11:05 am
CELKO (9/17/2012)
Wikipedia says that as of 2011, more than 2.2 billion people – nearly a third of Earth's population — have email. I seldom work with a target audience larger than that, do you?
No ... but irrelevant.
The real q is: will I need to work with a person who does not have an email address?
And that then becomes a business q: are you willing to refuse business to anyone who does not have, or will not provide, an email adddress? Many companies actually do this now.
For a customer, I would still rather have my own internal id that I know will never change. I can keep their email address(es -- if I need a history) w/o making them any type of permanent key (although of course I may index them).
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
September 17, 2012 at 11:08 am
CELKO (9/17/2012)
Both concepts also assume that all data will be for current-era entities. DUNS numbers only go back to 1963. E-mail for common use, even in tech-heavy countries like the US or Japan, only goes back about 10 years, maybe 15. Human activity that might need to be tracked in a database goes back considerably further than that.
Agreed. I had those problems with a brief consulting job at the Getty Museum. This is the troll's “Yes, but what if..” game that continues to raise objections to anything proposed. The usual techniques are ask for something very specialized (“fingerprints? What about people without hands?”) or overly generalized (“how do you put a RFID on every insect on earth? RFID will never work!”) or absurdity (“What about Martians?”) or never proposing an alternative.
I seldom work with companies that went out of business before 1963; do you? Wikipedia says that as of 2011, more than 2.2 billion people – nearly a third of Earth's population — have email. I seldom work with a target audience larger than that, do you?
The point is, a natural primary key is something inherent to the entity being described.
Fine. I like things I can measure when I physically have an entity. But these are usually non-key attributes, like weight, color, etc. Keys are a bit different.
For example, if you want to have a table of colors, your PK could consist of a few physical values that describe the way it reflects/refracts/absorbs light.
A color is a non-key attribute; it is a measurement. And there are a ton of color scales. If you are in the printing industry, you use Pantone; computer people use RGB; other industries use teh Land color number; etc.
The advantages to DUNS numbers that you cite, apply equally to passport numbers. They can be verified by government offices, they're internationally recognized, etc. But I bet you wouldn't consider them a good idea for being the primary means of identifying a person.
LOL! I carry my passport with me at all times, have a birth certificate card in my wallet, Social Security & Medicaid cards, a Texas Driver License and one bank card.
The primary objection is that not every business has one (DUNS), nor does every business need one. What you've done is found a surrogate key that you like better than IDs, and has a few practical advantages (and a few serious disadvantages), you've gotten excited about it, and you've become an advocate of it. It's still a surrogate key. It's still an arbitrarily assigned number. It just feels more exciting to you, that's all.
Doesn't anyone read Dr. Codd any more? DUNS is not a surrogate key by definition. You might call the DUNS an artificial key that happens to be global. And yes, I do get excited about industry standards I can use any place on Earth. I think that is much better than the count of physical insertion attempts to one table, in one database, on one machine in one particular SQL product. I see this as more than a “personal preference” ; I fundamental differences.
And you're right that DUNS and VIN are comparable. Highly similar. They have similar drawbacks: DUNS goes back to 1953, VIN to 1954, businesses/vehicles outside that time-dimension will break your data if you need to accommodate them; ..
The troll argument again! Go to extremes!
they're arbitrary in that they don't actually describe the object being identified;
Keys have to identify the entity, not describe it. Big difference. Does “the City of New York” (the legal name) describe it? No, it identifies the entity. In fact, using the full legal name has implications about how I am using the data.
And, no, I probably wouldn't use VIN as a primary key for a cars database. Last Saturday (2 days ago), I was at a classic cars show. There were a significant number of vehicles there that predated the VIN standard. I see that show every month (I volunteer at a booth), so I'm more than usually aware of the fact that there are tens of thousands of vehicles in Florida alone that can't be put in your database. You'd have to put NULL in the primary key for them, based on your design = broken design.
So just how do these vehicles get registered in Florida? Get insurance? In some states they are regarded as a special legal classification and they are kept in another database. Mixed serial numbers are a real problem. I worked on the auto tag system for Georgia decades ago
And you didn't respond to the basic assertion about e-mail, which is its horrible weakness as a primary key for human beings. I still maintain that using it that way, in anything but a database of e-mail addresses (would be useful for a spam shop), is just incredibly ignorant.
But then, why does it work in e-commerce? Got a better one? I seem to be using other credentials (MS, Google, LinkIn, et al) more and more.
Joe, I'm surprised. You just made the case for IDENTITY with this quote:
they're arbitrary in that they don't actually describe the object being identified;
Keys have to identify the entity, not describe it. Big difference. Does “the City of New York” (the legal name) describe it? No, it identifies the entity. In fact, using the full legal name has implications about how I am using the data.
2 points here. First, What does an Identity value do other than uniquely identify an entity? Second, anyone who uses a string of "the City of New York" as a PK would be foolish (Your wife's teaching stick ready to whack you?)... What about "New York City" or "City of New York" or even "The City of New York" or "New York" (part of "New York, New York"). If your PK is a string value like that, you have to take into account context, capitalization, and variations in the spelling/grammar... most of which are based on geographic and environmental conditions outside the scope of your database.
September 17, 2012 at 11:15 am
...
I never worked for Amazon
...
I don't think you have ever worked for any e-commerce company, otherwise you would know that none uses email as PK.
And it is fine to change a Primary Key, but all references to it must reflect the change. For example, in the US, we moved from the old ISBN-10 to the ISBN-13 in the book industry a few years ago. It was a bitch, but more for the front end programmers who had to change screens and printed forms than for the database guys. When we were done, books became part of the EAN system, so POS systems could handle them without special programming...
Based on this one-off exercise you are advocating design which will have this feature of changing PK in-built into system? I don't know how you managed to get your SQL books published...:w00t:
I can give you one other, very nice example for which you can try to come up with nice Natural Key: Order entity/table. Yes, just a simple and very common part of e-commerce database table of Customer Orders.
Check your own Amazon order history first to see what they are using as Order Number, before suggesting anything more stupid than Emails for Customer Accounts (originally "Persons")...
Good Luck!
Viewing 15 posts - 46 through 60 (of 125 total)
You must be logged in to reply to this topic. Login to reply