March 17, 2006 at 9:28 am
> It is perfectly acceptable to have the state name in the person record, instead of a stateid
Will you have list of states in your database?
Will you allow to enter free text as name of state?
Who gonna fix typos in the field? Or it's perfectly cceptble to have couple of customers born in Texus and Calofornia?
And finally, what users suppose to enter as "State of Birth" for immigrants and where they suppose to store their Place of Birth and description what is this place (Province, State, Region, etc.)?
_____________
Code for TallyGenerator
March 17, 2006 at 9:29 am
Mark,
I was just making a point about storing attributes with the entity. I believe you agree that the 3 table solution is way overnormalized
I would probably declare person record as:
create table Person(PersonID int, BirthStateCode char(2))
But frequently I use the 2 table solution because the State table can have additional attributes besides Name such as 2 letter code, capital city, region of the country etc...
March 17, 2006 at 9:46 am
What I will "usually" do for states in an application (again, it depends), is have a state table:
create table State(name char(2))
and put the state in the record itself (to use Jeff's example):
create table Person(PersonID int, State char(2))
The application will (usually) have the referential integrity, as the user can only choose from a list populated from State table. But the Person table does not need to maintain a useless link to State table, there is (usually) no other information about states that the system cares about. If there is addiitional information added at some time later, the existing value can now becomes a foreign key. But there is little use in having SQL Server verify the state exists in the State table every time that I update something else on the Person table (as SQL Server verifies all foreign key relationtionships on every update).
So, yes, Jeff, I certainly agree that the 3 table solution is way overnormalized. I also believe that the having wasteful int ID's is overnormalized. If there is one value to the foreign table, use it in the entity table.
Sergiy:
Will you have list of states in your database?
Yes
Will you allow to enter free text as name of state?
No.
Who gonna fix typos in the field? Or it's perfectly cceptble to have couple of customers born in Texus and Calofornia?
n/a
Mark
March 17, 2006 at 8:23 pm
So, you do have separate table for states anyway.
And for referential integrity you must have FK constraint to State in Person table.
Because you limit capabilities of your system only to 50 states you don't need int StateId, tinyint (1 byte) is more than enough.
Small number of states will allow server to cash it in memory and reuse the values in every select - good for performance.
So, what you gain from your approach?
And there is no need to mention about total disaster for you system with char(2) key approach after installation on SQL Server with Spanish default collation.
P.S. You did not answer how do you store information about place of birth for immigrants? Or you serve only those who was "born in the USA"?
_____________
Code for TallyGenerator
March 18, 2006 at 7:52 am
Why are you focusing on an example that may have been poorly constucted for an truely international system. However, there are system in place that only deal with with people that only live in the US. Would it make you happier if we went back and edited every post made with BirthState to BirthCountry and a Country table? Maybe then you could answer the question about storing attributes of an entity.
March 18, 2006 at 2:22 pm
Because char(2) definitely not enouph for country.
What will code "AR" say to you about name of the country?
And if a person lives in USA it does not mean this person was born in USA.
_____________
Code for TallyGenerator
March 20, 2006 at 4:01 am
Excuse me to interrupt your discussion, but earlier you where asking whether somebody could imagine a well normalized table with more than 10 columns?
Hmmm. As we all know, a db which is in clean 3NF is elegant, but extremly slow. So, I would say, I have a table like that: A product property table. It contains the usual ID, SN, Description, stuff and about 30 to 40 physical properties like length, diameter, volts, watts etc. They are all unique to this product. Of course, some products share a voltage or have a similar power consumption. How would I improve my DB-Design by extracting this 'redundant' information? They COULD be put into several tables to please 3NF-Enthusiasts, but *what for*? I need all the columns anyway almost all the time. They are properties solely for this particular product and only 'by accident' they do share some combinations of width, diameter and luminescence.
When I read your arguments, I believe that you are not really interested in tuning a database to maximum performance, but rather to an elegant award winning design.
Finally, I think you are right in every aspect, as well as your 'oponent', but imho not connected to real life situations...
March 20, 2006 at 6:48 am
Did you actually check it? Did some tests?
If this table is more comfortable to read for you, it does not mean it always faster to operate for SQL Server.
_____________
Code for TallyGenerator
March 20, 2006 at 7:15 am
To be honest with you I doubt a join over oodles of tables beats a single select, even if the join is via a clustered index. Naturally it ALLWAYS needs more overhead (I am sure you are aware of the internals of MSSQL).
Only on connections with a narrow bandwidth you could benefit of clean 3NF...
Concerning the readability, I would even create a view for my personal pleasure...
Imho there will always be a compromise between clean 3NF and performance. You just can't have both.
To make it short: I can imagine a few (not a lot) of wide tables carrying more than 10-20 columns. For simplicity I would allow this for developers.
But the discussion here only started with this statement of yours, the rest (especially coding US-states into tables) is 100% correct and I would never directly code information into a table like that: Suppose USA takes over the world (or any other extraterrestical areas), the 2-character state coding would immediately bomb out. Just imagine renaming the states.
March 20, 2006 at 7:57 am
You better test before you say it.
Especially when you need to update 1 value in a row.
I also cannot imagine screen with 40 values on it. So, you don't select the whole record, only part of it. In this case select from 2 tables will be faster.
And if you need to search for a value you need to have indexes. What about 40 indexes on one table? Comparing to 3 on two tables?
_____________
Code for TallyGenerator
March 20, 2006 at 8:28 am
1. I tested it. joins do take time.
2. These values hardly ever get updated. They are set and done: New product property=> new entry.
3. You can't imagine tables with more than 40 values on it. Nor can I. But technicians do. They LOVE wide tables which don't fit on screens. Same time, they want it printed out on one page. Readable. Now THATS a problem. If I load the product onto a screen (not a single row), I need all properties. The difference in loading one row with ALL the data ONCE or selectively loading parts of it (TableA, TableB etc.) is close to zero. And still, I could still select part of ONE table 😉 I don't have to select * from it.
4. Your last point is silly (sorry to say). IF I want indexes on ALL columns, I would need the same if I'd split the table. If I want to search for all properties (fast), I would still use one table: With 3 rows: ProductID, PropertyID, Value. Then I in fact need three indices and that's it. But still: ONE table 😉
Again, it's a product table: Which company has more than say 5000 products in their portfolio? There is nothing bad in scanning a couple of thousand records in the rare case somebody searches a product by properties... If this would become a daily task, I would might consider changing the design.
Again: The theory is nice, the practice bitter.
I think a good (DB-)programmer should be aware of when to ignore rules. A bad programmer does not even know rules.
If I'd experience drastic performance issues when loading one huge table, I'd definetly split it until I get it right.
I usually design the DB to be as elegant as possible.
Then implement everything.
Then fill in a lot of test data. Really a lot!
Then test with the performance and find the bottlenecks.
Then create some redundant lookup tables to speed up daily tasks.
Viewing 11 posts - 31 through 40 (of 40 total)
You must be logged in to reply to this topic. Login to reply