September 23, 2005 at 9:37 am
I think we’re veering off the highway and heading for a swamp, but I can’t explain how or why. Help!
There’s this “UserPreference” table, along the line of:
TABLE User (UserID, UserName)
TABLE UserPrefernce (UserID, Col1, Col2, Col3, …Col121)
…where the various “Cols” have clear and precise names used in one or another corner of the application—stuff like website layout, reporting options, client URLs, welcome messages, client operational defaults. As they are columns in the table, these values get set for every user, though some users might never use them (maybe they don’t have access to that web page or that report).
Over time, more and more columns have been added to this table, making it more and more unwieldy. This happens because, as new applications and functionality is developed, new user-settable default and/or configuration values are generated, and they have to be stored somewhere… so why not add a few more rows to UserPreferences? (The table could maybe have been split into multiple tables, such as WebPreferences, ReportPreferences, etc., but this has not been done.)
To manage this, the following table design has been proposed:
TABLE Preference (PreferenceID, Name, Description, DataType)
TABLE UserPreference_new (UserID, PreferenceID, Value)
…so you’d add preference “GreetingMessage” to Preference, and for each user who might require it you enter the appropriate UserPreference for that user in UserPreference_new. The “Value” is a varchar, and “DataType” really just a reference to what ought to be in there (since “dynamic data type domain integrity” cannot be enforced). This would change us from a “wide shallow” table to a “thin deep” one.
The thing of it is, we did this at the last job I was at. It started out cool and useful, and ended up one a tremendous pain in the a** … because people (not just the developers, but modelers and designers and even requirements writers who’d heard about it) would toss most any old thing in there, and some things (state values, key values, even address information) just doesn’t belong in such a table.
My problem is that I can’t explain this clearly, concisely, and (most importantly) convincingly. I know that in many situations this is a really bad idea, but I can’t explain why well enough to convince people. It also seems like it is a valid situation in some situations for some data, but I cannot clearly define to my own satisfaction what those circumstances would be.
Most importantly, this has to be a wheel—a classic database design issue that’s come up time and time again… and I don’t know the working name or title for this, and so can’t readily look anything up on the subject.
Any links to prior discussions or references to erudite database theory would be greatly appreciated!
Philip
September 26, 2005 at 8:00 am
This was removed by the editor as SPAM
September 28, 2005 at 6:16 pm
If your developers, designers, and requirements weenies are allowed to "toss any old thing" into your database, you don't have technical problems, you have people problems. Why do they care in which table it gets stashed, as long as they can read or update it efficiently when necessary? I've done the name-value pair technique as well, right up until SQL Server decided that 300 connections hitting this one table simultaneously for every web page refresh was a lot of work and slowed the whole web site down in protest. This generated lots of "donated" overtime for moi, and lots of stomach acid. If I were you, I would explain to them how indexes work, how indexing the PreferenceID column will only get you so far, how by putting all the values in a generic "Value" column will lose the value of strong datatyping, and how things seem to be performing fine right now, but you can't guarantee how long that will continue and when it does go south, it will be a very dark day filled with much gnashing of teeth and rending of garments and the cries of the afflicted will fill the valley. Or words to that effect. Throw in mention of "violation of Third Normal Form" (don't forget to capitalize) and referential disambiguation, and they should be putty in your hands.
September 28, 2005 at 10:39 pm
Sound advice, and I'll give it the try. The uphill curve is that I'm new at this company and thus the "other" DBA (of two), and my opinion doesn't carry all that much weight yet. I've hit the "name-value pair" [good title!] conundrum before and know the pain it can induce, but that'e experience that I'm having trouble clearly communicating.
I believe that they're reluctant to add further rows to the table because of the implementation--Java, using Hibernate as the data access layer. (I'm still a bit vague on how this tool works, but to-date I'm not overly impressed.)
And it's not so much that we have people problems as it is lack of (DBA) control over database design. If we open a door in the database like this, I just know that eventually that cat will drag in something dreadful.
Philip
September 28, 2005 at 10:41 pm
Sound advice, and I'll give it the try. The uphill curve is that I'm new at this company and thus the "other" DBA (of two), and my opinion doesn't carry all that much weight yet. I've hit the "name-value pair" [good title!] conundrum before and know the pain it can induce, but that'e experience that I'm having trouble clearly communicating.
I believe that they're reluctant to add further rows to the table because of the implementation--Java, using Hibernate as the data access layer. (I'm still a bit vague on how this tool works, but to-date I'm not overly impressed.)
And it's not so much that we have people problems as it is lack of (DBA) control over database design. If we open a door in the database like this, I just know that eventually that cat will drag in something dreadful.
Philip
September 28, 2005 at 10:43 pm
Grr. Rural satellite wireless connection = double post.
September 29, 2005 at 10:29 am
Don't do it! This is a train wreck waiting to happen. This is essentially the same thing that I refered to as a MUCK table in this article:
http://www.sqlservercentral.com/columnists/dpeterson/lookuptablemadness.asp
Also see the discussion of this article. Ignore the article itself, look at the discussion...
http://www.sqlservercentral.com/columnists/rnarasimhan/databasedesignandreferencetables.asp
This is a bad idea in EVERY situation.
/*****************
If most people are not willing to see the difficulty, this is mainly because, consciously or unconsciously, they assume that it will be they who will settle these questions for the others, and because they are convinced of their own capacity to do this. -Friedrich August von Hayek
*****************/
September 29, 2005 at 11:27 pm
Phillip: I feel your pain. I, too, am the 2nd banana in a two DBA shop. And I really believe that you have a people problem if they won't let you manage the database as you see fit. Unfortunately, there is probably not much you can do about it. You just have to "stay on message" as they say in the White House until they either see the value of having a pro look after their bytes, or learn to trust you. Good luck.
October 1, 2005 at 4:55 pm
Good stuff here! Thanks to everyone that has posted! I have a question along these lines:
What about a company that sells lots of different products such as shoes, food, cars, dresses. Should there be a product table for each product type or should there be a wide product table to accommodate all the different properties these products will have. I can see now that the “name-value pair” design would be bad – based on the posts here. It was a compelling way to go. How about a central product table that feeds out to custom product tables?
Here is another problem with the same issues: An aircraft manufacture wants to put all of the parts that go into the airplane into a Parts table, but the properties of the parts very widely. Should each part-group get its own table? This is very similar to the first example.
I am guessing there is a “best practices” here and I would love to here it! Thanks for your time!
October 1, 2005 at 10:13 pm
A model that should work in these cases is that of type-subtype. A key component is that there is a set of properties shared by all the items you are tracking, but the items fall into groups of subtypes with their own distinct properties. You have a "main type" table, and one "sub type" table for each type that has properties unique enough to not belong in the main table.
The textbook example is frequently vehicles. The main table ("cars") contains properties pertinent to all vehicles (engine type, mileage, fuel type, etc.), and you add subtypes for each type you're tracking (cars have numbe of doors, motorcyles don't, trucks could have maximum cargo space/weight, and so on.)
Design-wise, the main table has a primary key, and each subtype table uses that key as both its own primary key and a foreign key to the main table. No primary key value appears in more than one sub-type table. This, alas, cannot be forced at the relational integrity level, and so must be enforced either with code (triggers, stored procedures, application logic) or wishful thinking.
The above is a very brief intro based on theory, not experience--I've never implemented a proper type/subtype model. I have, however, dreamed of them, as implementing such logical constructs as "independant" tables rather than type/subtypes can lead to very painful code down the road. Definitely a subject to read up on in books on database design. A suggestion: start a new thread here on this subject and see what people with practical experience advise.
Philip
October 3, 2005 at 9:24 am
Philip
Thanks for the quick reply. I have seen your solution in OO programming before. I don’t know why I didn’t think to apply that to databases. I am sure that this method is the way to go now that I have thought about it.
Thanks for your time
October 19, 2005 at 11:55 am
I'm almost done with a system that does exactly what you are talking about. So far it all works great. I am having only one problem that I have not been able to solve which revolves around how to work with parts like a Tire that can be placed front or rear.
How to deal with this when a tire is just that a Tire. A tire has no front or rear attribute to it since you can generally place a tire on the front or rear of a vehicle. You will only run into this problem when you want to keep track of where a customer wants to have the tire placed on their vehicle. Say someone wants to place one brand of tire (or size/type/etc.) on the front and another on the rear.
So when you have a Tire table that has the tires attributes and you then have something like an Assembly table that stores the parts selected how do you know which of two tires in the Assembly table go where on the build vehicle?
Anyone have any ideas/thoughts on this?
October 19, 2005 at 7:41 pm
I would have two fields:
Front_Tire
Rear_Tire
Both bit type. Both in your Tire table and in your Assembly table. If the tire is rear only then it gets a 1 in the Rear_Tire field and a 0 in the Front_Tire field. Reverse for a front only tire. If it goes in both it gets a 1 in both. Same for the Assembly table.
October 20, 2005 at 12:28 pm
I'll tell you a little more about what I did. Instead of adding part type specific fields into the assembly table which is a generic part store, meaning it holds all types of parts together; I added a single field called FrontOrRear. This FrontOrRear field can have the following values ('F', 'R', 'S', 'B'). The values are fairly straight forward in that they stand for (Front, Rear, Set, Both). So when parts are inserted into the Assembly table, I called it a BuildItem table because Assembly conflicts with the .net framework namespace/classes, they get inserted with the value appropriate for where the part is to be used in the physical object.
This solved the first problem that I ran into. The second part of this problem is how to deal with some of the different part types that I have. Say for example I have a Brake, well a brake is a part that has in my case identical attributes front and rear, which is why I didn’t create FrontBrake and RearBrake tables, but its shape and size are very different between the front and rear. In my system I have a Brake and the attribute FrontOrRear with values that are always either (‘F’, ‘R’, ‘S’). There aren’t any brakes that would have a value (‘B’) because there isn’t a case where a single physical brake could be used in either the front or the rear. This part is easy to deal with because whenever the brake gets inserted into the BuildItem table the correct value for BuildItem.FrontOrRear can always come directly from Brake.FrontOrRear.
The other part Tire is a little more complicated as a tire more often has the FrontOrRear values of (‘B’, ‘F’, ‘R’), (‘B’) being the most common. So when the value is most always going to be (‘B’) we can’t use Tire.FrontOrRear for BuildItem.FrontOrRear because we won’t know if the inserted tire is meant for the front or rear of the finished physical object. What we have to do is somehow know from which DropDownList the inserted tire came from and pass the value of (‘F’) or (‘R’) into the BuildItem.FrontOrRear column. This is where I am now with the problem.
I feel like I’m ranting a bit and I’ll stop here and let anyone else add their input.
October 20, 2005 at 12:39 pm
It sounds like you need to differentiate between your parts table which contains the attributes of the parts and a work table which indicates which parts are to be installed (and where) for a given work order etc...
/*****************
If most people are not willing to see the difficulty, this is mainly because, consciously or unconsciously, they assume that it will be they who will settle these questions for the others, and because they are convinced of their own capacity to do this. -Friedrich August von Hayek
*****************/
Viewing 15 posts - 1 through 15 (of 15 total)
You must be logged in to reply to this topic. Login to reply