February 1, 2011 at 4:47 pm
I have had constant issues with my boss on proper database design. He is constantly saying that everything belongs in one table. That way apparently he can just look into one place and find everything. He does not like using joins. I am trying to battle this situation and cannot seem to present my case in a logic way that strictly states his way is wrong ha. I'm pretty sure its wrong anyway, and all of my co-workers who work with the database and dataset agree.
Here is the main case that I would like some help with.
We have one table called [Readings] it stores readings from equipment measuring things out in the field like air-speed, polution levels for various contaminants. And each of these readings has a different method which can measure associated properties. So the logic of table structure is like this:
Measured_Entity_Table > Method_For_Measurement_Table > Results_Table
Here every table is related 1 to many going left to right. At the moment, we have a bunch of generic columns in the Results_Table called P1, P2, P3... etc. These can hold various floats, ints, and nvarchars. Here is some examples of setup data
Measured_Entity_Table
1 Dust
2 Toxins
Method_For_Measurement_Table
1 Dust_MeasuredByFilter
2 Dust_MeasuredBySuperMeter
3 Toxins_MeasuredBySuperMeter
The issue is when you can have two separate entities measured using similar methods (ie using the super meter which gives you a toxin and dust reading). These are inserted as 2 separate rows into the results_table and have additional data that is associated with both of the readings. Sometimes this additional data is too much to fit into the 'P' value fields of the readings table, so our bosses solution is to make another method in the Method_For_Measurement_Table called SuperMeterAdditional and then to fill the remaining additional data into that record. This kind of breaks the logic as to what gets held together as a specific sample.
What is best practises to solve this problem?
Would you not using something like a linking table that links readings together as a sample and then stores other associated data in a more specific table to that type of sample? The reason we don't do this is because it makes the database less generic and more specific to just that one method sample.
On another note, doing it this way means that several different readings can be linked to one and only 1 reading-taken-date. Hence when a user uses the front end web application attached to this database they can update just one row and one date per sample. If the date is stored in the readings table it becomes duplicate data and there is the potential to update one date and not the corresponding dates, making the sample corrupt.
This all stems from making the database as generic as possible and using a front end web application which mainly just looks directly into the results table and allows the user to update pretty much any column in that table. It has caused tons of nighmares and I am looking for a solution some where.. anything really, advice, your design flaws you have learned from, anything to help.
If you wish I can write some sample data and table structures to better describe the issues?
Thanks
February 1, 2011 at 5:43 pm
loki1049 (2/1/2011)
I have had constant issues with my boss on proper database design. He is constantly saying that everything belongs in one table. That way apparently he can just look into one place and find everything. He does not like using joins. I am trying to battle this situation and cannot seem to present my case in a logic way that strictly states his way is wrong ha. I'm pretty sure its wrong anyway, and all of my co-workers who work with the database and dataset agree.
It usually doesn't follow standard design techniques to do it that way, but there's other considerations. In the end your boss signs your checks, so remember to keep things in perspective about what is worth angering your boss over. If he designs non-standard techniques are his best practice, there's really not a lot you can do about it but make your case, and then smile and nod while implementing his way.
So, moving along:
Here every table is related 1 to many going left to right. At the moment, we have a bunch of generic columns in the Results_Table called P1, P2, P3... etc. These can hold various floats, ints, and nvarchars. Here is some examples of setup data
The idea of generic columns and overloaded columns are known problems, especially when dealing with reporting. Just look at the issues of using a WHERE ISNUMERIC() as an example of having problems during conversion of data types. SQL doesn't guarantee any order, and this is one item you can bring to his attention.
The issue is when you can have two separate entities measured using similar methods (ie using the super meter which gives you a toxin and dust reading). These are inserted as 2 separate rows into the results_table and have additional data that is associated with both of the readings. Sometimes this additional data is too much to fit into the 'P' value fields of the readings table, so our bosses solution is to make another method in the Method_For_Measurement_Table called SuperMeterAdditional and then to fill the remaining additional data into that record. This kind of breaks the logic as to what gets held together as a specific sample.
What is best practises to solve this problem?
One of the things this can cause is bad data. Let's say in your meter example, P5 is the 'volcanic dust' level. On the _additional row, though, P5 is the # of workers on the site. Besides making this hell to keep track of from the database level, if you try to aggregate the data for this you have made optimization more difficult because of another layer of complexity in restricting the rows allowed in the group by.
Would you not using something like a linking table that links readings together as a sample and then stores other associated data in a more specific table to that type of sample? The reason we don't do this is because it makes the database less generic and more specific to just that one method sample.
Your last offer to show solid sample data and DDL would be more useful here. Not seeing the actual data/structure, it makes it harder to offer sweeping discussions, agreeing or disagreeing, with the current process. All we can offer are generalities here.
On another note, doing it this way means that several different readings can be linked to one and only 1 reading-taken-date.
This makes sense to me...
Hence when a user uses the front end web application attached to this database they can update just one row and one date per sample.
... as does this...
If the date is stored in the readings table it becomes duplicate data and there is the potential to update one date and not the corresponding dates, making the sample corrupt.
... Then this should be proc wrapped, and not dynamic sql or cursor based front end accessed.
This all stems from making the database as generic as possible and using a front end web application which mainly just looks directly into the results table and allows the user to update pretty much any column in that table. It has caused tons of nighmares and I am looking for a solution some where.. anything really, advice, your design flaws you have learned from, anything to help.
If you wish I can write some sample data and table structures to better describe the issues?
Please do, we can offer more specific advise that can both fit your needs, and possibly assist you in keeping your boss content, with more knowledge. As it stands, there's not a lot we can specifically offer except generics.
Mostly my concern here are the overloaded fields. P1 being varchar for this, float for that, kind of issues. The rest, well, I can't say I disagree with him offhand. Sometimes a generic-ness to data volume DOES ease querying and coding. It allows for new structures (tests, in your case) to be included without major IT work. It does need some standardization to how it's layed out though.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
February 1, 2011 at 5:50 pm
I Tend to agree with Craig, and I'd like to see more information about what the entities are, what you are tracking, and what it means in relation to other entities. That way I can offer better advice on design.
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply