July 26, 2011 at 10:20 am
This may ignite a firestorm of passionately held beliefs, but I'd still welcome hearing some opinions.
In the case of a table with multiple foreign keys into auxilliary, lookup tables, are there significant advantages or disadvantages in shovelling everything into one big table? The lookup tables all contain different subject matter, genus, species, stratigraphic unit names, country names and so on. The contents are not related, but they all contain roughly the same sort of data - text of one or several words (usually only one word), never more than a few dozen characters, not linked to anything else.
I now have main records like:
Primary key (int), text1, text2, text3, .... then FKey1 (int), Fkey2 (int), Fkey3 (int), ...
where each FKey is the autonumber ID from one of the lookup tables. If I put all that stuff together into one table, added a field specifying what kind of data it is (the equivalent of the former separate lookup table names) and used filtered and indexed views into this one large table, I would at least have a simpler structure in the database (one table instead of twenty).
What other impact would such a design change have (besides a lot of work to change over)?
July 26, 2011 at 10:28 am
Are we talking about a Transactional Database or a Database setup for reporting purposes (data warehouse / OLAP)
If it's the former then follow the steps of normalisation through to at least 3NF and you will have a decent design. There are many benefits for doing this and they have been proven time and time again.
If its a data warehouse for reporting, then things change slightly and it is perfectably acceptable to lump the data into one big table (within reason of course).
July 26, 2011 at 10:53 am
steveb. (7/26/2011)
Are we talking about a Transactional Database or a Database setup for reporting purposes (data warehouse / OLAP)
I'm not sure which category this fits. I've read up on OLAP cubes repeatedly and still don't really understand them. But this DB is nothing very complicated. It's a fairly simple inventory system. An item (a rock, it's for a paleontological museum collection) has a unique ID number and a bunch of properties - what kind of critter it contains, where it was found, who found it, where it is stored and so on. These are the lookup tables. It's a pretty lightweight application - less than a dozen total users and rarely more than one or two at a time. Response time for searches is much more important than time necessary for insertion of new records. Does that help?
If it's the former then follow the steps of normalisation through to at least 3NF and you will have a decent design. There are many benefits for doing this and they have been proven time and time again.
I do have it normalized to that point and it works well. I'm continually twiddling things as I become better at SQL Server, to improve response time and add new features that users request, but I'm also looking at design changes that would make things cleaner overall, with an eye towards the future. This entire system is fairly successful, in that it's been in use for over a decade in various incarnations, and there is no indication that it will become obsolete any time soon. Quite the opposite - it's constantly growing, and people are constantly asking about new ways to use it and requesting new features.
If its a data warehouse for reporting, then things change slightly and it is perfectably acceptable to lump the data into one big table (within reason of course).
I guess data warehouse is probably a fairly good description, but I don't want to throw everything into one big table - that would be a huge mess. In fact, that's how this entire project started off, as one Excel table and one Access table. Much of my work on it has been to merge the original 'everything' tables and convert the result into a properly structured design. All I'm wondering about right now is whether merging the lookup tables into one is a good idea or not.
July 26, 2011 at 10:57 am
That kind of structure will work but it gets nasty. As you said there are as many as 70 groups of data. Imagine the query when you need to get 20 of those....
Select genus.description as Genus, species.description as Species, mutation.description as Mutation, family.description as Family, FirstName
from People
join mstrxref genus on genus.ID = People.GenusID and genus.type = 'Genus'
join mstrxref species on species.ID = People.SpeciedID and species.type = 'Species'
...
You can see how confusing and difficult it can become very quickly. This is spoken directly from experience. I built a system with a mstrxref table like the one you are doing exactly once. There is nothing actually inherently wrong with it but it is not fun to maintain/debug.
The worst thing is that you are forced to alias every single column and the list of tables joined is super fugly!!!
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
July 26, 2011 at 11:39 am
Sean Lange (7/26/2011)
That kind of structure will work but it gets nasty. As you said there are as many as 70 groups of data. Imagine the query when you need to get 20 of those....
Select genus.description as Genus, species.description as Species, mutation.description as Mutation, family.description as Family, FirstName
from People
join mstrxref genus on genus.ID = People.GenusID and genus.type = 'Genus'
join mstrxref species on species.ID = People.SpeciedID and species.type = 'Species'
...
You can see how confusing and difficult it can become very quickly. This is spoken directly from experience. I built a system with a mstrxref table like the one you are doing exactly once. There is nothing actually inherently wrong with it but it is not fun to maintain/debug.
The worst thing is that you are forced to alias every single column and the list of tables joined is super fugly!!!
Well, that's the kind of joins I have now, to each of the tables. The aliasing problem would be handled by using the filtered/indexed views in the join, wouldn't it?
I thought that at least I might save myself some hassle in the future, since instead of a new table, I could just add a new value to the definition field. Although, now that I think about it, adding a new view is not that much simpler than just adding a new table, and your experiencing such hassles with it are not a good recommendation.
My current design works - I'm just a perpetual tinkerer and always looking for ways to do something better, but in this case, maybe I should leave well enough alone.
July 26, 2011 at 11:49 am
The other thing that can kind of ugly is when you no longer have just varchar data. You suddenly start being forced to cast/convert stuff. By all means don't just take my word for it but in my opinion it is a can of worms I don't want to open again. 😀
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
July 26, 2011 at 12:57 pm
One table for an Inventory Database? If you attempt it you will regret it.
http://www.informit.com/articles/article.aspx?p=30646
For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
July 26, 2011 at 1:08 pm
I've seen lots of databases that use this as a solution to simplify things. It works great for a while, then as more and more categories get added, it gets slower and slower, more and more complex to deal with, and eventually has to be broken out into separate tables anyway.
If you're certain the data will never grow significantly, and that no other developer will ever have to figure out what the heck you've done here, and that you'll never have a significant number of concurrent users, this kind of solution can be made to work.
If you aren't 100% positive on any of those factors, then keep it split out. After all, less tables in the database isn't actually a benefit, it just looks tidier.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
July 26, 2011 at 5:45 pm
Ok, thanks everyone for the input. Guess I'll leave it the way it is.
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply