May 17, 2010 at 2:01 pm
The application that I am going to build is going to have hundreds of static data tables. Since virtually all of these tables are going to have the same fields I am considering creating one table that has a field called LookupType to distinquish between the entries in the table. Before I get started I was wondering if there are any performance concerns since the data in this table will be selected quite a lot. I should also explain that 99.9% of the time the code that accesses this table will only do selects.
Thanks
May 17, 2010 at 2:39 pm
This is a widely debated topic. Here's a good link to reference. There's a link within that thread to a good article on the topic, but the thread dialog itself is worth a read.
I personnaly prefer to create seperate tables for the same reasons that Grant mentions in the thread.
May 17, 2010 at 3:39 pm
I have never understood the attraction of having one table with many different types of entities.
As mentioned, it is a violation of rules of normalization, so that is a sign immediately that you will be having trouble.
It makes it very difficult to implement referential integrity, since all references would have to be to a two column key, and require you to have both in the child table with a constraint on the type.
All queries would have to have knowledge of the type column, so it doesn’t really save time in coding, and usually makes it more complicated.
You will get bad query plans, since the lookups will be against a large table with many different types of entities, instead of tables with a small number of rows.
Basically, you would be throwing away many of the advantages of the relational database for a false savings in table design time that you will lose many times over in query complication and poor performance.
May 18, 2010 at 7:28 am
Michael Valentine Jones (5/17/2010)
I have never understood the attraction of having one table with many different types of entities.As mentioned, it is a violation of rules of normalization, so that is a sign immediately that you will be having trouble.
It makes it very difficult to implement referential integrity, since all references would have to be to a two column key, and require you to have both in the child table with a constraint on the type.
All queries would have to have knowledge of the type column, so it doesn’t really save time in coding, and usually makes it more complicated.
You will get bad query plans, since the lookups will be against a large table with many different types of entities, instead of tables with a small number of rows.
Basically, you would be throwing away many of the advantages of the relational database for a false savings in table design time that you will lose many times over in query complication and poor performance.
I want to thank both you and John for the information you provided me. It does give me quite a bit to think about. Before I continue I thought I would post a partial schema so I could ask a few questions:
CREATE TABLE dbo.Lookups
(
ID int IDENTITY,
LookupType varchar(50) NOT NULL,
Description varchar(100) NOT NULL,
SortOrder int NOT NULL,
)
I have limited experience with this, so I guess I also have limited vision as to the problems. I am not sure that I understand the 'two column' problem of referential integrity as you stated it. If the ID field serves as the Foreign Key in other tables, why do I lose referential integrity?
Am I correct that many processes reading from one table does slow things down more then these processes hitting many tables?
Thanks
May 18, 2010 at 9:53 am
meichner (5/18/2010)
...I have limited experience with this, so I guess I also have limited vision as to the problems. I am not sure that I understand the 'two column' problem of referential integrity as you stated it. If the ID field serves as the Foreign Key in other tables, why do I lose referential integrity?...
The referential integrity problem is that without references to both columns, you would have no way to ensure that a table is referencing a row of the correct type. So you might have a foreign key for something like order type that was referencing a column for something that is meant to be colors, product type, store type, or other completely unrelated things. So you can end up with a order type of "Green" or "Red" or "Kilograms" or anything else in the lookup table.
Do yourself a favor and forget doing a One True Lookup Table design. It will only cause you trouble and it will never really save you any work. There is absolutely nothing wrong with having hundreds of lookup tables.
May 18, 2010 at 10:20 am
We had some folks that were pushing pretty hard for this type of design at a company that I used to work for. I had the final design say-so so we went with multiple lookup tables, one for each type. One of their arguements for the single lookup table was that they did think they should have to go to multiple places to view the different lookup data. So I created a view that joined together all of the lookup tables. The view was used by the analysts and business folks to look at the lookup values accross the system, but the physical table design incorporated multiple lookup tables to ensure integrity and normalization.
The view unioned the lookup tables together and hard-coded the lookup type column like this:
SELECT 'Type 1' as LookupType,
CAST(LookupValue as varchar)
FROMLookupTable1
UNION ALL
SELECT 'Type 2',
CAST(LookupValue as varchar)
FROMLookupTable2
UNION ALL
SELECT 'Type 3',
CAST(LookupValue as varchar)
FROMLookupTable3
UNION ALL
SELECT 'Type 4',
CAST(LookupValue as varchar)
FROMLookupTable4
May 18, 2010 at 11:04 am
Michael Valentine Jones (5/18/2010)
meichner (5/18/2010)
...I have limited experience with this, so I guess I also have limited vision as to the problems. I am not sure that I understand the 'two column' problem of referential integrity as you stated it. If the ID field serves as the Foreign Key in other tables, why do I lose referential integrity?...
The referential integrity problem is that without references to both columns, you would have no way to ensure that a table is referencing a row of the correct type. So you might have a foreign key for something like order type that was referencing a column for something that is meant to be colors, product type, store type, or other completely unrelated things. So you can end up with a order type of "Green" or "Red" or "Kilograms" or anything else in the lookup table.
Do yourself a favor and forget doing a One True Lookup Table design. It will only cause you trouble and it will never really save you any work. There is absolutely nothing wrong with having hundreds of lookup tables.
I have to say, that's a pretty good point.
May 18, 2010 at 11:07 am
John Rowan (5/18/2010)
We had some folks that were pushing pretty hard for this type of design at a company that I used to work for. I had the final design say-so so we went with multiple lookup tables, one for each type. One of their arguements for the single lookup table was that they did think they should have to go to multiple places to view the different lookup data. So I created a view that joined together all of the lookup tables. The view was used by the analysts and business folks to look at the lookup values accross the system, but the physical table design incorporated multiple lookup tables to ensure integrity and normalization.The view unioned the lookup tables together and hard-coded the lookup type column like this:
SELECT 'Type 1' as LookupType,
CAST(LookupValue as varchar)
FROMLookupTable1
UNION ALL
SELECT 'Type 2',
CAST(LookupValue as varchar)
FROMLookupTable2
UNION ALL
SELECT 'Type 3',
CAST(LookupValue as varchar)
FROMLookupTable3
UNION ALL
SELECT 'Type 4',
CAST(LookupValue as varchar)
FROMLookupTable4
I want to thank you and Michael for the advice. It looks like I could save myself some work in the short term and possibly create a headache for myself down the line.
Thanks
May 19, 2010 at 5:07 am
John Rowan (5/18/2010)
We had some folks that were pushing pretty hard for this type of design at a company that I used to work for . . .
That view is an excellent idea. :smooooth: I will consider having one in every database design from now on. Thanks.
May 19, 2010 at 6:36 am
John Rowan (5/18/2010)
So I created a view that joined together all of the lookup tables. The view was used by the analysts and business folks to look at the lookup values accross the system, but the physical table design incorporated multiple lookup tables to ensure integrity and normalization.
... not to mention to minimize contention at table level. Nice solution, model it right, expose it as users want to see it 😉 Brilliant!
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.May 19, 2010 at 8:24 am
Another good reason for not using the One True Lookup Table is the lack of flexibility. Your requirements right now mean that all those different lookup types have the same table structure, but what happens if requirements change and you have to add extra columns to just one of those lookup types? Things can get very chaotic, very quickly.
May 19, 2010 at 9:01 am
paul.knibbs (5/19/2010)
Another good reason for not using the One True Lookup Table is the lack of flexibility. Your requirements right now mean that all those different lookup types have the same table structure, but what happens if requirements change and you have to add extra columns to just one of those lookup types? Things can get very chaotic, very quickly.
Yes, and there are many other potential design problems. For example, you might have a need to have one type reference another type, or might have a need for two types to be joined in another table to give a list of valid combinations of types.
May 19, 2010 at 11:12 am
Thanks Nice and Paul. Creating that view for them really gave me some flexibility for future design decisions. I was at a company where the developers and analysts had a huge say in database design and the DB ended up a mess. Showing them this design technique helped me get buy-in from the team in my phsical model and was an important step in gaining the trust I needed to do my work without being second guessed at every turn.
June 25, 2010 at 5:49 pm
we are facing a similar issue and will definitely be using separate tables, but we have several applications each with separate databases. data is shared between applications via web services that use the Id values of these common lookup tables in the contracts. we absolutely need RI constraints, which kills the ideal solution which would be to place all tables in a central database and add views to each application database with the same name of the lookup tables. All databases, at least for now, will be on the same sql server instance. Triggers are out, as is trying to enforce RI via the Business layer of applications or repository tiers. The best we have been able to come up with is using single master transactional replication to transfer the specified subset of common tables needed by each app to it's database. This allows us to do backups and restores of the database at a logically consistent point in time, rather than have to worry about restoring multiple databases to a single point in time. I don't like this solution but is the best we have been able to come up with. Consolidating all databases into one is not an option for a variety of reasons. These tables will be fairly small and change rarely. Any suggestions?
June 28, 2010 at 10:56 am
Replication is usually the best option for maintaining multiple copies of reference data accross databases. Your only other option would be to design your own replication functionality. You could design a method to 'publish' reference data from your master database and use Service Broker to move that data accross to the subscriber DBs via XML messages. This would allow you to control when the data is sent accross, but it would require you to design an entire process and underlying DB objects such as tables, remote SB objects, etc. This is doable, but using replication would be so much easier.
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply