April 24, 2007 at 8:45 am
OK, so besides being unsightly is there any real problem with a table that has 422 columns?
we're starting a normalization project at work here, we have an exisiting sql2005 database that was originally access-->2000-->2005. It's kind of a mess.
This particular table is holding all floats, except for its pk/fk. It is a relatively well used table for lookup by COLUMN name.
Is it worth breaking it up into 8(?) smaller tables, if nothing else it would fit on a single database diagram.
I've had large tables that I've designed where, with maybe 40-50columns I would split into 2 and tie with a pk/fk reference.
Just wondering if it's worth the effort, since this table isn't alone for its size.
Thanks,
Chris
April 24, 2007 at 9:01 am
As a consultant would say: It depends. If some columns are accessed or updated more frequently than others, it may make sense to vertically partition the table. It really comes down to a design/business decision.
April 24, 2007 at 10:58 am
Normalization rules would dictate that if every value in that column is a specific and different attribute of the entity then you should keep them there. So long as you're in 3NF, given the information you've provided, I don't see a good reason other than managability to split it out. I'd generally say think more about row size in bytes but if it's all float (8 or 4 bytes) you should still be within your requirements.
You might look at enforcing more specific data types than float, however. Floating-point data is an approximate and isn't best used when the data does have an exact value (i.e. I've seen float used to store monetary value... eek!).
April 24, 2007 at 12:47 pm
I agree with Aaron if you are going to keep your database in 3NF. Again, however, I'd also still say "It depends". Yes you can keep the database in 3NF, but sometimes you also have to make design decisions that violate normalization rules. This requires additional work, and needs to be balanced against what you are doing.
This is why I brought up about some data being queried or updated more often the other data elements. If you have 80 columns, and 50 % of all updates affect the first ten columns and none of the other 70 columns, then vertical partitioning may make sense to do. There is a design decision/trade off that needs to be made.
April 24, 2007 at 11:38 pm
How is the table being used? By what application?
If you are generating a report from that table, you do not want the report to spend crucial minutes in making joins (depends on number of tables, volume of data , indexes , ...) Take a good look at the queries that hit this table.
April 25, 2007 at 11:02 pm
In my opnion
1/ (almost) any situation where you find extensive use of NULLS to allow data to placed in a multi-column table is a clear sign that normalisation is required. NULL should be used to represent a valid data item for which we do not know the value as opposed to a flag indicating that that we do not require this data.
2/ Any situation that leads to inefficient use of your Buffer Cache and/or IO is a candidate for normalisation. This occurs when you force the RDBMS to load a large amount of unused data in order to obtain the few atomic items you actually require. If you call 400 odd columns of data across the network and then use only 1 or 2 you are obviously both moving unecessary data (unnecessary IO) and filling your Cache with data that is not going to be used. In this repect multiple smaller tables can often be more efficient
If you are not familar with DB design theroy I would suggest that you google "deletion anomaly", "insertion anomaly" and "Update Anomaly" These are the usual results of a lack of normalisation (plus in my opinon the IO result noted in item 2 above)
Regards
Karl
April 26, 2007 at 6:24 am
Thanks for the input, we still need to do alot of evaluation before we touch anything. I'm still trying to get up to speed on the database layout (into my 3rd week at the company), so it might be a little bit before we start ripping things apart, if at all.
3NF is going to be alot of work I think given what I've seen so far, but its gonna happen.
Chris
April 27, 2007 at 11:16 am
I'd like to reinforce what was said about columns containing NULLS. If you have columns that are more often NULL and you can find group of columns that are contistently NULL then I would say they are a candidate for pulling out of the base table.
Steve
April 27, 2007 at 3:40 pm
Ditto on the nulls...
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply