Table Redesign Question

Question

Table Redesign Question

amy26

SSCrazy Eights

Points: 8926
More actions
July 23, 2018 at 1:08 pm

#407235

Hey there!
We have a series of tables that I'm trying to make more efficient.
Currently there are over 90 "request" tables. These request tables represent a type of "change record" for various modules in our system. There are "add" requests, "delete" requests and "change" requests. Each module could have 3 or more tables and there are like 20 modules. Some of the data fields are the same and some are not. There are also some data elements that have a one to many relationship and those are stored in a separate table(s).
When we do system wide queries... we have to have views and queries that join together like all these 90 some tables to get a consolidated view. I know I can't possibly get them all into 1 table but I would like to make them more manageable.
So, my question is ... what would be the "better" practice... would you opt for having 1 table for example that has TONS of columns that may have a lot of NULL values in the fields where that field doesn't apply to the particular request OR would you create a "field value" table that has like a BLOB column with the actual input value, another column with the "name" of the field and then some sort of primary key? I don't particularly like this idea as it negates data types and validation rules just having a bunch of various values in a column. Or, third option, just keep all the separate tables?

Viewing 6 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

sgmunson SSC Guru Points: 110651 More actions · Answer 1

With 90 tables to figure this out for, it's not realistic to think that we could easily determine the best methodology. However, one option I can suggest strongly that you avoid is the one you already don't like - mashing them all into one. It's a bad idea on steroids. However, you may be able to create a single table that handles a particular data type, and each row could also contain a table name, which indicates exactly which table that data is to be used with. I'd probably leave the multi-table setups alone, at least at first, and then look at those later. You might end up with somewhere between 5 and 10 new tables, each handling a specific data type, and each with a table_name column to indicate which table a given value applies to. Much more than that would require considerably more information than is realistic to place in a public forum.

Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)

amy26 SSCrazy Eights Points: 8926 More actions · Answer 2

Haha yes, totally understood which is why I didn't post any scripts or anything. Just looking for a high level functional type of approach. I am the sole voice of reason to a bunch of people that still sadly think a woman's opinion isn't the best one. So, I just wanted to make sure my thought process was on point before I tell them to pound sand about that BLOB table idea. Blech.

So, I'll just consolidate the tables where I can and still keep the one to many ones and also try not to make one table with tons of fields that may end up with NULL values... sometimes I just need to throw around ideas with other people that know what I'm talking about. 🙂

ScottPletcher SSC Guru Points: 101238 More actions · Answer 3

Not enough details to get specific, but in very general terms, I would likely go with a new option.

Keep the existing tables to do the current logging, but have processes that periodically pull unconsolidated data from those tables and put into consolidated tables. The sql_variant data type is a huge help here. You don't have to resort to blobs and you don't have to lose knowing whether the original column was an int, date, etc.. You can do the consolidation of existing historical data in batches.

Then you can pick a time interval to do consolidations, and only data for that period needs consolidated. For example, if you consolidate every hour, only the last hour's worth of data needs consolidated at a time.

You run the system-wide queries on the consolidated tables. This helps also by taking the querying load off the original tables, and it allows you to index the consolidated tables specifically for how they are queried.

Also, do not store actual db, table or column names in the consolidated detail tables. It's too huge a waste of spaces and resources. Instead, you assign every distinct name a unique integer identifier, say "name_id", and store the ids in the summary tables (db_name_id, table_name_id, column_name_id). Note that for a given name, say "Customer", you use the same name_id, regardless of whether it's a db, table or column or all three.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

sgmunson SSC Guru Points: 110651 More actions · Answer 4

Have to agree with Scott on the use of an identifier (int value) for the table is a better choice, and you can always have a master table for the table names, or you could use the OBJECT_ID value for the table instead. Not sure I'd look to use SQL Variant data type. Potential for CONVERT issues and adds a layer of complexity to almost everything that deals with these tables and I'm not convinced it's easier than having one table per data type. One caution is to be careful not to evolve things into an EAV design (aka Entity, Attribute, Value). That can get really messy, and can be VERY difficult to query easily.

Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)

ScottPletcher SSC Guru Points: 101238 More actions · Answer 5

sgmunson - Wednesday, July 25, 2018 11:55 AM
Have to agree with Scott on the use of an identifier (int value) for the table is a better choice, and you can always have a master table for the table names, or you could use the OBJECT_ID value for the table instead. Not sure I'd look to use SQL Variant data type. Potential for CONVERT issues and adds a layer of complexity to almost everything that deals with these tables and I'm not convinced it's easier than having one table per data type. One caution is to be careful not to evolve things into an EAV design (aka Entity, Attribute, Value). That can get really messy, and can be VERY difficult to query easily.

You can't use the OBJECT_ID because it can change. You need to assign your own id that never changes.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".