Recommend Design Approach (Normal Form vs Performance)

  • In terms of normalization vs performance...

    I have data in a table (call it Company.Address) and I have another table that *optionally* overrides the data (call it CompanyBranch.Address) - ie: If the CompanyBranch.Address IS NULL then use the Company.Address as the default.

    Should I put the address data in a separate table and join using a FK or just leave it in the existing two tables.

    According to the rules of normalization I should split out the Address data, but it seems easier to keep it like it is.

    Any ideas on the best approach? NB: The real life scenario involves 23 columns of data.

    TIA (my 1st post - Hi everyone!)

  • Welcome to the forums. Keep your hands and feet inside the vehicle at all times.

    Performance wise, you have to reference two tables right now. Plus the fact that you can have the same address listed x number of times with different spellings on the street name or whatever. Instead, you probably should have two different tables, an Address table, for all addresses where you can uniquely identify them, reducing the number of rows you're maintaining and actually INCREASING performance, and a join table to associate these addresses with a company. You'll want to have an address type in the address table to differentiate between the listed address and the overriding address, but you should be able to get just fine performance out of it and increased ease of maintenance, data integrity, clarity in the system. It sure seems like a win to me.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Like Grant, I would normally design a database to store all address information in one place. No need to maintain two tables with the same columns in them. Then just have an ID number join between companies and addresses (this concept can be expanded quite easily to keeping address IDs for customers, companies, employees, etc., with all the address data in one place).

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Thanks for the input.

    The real life issue is more complex. There are 23 attributes (columns) of which any 1 or more can be overridden at the lower level.

    A true 3NF would have a table for each of these overridable attributes. But it just gets plain ugly.

  • I'd leave it as it is.

    I mean, if they are addresses, they should be normalized, but to override a description or any other simple attribute, such design is OK.

    Normalized model would be like:

    select coalesce(CompanyBranch_Address.Field,Company_Address.Field) as Field,....

    from dbo.CompanyBranch left outer join dbo.addresses as CompanyBranch_Address on (CompanyBranch_Address.id=CompanyBranch.address_id)

    join dbo.Company on (...)

    left outer join dbo.addresses as Company_Address on (Company_Address.id=Company.address_id)

    However, you have to be carefull.

    In this case, you should not allow nulls in addresses and have default 0 or empty string on fields that you can override.

  • Are the columns individually overridden? If so and you're planning on storing a sparsely populated table, I sure hope you don't have do searches on the data. Without the filtered indexes of 2008, performance will stink. A more normalized approach, while it could be more complex, will at least index well. Something to consider.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • CypherOz (5/13/2008)


    Thanks for the input.

    The real life issue is more complex. There are 23 attributes (columns) of which any 1 or more can be overridden at the lower level.

    A true 3NF would have a table for each of these overridable attributes. But it just gets plain ugly.

    I must be misunderstanding you. It sounds like:

    Company Address:

    Vanderbilt Construction

    1234 E West Ave

    Studio J

    Los Angeles, CA 90000

    Branch Address:

    Vanderbilt Las Vegas Construction

    5678 W East St

    Las Vegas, NV 88000

    Since the "Address2" field (second line of street address) is "Studio J" for the Company, and null for the Branch, would the branch address end up being returned as the following?

    Vanderbilt Las Vegas Construction

    5678 W East St

    Studio J

    Las Vegas, NV 88000

    That's what "attribute by attribute overrides" (which seems to be the idea here) would seem to do.

    Since that would make no sense at all in the real world, I'm assuming I must be misunderstanding something about your scenario. Can you give some examples?

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Grant Fritchey (5/14/2008)


    Are the columns individually overridden?

    Yes. The address example was not real life. The actual columns (sub set) are:

    - TareMass

    - GrossMass

    - CoupledLength

    - DrawGearRating

    - NominalWheelDiameter

    etc.

    Class of Rail Vehicle (the default) and Specific Vehicle in a Class. They are basically independent values.

    Thanks.

  • Last question, the columns are dependently or independently overriden? Meaning, if you override one, you may override another at another time, but those overrides are not related? By putting them all in a matching table and in a single row, you're implying a relationship on the overrides between each of the overridden columns. That may not be a big deal or it could be a serious issue down the road. I'm still leaning towards finding a mechanism for a more normalized design, but certainly not 100% 5NF.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply