Serveral nullable columns in a table

  • To make the long story short, we have a situation where there can be different datatypes for a column.  For example, 'AttributeValue' can be a datetime, int, float etc.  My DBA is suggesting the following design,

    tblAttribute

    AttributeId, AttributeDataTypeId, AttributeValueInt, AttributeValueFloat, AttributeValueDateTime etc

    For any AttributeId, only one of the value columns will have a value and the rest will be NULLs.  I feel that this is a bad design.  (FYI, We decided against SQLVariant for specific reasons).  I have suggested, we create a seperate table for each datatype, like

    tblAttribute

    AttributeId, AttributeDataTypeId

    tblAttributeValueFLOAT

    AttributeId INT, AttributeValue FLOAT etc for each datatype.  Depending on the AttributeDataTypeId in the Attribute table go to the specific table and pick up the value.  This way, we don't need to have too many NULLs in a row.  Which of the 2 do you think is a better design?  And if you agree with me, what would be the points to support it.

    Your time is greatly appreciated

     

  • It is normalization vs. denormalization issue.

  • For me, the first thing to consider is if I need the values to be stored in their native datatypes within SQL Server. If you do not need to do joins or searches based on the information being stored, then I would probably store the information in a varchar or nvarchar field, and convert it on the way out when needed (often, the receiving application may be able to handle the conversions itself, or they may be unnecessary even then).

    If you truly need to retain the original datatype, then other questions come to mind. How many rows would this table/these tables have? Is there a particular datatype that is going to be prevalent?

    Personally, if there are only going to be a few hundred rows, I'd go with a single table. If there will be thousands of rows, but 85% will be dates, I'd consider a date-specific table and a general table. And so on.

    Sorry if this is too late to be relevant to the original poster.


    R David Francis

  • David-

    Thanks for your reply.  To answer your questions,

    We need to keep the datatypes as we do a lot of comparisons and math on these values.  So conversion may be not be an optimized solution. 

    The rows in these tables will be in millions. 

    Prevelance, I would say would be mostly float.  But the next would be dates, integers and booleon respectively.

    Having said this, please let me know what your choice would be.

    cbaru

  • Double hurm. Typed up a response, and it got eaten. Let's try again....

    First, I'll mention that, with an expectation of million of rows, I'm inclined to question how well a general-purpose storage table really meets your needs. These values must be identified in some way to be compared, etc., and it might be best to have the table structure tied more to the business rules. However, I don't know the full details of your situation, and you and your DBA seem to agree on that point, so let's move on.

    My understanding is that most data types, and certainly the ones you mention (integer, float, datetime, bit) take up the same amount of space regardless of whether they are NULL or not. That means that you'd have millions of empty datetime or integer values. These don't take up a lot of space, but you're still probably going to run at least twice as much space as you need with a single table for all data types.

    Side note: the BIT data type is not nullable - it must have a value of 0 or 1. To allow 0, 1, or NULL, use TINYINT and a constraint, or some such. This may impact the feasibility of the single table for all datatypes approach.

    A single table probably has some benefits in terms of maintenance. Beyond that, I can't think of much to recommend it.

    That said, your DBA may well have specific reasons why he's recommending his solution. Experience with both methods may show that the space issues are dwarfed by the relative ease of maintenance. He may foresee needs for multiple values in a single row, which would tend to make the single table solution much more appealing.

    Hopefully, your DBA is approachable. I'd recommend going to him/her with your concerns, and ask for help in understanding his recommendation. Allow for both the possibility that he has overlooked some problems, and that you have overlooked some advantages, to his approach. As long as it's clear you both have the same goal, success for the project and best performance/bang for your buck out of the DB, this should simply be a matter of making sure everyone understands all the pros and cons. Make it clear this is not an assault, but an opportunity for the two of you to better understand each other's concerns.

    Hope this helps.


    R David Francis

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply