Normalization

  • Given the enormous respect I have for all you guys on SQLServerCentral I wanted to run this by you. I am going to admit to a database design no no.

    You see I do a lot of development as well as administration and when I have my database schema nicely normalized I often find that I deliberately insert an extra column into tables to act as the primary key and I then replace the foreign keys in related tables as required. Quite often this is an identity column.

    I find this has a numebr of advantages. If I have a compound key I don't have to do any extra work at the GUI to keep track of this. It also save space in the tables containing the foreign key and the indexes are not as wide. I do of course need to add an extra constraint to enforce uniqueness but this has not been a problem.

     

    I find it is a great help when debugging code as well to have a unique number for each row.

    Your thoughts please?

     

    Nigel Moore
    ======================

  • I truly think that only the real purists would flip out over using a column with an IDENTITY property. While at PASS in Seattle I asked many Microsoft engineers on the SQL server architecture team (ok, it was only 3 or 4) and they all used columns with the IDENTITY property and one more they said they rarely use clustered indexes on anything but a column with the identity property because of the gains in insert speed. I don't put an IDENT in every table but use it quite often.

     

    ------------

    Ray Higdon MCSE, MCDBA, CCNA

  • my 2ct:

    Thé main advantage is 'meaningless primary keys' !

    Some use identity others use uniqueidentifier to do this.

    Disadvantage of uniqueidentifier :

       - clustering on it (you'll need to do more (rebuild) for less  )

       - space

    Advantage of uniqueidentifier :

        - truly unique (ms-guaranteed) and meaningless

    Disadvantage of identity :

       - one might be tempted to perform a ">" or "order by" to determine

             younger to older

    advantage of identity : 

        - sequence (clustering with heavy insert rate)

        - natural (on can better understand and memorize it then with guids)

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • The question about the (ab)use of Identity Columns as Keys...

    My standpoint is:

    Don't use them!. - Except: - You have a good understanding of relational database theory AND You know what you are doing, use them IF you have a good motivation for doing it.

    For Practical reasons I also often adopt the approach described for designing PK & FK on tables. i.e. Using an Identity column as PK for tables and using one or more UC to enforce data integrity.

    It is Imperative that all the Keys (Unique Constraints) also be declared on the tables to ensure data integrity.

    An Identity, Unique Identifiers or other "meaningless" identifier cannot serve as a Key ( Candidate, Alternate or Primary ) on a table. It only constrains that each row is unique, and does not enforce data integrity by itself.

    The approach of declaring an Identity column or other "meaningless" identifier as a PK on a table is a physical design choice that is made for a certain database implementation.

    /rockmoose


    You must unlearn what You have learnt

  • Ok, my 2Cents

    I will absolutely never use an Identity Column, whether as a Unique PK or not.  This is only my opinion.  I know all the ins and outs of using Identity Columns, however, I despise them because I Develop "Synchronization" programs.  Try making a Synchronization with Identity Columns, it's NOT easy!

    Anyway, if a Table I have is a Top-Level Parent Table, and there are no Columns I can use as a Combination into a PK, then I will typically use a GUID (ex.  Users, or Companies tables).

    I will never use Composite Keys!  It's just too much of a headache (I am also a Developer and understand simplistic coding makes for more reliable coding).  In the Case where a Composite Key might be used, I will instead Concatenate the Columns I need as the PK:

    Ex: 
    ProjectID = CompanyID + '-' + SeqNo
    ProjectModule = ProjectID + ':' + ProductID + '-' + SeqNo).

    Usually the Concatenation of Columns into a PK are not that big anyway.

    However, if a Table will not have a lot of Records, and will be easy to maintain manually, I will typically use a short VarChar(10) as a PK (ex.  ProductCategories).

    I'm sure someone will tell me all this is wrong, but you know what...I've never had problems maintaining data this way and it's been very easy to understand even with GUIDs.

  • There's nothing inherently wrong with using surrogate keys.  Even most so-called "purists" have no problem with it as long as uniqueness and integrity is enforced using constraints and the surrogate key is not supposed to enforce anything.  The insert performance gained by clustering on an identity is very useful, as well.

    As for not using composite keys and playing games to get around it, I'm glad it works for you but I think you're giving yourself a lot more of a headache than you'd get just by using the DBMS they way it was designed to be used.  Composite keys are certainly not any more difficult to use than single-column keys and do not require that you write concatenation code for every table in your database.  And by the way, I'd guess that most of us also develop code outside of the database.

    --
    Adam Machanic
    whoisactive

  • I agree with a lot of points that have been brought up, especially Amachanic's about workarounds. The next version of SQL server has a rownumber function and I think that may eliminate some of the niceness of having an identity property and might reduce it's use.

    ------------

    Ray Higdon MCSE, MCDBA, CCNA

  • In defense of identity columns:

    I've found that identity columns are key when summarizing data in RealTime.  I use what I call a "Collector" to do this.  A table called "CollectorDatabase" stores the CollectorID, the DatabaseID, and the MinID and MaxID.  The Database Table stores the Name and a corresponding ID (an identity , the Collector table stores the Collector Code, The Source Table Name, and identity Column Name.  Using these three tables I can incrementally summarize the data in a table, then display the summarized data in real time.

    The main advantage is that each row is only read ONCE and summarized ONCE.  Effectively I can summarize millions of rows in real time.  It's very scalable, too, the more often you run it the faster it gets.

    Personally, I would never use a varchar value (for example) as a PK or Foreign Key.  They just don't perform well!  I would use a Unique index on the varchar value to constrain the data, not a Primary Key.  identity integers perform really well. 

    Signature is NULL

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply