Referential Integrity: Who needs it - right? (Discussion)

  • So - I'm reflecting on a project that is winding down, and some of the things I noticed on it. Lets just say that I did my best to implment best practices where I could, and had to practically fight people over what little that I could get put into the database(appropriate DiskIO via Filegroups, making sure table partitioning was even installed correctly, appropriate backup routine and why that is how to manage log files instead of using a shrink job on a routine basis, etc, etc...).

    The one thing I was never able to overcome was the fact that the database was flatter than Kansas, and with the exception of 15 or so tables, nearly all others had no Foreign Keys at all (and trust me when I say that they very well could have). The reason I was given for this practice was sad and bothersome to me all at the same time. "We don't like Foreign Keys - they are a pain", and "we let the app take care of the Fkey relationships". The other consideration was their desire for speed over the accuracy of the data. Yes - you read that correctly. :crazy:

    So let me ask the experts this question...Can the front-end application to your database honest and truly take care of the data going into the database, such that you can get away without putting RI in the database itself? I am beginning to wonder if this is a platform dependent situation, or if there is any merit to it at all. The front-end is not a .NET web app, but one that was programmed in Grails (Ruby on Rails for JAVA I think...).

    Sorry if this is the wrong place for this, but it is a SQL Server 2005 Enterprise Edition question for discussion, and it's on SP3 (if that makes any difference).

  • As soon as any other application accesses that database and updates data - you will have data integrity issues. In other words, the only way to insure data integrity is to make sure that only that application has access to the database.

    And, it requires that the application has the correct information and it is coded appropriately. That is probably (most likely) not the case and in a very short time there will be data integrity issues and you will be asked to build scripts to automate the cleanup - because, to fix it the developers would have to change the application, re-deploy the application and you still have to perform the cleanup.

    Any other application means: SSMS, SSIS, SSRS, Access, Excel, or any other utility that uses ODBC/OLEDB to connect to the database and the user connecting has privileges to change data.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • I think that people who insist that the application can take care of RI are just about as flambouyantly in desregard (and equally wrong) for proper database design as those that insist on "preoptimization is the root of all evil". They just don't understand the long term damage that will occur when these things are allowed to be driving forces in database design.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • SQL_ME_RICH (5/4/2012)


    The other consideration was their desire for speed over the accuracy of the data. Yes - you read that correctly. :crazy:

    :w00t: Yes, absolutely crazy. Nothing should come in front of the quality of the data.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • Thanks for the input on this...The reason I asked this question to begin with is that the only true discussion I was able to find was out in an Oracle forum...http://database.itags.org/oracle/68608/?title=oracleenforcing+referential+integrity+database+or+application

    Jeff - are you saying that those who believe that it's pre-mature to try and optimize your queries before you even have data in your database are off? Hence the quote from that person (whose name escapes me now) 'Pre-optimization is the root of all evil' is wrong? I was kinda thinking that the quote was correct, but if not - I am curious as to your take to why it wouldn't be.

    That aside - I think it has been a really tough project that I would never want to do again. There was no project plan (nor project manager for this project). We were literally flying by the seat of our pants due to human constraints (someone needing to leave the US by the end of May), and physical assets that were at risk (nearly out of disk space with no where to go for the PROD system that was housing this current db). Frankly - it was a mess, and would have been for anyone who took this assignment.

    But the whole issue with the FKey left me just wondering if I hadn't yet learned something that might have just been - differernt. Glad to see my lessons learned early on have proven to be correct about that need.

  • PaulB-TheOneAndOnly (5/6/2012)


    SQL_ME_RICH (5/4/2012)


    The other consideration was their desire for speed over the accuracy of the data. Yes - you read that correctly. :crazy:

    :w00t: Yes, absolutely crazy. Nothing should come in front of the quality of the data.

    I couldn't agree more. Performance matters little if your data is incorrect or of such low quality that it is unusable.

    Anyone arguing they can enforce RI in the app layer better than in the DB is speaking out of a mix of ego and a disdain for DB development.

    I read some of the comments from the Oracle thread posted and one argument that does seem to have some merit is the protection of IP. Business logic implemented in a database a customer has the option to host internally is far more exposed to prying eyes than compiled application binaries.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • it sounds like you might be dealing with developers that are "long in the tooth".

    possibly developers that primarly developed on ISAM databases.

    there will be no arguing with them... 😛

  • Jeff Moden (5/5/2012)


    I think that people who insist that the application can take care of RI are just about as flambouyantly in desregard (and equally wrong) for proper database design as those that insist on "preoptimization is the root of all evil". They just don't understand the long term damage that will occur when these things are allowed to be driving forces in database design.

    Hmmm.... I teach that premature optimisation is the root of all evil. I'd like to think I have a slight clue as to what I'm talking about.

    http://sqlinthewild.co.za/index.php/2010/03/11/the-root-of-all-evil/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (5/7/2012)


    Jeff Moden (5/5/2012)


    I think that people who insist that the application can take care of RI are just about as flambouyantly in desregard (and equally wrong) for proper database design as those that insist on "preoptimization is the root of all evil". They just don't understand the long term damage that will occur when these things are allowed to be driving forces in database design.

    Hmmm.... I teach that premature optimisation is the root of all evil. I'd like to think I have a slight clue as to what I'm talking about.

    http://sqlinthewild.co.za/index.php/2010/03/11/the-root-of-all-evil/

    Yea... me too. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Let's put it this way, Gail. If you're one of the folks that thinks there's something seriously wrong with allowing front end developers to "design" tables using NVARCHAR(4000) or NVARCHAR(MAX) for every character based column there is in a table, then you're probably teaching the right stuff. I've not had the pleasure of seeing someone who touted the "premature optimization" banner do it the right way, so far.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (5/7/2012)


    Let's put it this way, Gail. If you're one of the folks that thinks there's something seriously wrong with allowing front end developers to "design" tables using NVARCHAR(4000) or NVARCHAR(MAX) for every character based column there is in a table, then you're probably teaching the right stuff.

    I certainly don't approve that, but that's not about premature optimisation. That's just good solid database design.

    Premature optimisation is letting untested performance assumptions drive the design of a system. An example in front-end code would be something like 'loops are slow, so I'll unroll the loop manually in my code'. The result is a terribly messy piece of code, and maybe 1% performance improvement at best. In database design it could be 'I'm denormalising for performance'

    It's the result of making untested assumptions about what might be slow before writing the code, and letting that assumption drive the design in directions it shouldn't go, and that I consider both a waste of time and a cause of design problems in the long run.

    My argument (as given in my blog post) is that one should rather code first, then measure, then optimise if necessary.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Has anybody come across the CADIS Enterprise Data Management product?

    Where I work they always insist that any import tables for third party text data are nvarchar(4000) and there is no RI as that is what is recommended.

    So, sometimes it's a third party vendor that forces it upon you and senior management buy into it regardless of how many examples you present or how hard you argue.

  • Ok. Now I see where you're coming from and THAT's what I envisioned as "premature optimization".

    The problem is how certain mantra's get bastardized over time. I've worked with a couple of supposedly "hot" front end developers lately and they designed tables like the one I cited. When I asked them "Why?", their answer was "Premature optimization is the root of all evil." It was all I could do to keep from throwing them out the window right then and there.

    I absolutely agree with the "code and measure" philosophy optimizing only those things that don't measure up and aren't prone to causing a problem down the road.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • jasonmorris (5/8/2012)


    Has anybody come across the CADIS Enterprise Data Management product?

    Where I work they always insist that any import tables for third party text data are nvarchar(4000) and there is no RI as that is what is recommended.

    Import tables get a pass in this area, usually. CADIS' position on using wide string columns in an import table is not necessarily an indication of a bad design or avoiding premature optimization.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • opc.three (5/8/2012)


    jasonmorris (5/8/2012)


    Has anybody come across the CADIS Enterprise Data Management product?

    Where I work they always insist that any import tables for third party text data are nvarchar(4000) and there is no RI as that is what is recommended.

    Import tables get a pass in this area, usually. CADIS' position on using wide string columns in an import table is not necessarily an indication of a bad design or avoiding premature optimization.

    I agree with that if what you mean by "import" tables is the same as what I call "staging" tables.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply