Data Referential Integrity - when do you use it?

  • When I'm building databases sometimes I use DRI and sometimes I don't.

    I tend to use it when there are going to be many different ways of connecting to the database because I need the database to look after itself. i.e. don't let a numpty delete a crucial record from a table.

    I tend to avoid it if the database is going to be connected by only one application.

    I'm curious as to when/if other database designers use it and what is it that provokes their decision?

  • Just because the database is only used by one application, doesn't mean a 'numpty' (or a DBA) couldn't use the likes of Query Analyser to delete a crucial record.

    If there is an inherent relationship in the data, then I would always use referential integrity.

    Think of what would happen if a parent record was deleted. How would the application behave if it found child records with no parent? What would you do with the orphaned child records? etc etc

    The only time I would consider not using DRI is for performance reasons. Even then, there would have to be a rock-solid process for handling any broken integrity.

  • One Word - Always

    In 20 + years, yes there were databases over 20 years ago that had RI. I have never created a database that did not use RI.

  • Referential Integrity maintained by the Server is much more performant than RI maintained by triggers, stored procedures or an application.

  • On this site ages ago there was someone who was running two low grade servers in a profitable dot com and was getting very good performance out of them because the active server didn't have any indices, DRI etc on it.

    Next day he would make the 2nd machine active whilst the 1st would do all the necessary RI checks and processing.

    At the time he posted a link to the Microsoft site that described how best to set this up but I cannot find it now.

    One of the situations I have here is a data-warehouse where data is received from different places at different times. I have to add records that would break DRI therefore I don't use it in this scenario. I have exception reports to warn me of the dodgy values.

  • I was mulling this one too ...

    I've got a master database which is replicated to a webserver (scheduled transactional). The database on the webserver is effectively read only - the only process that should ever insert, delete or update is the replication agent.

    I've already set the Distribution agent properties to ignore certain errors - as it seemed that transactions that were ultimately legitimate were breaking RI at some midpoint.

    I reckon it would be safe to remove all relationships and constraints from the database on the webserver, and can only guess that this would improve performance during the replication. I haven't done this because there is currently no performance problem, but i'ld be interested in your thoughts.

  • We always use RI. I can't imagine not using it.

    We have a third-party application that doesn't use it. We have a lot of problems with it from performance to data integrity. I'm sure there are other reasons for these difficulties other than the lack of RI, but I think that's a big one.

    The non-use on a read-only database is intriguing, but you could turn the read-only property on, and maybe this would give you the same return on performance. I haven't tested it, but it's a thought.

  • As we all know 'I' (via the constraint object)is enforced by SQL Server whilst the data is still in the data cache, and has been prompted (wanted to say triggered but realise that is bad word) by the change in data value. Now if the data can only be changed by replication, there seems very little point in having enforced integrity.

    However, like planet115 thinking. If there is no performance issue at the mo, why change anything.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply