DRI or No DRI?

Question

DRI or No DRI?

Viewing 15 posts - 16 through 30 (of 35 total)

You must be logged in to reply to this topic. Login to reply

Koen Verbeeck SSC Guru Points: 259115 More actions · Answer 1

David.Poole (7/14/2011)
My data warehouse project is trying to integrate 60+ lines of business that function almost as separate companies into a unified whole (or hole). The referential integrity, domain integrity and sheer amount of bad data beggars belief. Anyone who seriously believes that all data quality should be handled by the front-end application is delusional, I have the proof.

There is a big difference between data loaded by a front-end application and data loaded by a dedicated back-end process, i.e. the ETL.

When only the ETL inserts data into the datawarehouse, it can take control of the RI, so that insert performance is not affected by foreign keys.

Of course there are exceptions to this datawarehouse scenario, such as a large distributed DW feeded by multiple processes, but for regular DWs, I'm OK with no foreign keys. And I'm able to truncate data 🙂

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

Koen Verbeeck SSC Guru Points: 259115 More actions · Answer 2

SanDroid (7/14/2011)
I have yet to see the M$ SQL environment where related Tables joined faster with no defined PK/FK relationships. 😛

Joins maybe (although I was under the impression that it was indexes that makes joins faster), but inserts are definately faster without foreign keys.

But let's compromise: before the ETL load we drop/disable the foreign keys, load the data and after the ETL we turn the FKs back on (with CHECK of course) 😀

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

Elliott Whitlow SSC Guru Points: 102402 More actions · Answer 3

Question Guy (7/14/2011)
In over 10 years I have never found a reason to use DRI, and I probably never will. I have never had 1 single problem without using it. However I have had problems with it, but only related to replication. For my projects I learned it was a waste of time and effort and I eventually learned that it didn't seem to make much sense to put what I consider as logic in the database when it really truely belongs in the application logic. IMHO

Then you have been truly lucky or only worked in small shops. DRI is a second check to make sure your apps don't do something that they shouldn't. Hey, we are all human, we are going to screw the pooch sometimes. But when an a dev starts complaining that they are getting FKey violations I'm usually feeling pretty good, because at least the data cannot be put into a bad state. I learned long ago, in progressively bigger shops that DRI ALONE can save your butt. I would liken your argument to "I've never had a car accident so why have insurance", same thing in my mind..

CEWII

Eric M Russell SSC Guru Points: 125562 More actions · Answer 4

Sometimes I use DRI but most of the time I do not.
When DRI is used, it makes truncating tables difficult. You need to do them in a very specific order. Also, when adding test data, or importing data from other sources, it also has to be installed in a specific order.
I cannot tell you how many times I get data from an SSIS package and the data at the source is not up to par. If I turn on DRI, the entire package will fail and none of the tables will load. We at least need to get the data that is available and we can work with the small quantity of errors in our reports because it only represents a small sample.

This sounds like a staging database used for the initial load and cleansing of data, which is an exceptional situation where you want maximum load efficiency and plan to scub the data afterward. For operational and data warehouses, I always used declarative referential integity constraints. Actually, as a general rule I never delete records or update keys in a data warehouse or operatonal database.

"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

Evil Kraig F SSC Guru Points: 100851 More actions · Answer 5

I'd have to say I'm middle of the road for RI. I don't use it for every lookup under the sun, simply because of the secondary hits that are necessary during I/U statements. I do however use it when it is important to the business rules and is *non-recoverable*.

An example:

I don't bother putting RI into a state lookup. Why? If the state suddenly goes awol, I'm relatively sure I can re-insert the state, and my confidence level is high that the drop down box they used was from the existing list in the database. I try to avoid the 50 lookups that would need to happen for each change to my address lists.

However, I also maintain a surrogate key list that controls thousands of records, and the only way I could recover a missing entry would be a restore of the DB, find the surrogate key entry, and fix that. I also don't want them to be able to enter a non-existant code because of business rules. In this case, I'd enforce a reference check on the surrogate key only.

In general I do avoid RI, primarily because it interferes with a number of bulk operations and can be annoying to maintain. However, I have no problem using it for enforcing business rules when the lack of one causes more problems then the existance of it. As to significant performance hits, I've yet to see that unless it's taken to an extreme, but I guess I could go do some testing easily enough.

- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA

Eric M Russell SSC Guru Points: 125562 More actions · Answer 6

If a primary key occurs in another table, then I most always create a foreign key relationship, and if there is a simple business rule that can be applied to a column, then I'll enforce it with a check constraint.

However, for other busines rules like "a client must have a minimum of three contacts" or "transactions in each invoice must total at least $500", these are best implemented at the application layer. It's possible there are exceptions to the rule for some clients or legacy invoices entered before a certain date may not conform to that rule. Also, that type of business rule should probably be data driven using some form of rules engine.

"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

Evil Kraig F SSC Guru Points: 100851 More actions · Answer 7

Eric M Russell (7/14/2011)
If a primary key occurs in another table, then I most always create a foreign key relationship, and if there is a simple business rule that can be applied to a column, then I'll enforce it with a check constraint.
However, for other busines rules like "a client must have a minimum of three contacts" or "transactions in each invoice must total at least $500", these are best implemented at the application layer. It's possible there are exceptions to the rule for some clients or legacy invoices entered before a certain date may not conform to that rule. Also, that type of business rule should probably be data driven using some form of rules engine.

You *can* enforce these at the database layer, but via triggers, and you have to code around the idea of inserting your (for example) 3 contacts simultaneously so you don't break the rules with the first of 3 single inserts.

- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA

Eric M Russell SSC Guru Points: 125562 More actions · Answer 8

Craig Farrell (7/14/2011)
Eric M Russell (7/14/2011)
If a primary key occurs in another table, then I most always create a foreign key relationship, and if there is a simple business rule that can be applied to a column, then I'll enforce it with a check constraint.
However, for other busines rules like "a client must have a minimum of three contacts" or "transactions in each invoice must total at least $500", these are best implemented at the application layer. It's possible there are exceptions to the rule for some clients or legacy invoices entered before a certain date may not conform to that rule. Also, that type of business rule should probably be data driven using some form of rules engine.
You *can* enforce these at the database layer, but via triggers, and you have to code around the idea of inserting your (for example) 3 contacts simultaneously so you don't break the rules with the first of 3 single inserts.

Yes, I can enforce any type of business rule inside a trigger, and I could also have my trigger make CLR function calls to authorize credit card transactions and have my stored procedure render HTML page content too. 🙂

"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

Evil Kraig F SSC Guru Points: 100851 More actions · Answer 9

Eric M Russell (7/14/2011)
Yes, I can enforce any type of business rule inside a trigger, and I could also have my trigger make CLR function calls to authorize credit card transactions and have my stored procedure render HTML page content too. 🙂

*snicker* Alright, alright, point. 😛 😀

- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 10

{Note to self... add more questions about DRI to the interview questions I ask}

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Elliott Whitlow SSC Guru Points: 102402 More actions · Answer 11

Jeff Moden (7/14/2011)
{Note to self... add more questions about DRI to the interview questions I ask}

I was thinking the same thing Jeff..

CEWII

Michael Valentine Jones SSC Guru Points: 64818 More actions · Answer 12

Eric M Russell (7/14/2011)
Craig Farrell (7/14/2011)
Eric M Russell (7/14/2011)
If a primary key occurs in another table, then I most always create a foreign key relationship, and if there is a simple business rule that can be applied to a column, then I'll enforce it with a check constraint.
However, for other busines rules like "a client must have a minimum of three contacts" or "transactions in each invoice must total at least $500", these are best implemented at the application layer. It's possible there are exceptions to the rule for some clients or legacy invoices entered before a certain date may not conform to that rule. Also, that type of business rule should probably be data driven using some form of rules engine.
You *can* enforce these at the database layer, but via triggers, and you have to code around the idea of inserting your (for example) 3 contacts simultaneously so you don't break the rules with the first of 3 single inserts.
Yes, I can enforce any type of business rule inside a trigger, and I could also have my trigger make CLR function calls to authorize credit card transactions and have my stored procedure render HTML page content too. 🙂

I know that you are trying to be funny, but I recently saw this in a real application:

IVR (interactive voice response system) calls a TSQL procedure which calls a CLR stored procedure which makes a web service call to a middle tier application server which makes an API call to a 3270 terminal emulation program which does a lookup against a remote CICS screen for customer information and returns the info (eventually) to the IVR via the same path. :blink:

They called me because response time is very inconsistent, and decided it must be a SQL Server problem.

Mike Dougherty-384281 SSCrazy Points: 2794 More actions · Answer 13

Michael Valentine Jones (7/15/2011)
They called me because response time is very inconsistent, and decided it must be a SQL Server problem.

I think I'd have to congratulate whoever wrote that: "Sometimes it doesn't fail."

Elliott Whitlow SSC Guru Points: 102402 More actions · Answer 14

Mike Dougherty-384281 (7/15/2011)
Michael Valentine Jones (7/15/2011)
They called me because response time is very inconsistent, and decided it must be a SQL Server problem.
I think I'd have to congratulate whoever wrote that: "Sometimes it doesn't fail."

The ringling brothers circus theme is playing in my head.. Rube Goldberg device? With all those moving parts how could they even remotely pin it on ANY of the components. However, I will say this, sometimes given the tools available at the time something was written it might have been the only real choice, obviously in retrospect, with newer tools, not so much.. I have code that I wrote for SQL 2000 that I would write very differently in SQL 2005+..

One of the companies I worked for acquired a product that effectively turned ISAM data into DB2 and more-or-less hid that from the application which was a CICS app. The upside was that the app didn't have to be rewritten and you could query the data directly in DB2..

CEWII

Gift Peddie SSC Guru Points: 73570 More actions · Answer 15

Not using DRI and instead relying on applications to ensure business rules feels to me like not wearing underwear: you might be able to get away with it most of the time but it's extra embarrassing when you are caught with your pants down

In OLTP there is no valid reason to skip DRI it is the implementation of one of the most important Algebra of the 20th century "If A references B then B must exist" Chris Dates.

When DRI is skipped in third party application data access is generally created with ORM tool and in some cases data expert may be needed to clean up the relational model of the application.

And DRI is also not new Microsoft implementation was in 1999 the 2000 edition.

Kind regards,
Gift Peddie