July 14, 2011 at 11:30 am
David.Poole (7/14/2011)
My data warehouse project is trying to integrate 60+ lines of business that function almost as separate companies into a unified whole (or hole). The referential integrity, domain integrity and sheer amount of bad data beggars belief. Anyone who seriously believes that all data quality should be handled by the front-end application is delusional, I have the proof.
There is a big difference between data loaded by a front-end application and data loaded by a dedicated back-end process, i.e. the ETL.
When only the ETL inserts data into the datawarehouse, it can take control of the RI, so that insert performance is not affected by foreign keys.
Of course there are exceptions to this datawarehouse scenario, such as a large distributed DW feeded by multiple processes, but for regular DWs, I'm OK with no foreign keys. And I'm able to truncate data 🙂
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
July 14, 2011 at 11:32 am
SanDroid (7/14/2011)
I have yet to see the M$ SQL environment where related Tables joined faster with no defined PK/FK relationships. 😛
Joins maybe (although I was under the impression that it was indexes that makes joins faster), but inserts are definately faster without foreign keys.
But let's compromise: before the ETL load we drop/disable the foreign keys, load the data and after the ETL we turn the FKs back on (with CHECK of course) 😀
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
July 14, 2011 at 12:06 pm
Question Guy (7/14/2011)
In over 10 years I have never found a reason to use DRI, and I probably never will. I have never had 1 single problem without using it. However I have had problems with it, but only related to replication. For my projects I learned it was a waste of time and effort and I eventually learned that it didn't seem to make much sense to put what I consider as logic in the database when it really truely belongs in the application logic. IMHO
Then you have been truly lucky or only worked in small shops. DRI is a second check to make sure your apps don't do something that they shouldn't. Hey, we are all human, we are going to screw the pooch sometimes. But when an a dev starts complaining that they are getting FKey violations I'm usually feeling pretty good, because at least the data cannot be put into a bad state. I learned long ago, in progressively bigger shops that DRI ALONE can save your butt. I would liken your argument to "I've never had a car accident so why have insurance", same thing in my mind..
CEWII
July 14, 2011 at 12:15 pm
Sometimes I use DRI but most of the time I do not.
When DRI is used, it makes truncating tables difficult. You need to do them in a very specific order. Also, when adding test data, or importing data from other sources, it also has to be installed in a specific order.
I cannot tell you how many times I get data from an SSIS package and the data at the source is not up to par. If I turn on DRI, the entire package will fail and none of the tables will load. We at least need to get the data that is available and we can work with the small quantity of errors in our reports because it only represents a small sample.
This sounds like a staging database used for the initial load and cleansing of data, which is an exceptional situation where you want maximum load efficiency and plan to scub the data afterward. For operational and data warehouses, I always used declarative referential integity constraints. Actually, as a general rule I never delete records or update keys in a data warehouse or operatonal database.
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
July 14, 2011 at 12:21 pm
I'd have to say I'm middle of the road for RI. I don't use it for every lookup under the sun, simply because of the secondary hits that are necessary during I/U statements. I do however use it when it is important to the business rules and is *non-recoverable*.
An example:
I don't bother putting RI into a state lookup. Why? If the state suddenly goes awol, I'm relatively sure I can re-insert the state, and my confidence level is high that the drop down box they used was from the existing list in the database. I try to avoid the 50 lookups that would need to happen for each change to my address lists.
However, I also maintain a surrogate key list that controls thousands of records, and the only way I could recover a missing entry would be a restore of the DB, find the surrogate key entry, and fix that. I also don't want them to be able to enter a non-existant code because of business rules. In this case, I'd enforce a reference check on the surrogate key only.
In general I do avoid RI, primarily because it interferes with a number of bulk operations and can be annoying to maintain. However, I have no problem using it for enforcing business rules when the lack of one causes more problems then the existance of it. As to significant performance hits, I've yet to see that unless it's taken to an extreme, but I guess I could go do some testing easily enough.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
July 14, 2011 at 12:28 pm
If a primary key occurs in another table, then I most always create a foreign key relationship, and if there is a simple business rule that can be applied to a column, then I'll enforce it with a check constraint.
However, for other busines rules like "a client must have a minimum of three contacts" or "transactions in each invoice must total at least $500", these are best implemented at the application layer. It's possible there are exceptions to the rule for some clients or legacy invoices entered before a certain date may not conform to that rule. Also, that type of business rule should probably be data driven using some form of rules engine.
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
July 14, 2011 at 12:31 pm
Eric M Russell (7/14/2011)
If a primary key occurs in another table, then I most always create a foreign key relationship, and if there is a simple business rule that can be applied to a column, then I'll enforce it with a check constraint.However, for other busines rules like "a client must have a minimum of three contacts" or "transactions in each invoice must total at least $500", these are best implemented at the application layer. It's possible there are exceptions to the rule for some clients or legacy invoices entered before a certain date may not conform to that rule. Also, that type of business rule should probably be data driven using some form of rules engine.
You *can* enforce these at the database layer, but via triggers, and you have to code around the idea of inserting your (for example) 3 contacts simultaneously so you don't break the rules with the first of 3 single inserts.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
July 14, 2011 at 12:40 pm
Craig Farrell (7/14/2011)
Eric M Russell (7/14/2011)
If a primary key occurs in another table, then I most always create a foreign key relationship, and if there is a simple business rule that can be applied to a column, then I'll enforce it with a check constraint.However, for other busines rules like "a client must have a minimum of three contacts" or "transactions in each invoice must total at least $500", these are best implemented at the application layer. It's possible there are exceptions to the rule for some clients or legacy invoices entered before a certain date may not conform to that rule. Also, that type of business rule should probably be data driven using some form of rules engine.
You *can* enforce these at the database layer, but via triggers, and you have to code around the idea of inserting your (for example) 3 contacts simultaneously so you don't break the rules with the first of 3 single inserts.
Yes, I can enforce any type of business rule inside a trigger, and I could also have my trigger make CLR function calls to authorize credit card transactions and have my stored procedure render HTML page content too. 🙂
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
July 14, 2011 at 1:05 pm
Eric M Russell (7/14/2011)
Yes, I can enforce any type of business rule inside a trigger, and I could also have my trigger make CLR function calls to authorize credit card transactions and have my stored procedure render HTML page content too. 🙂
*snicker* Alright, alright, point. 😛 😀
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
July 14, 2011 at 10:36 pm
{Note to self... add more questions about DRI to the interview questions I ask}
--Jeff Moden
Change is inevitable... Change for the better is not.
July 15, 2011 at 8:32 am
Jeff Moden (7/14/2011)
{Note to self... add more questions about DRI to the interview questions I ask}
I was thinking the same thing Jeff..
CEWII
July 15, 2011 at 9:56 am
Eric M Russell (7/14/2011)
Craig Farrell (7/14/2011)
Eric M Russell (7/14/2011)
If a primary key occurs in another table, then I most always create a foreign key relationship, and if there is a simple business rule that can be applied to a column, then I'll enforce it with a check constraint.However, for other busines rules like "a client must have a minimum of three contacts" or "transactions in each invoice must total at least $500", these are best implemented at the application layer. It's possible there are exceptions to the rule for some clients or legacy invoices entered before a certain date may not conform to that rule. Also, that type of business rule should probably be data driven using some form of rules engine.
You *can* enforce these at the database layer, but via triggers, and you have to code around the idea of inserting your (for example) 3 contacts simultaneously so you don't break the rules with the first of 3 single inserts.
Yes, I can enforce any type of business rule inside a trigger, and I could also have my trigger make CLR function calls to authorize credit card transactions and have my stored procedure render HTML page content too. 🙂
I know that you are trying to be funny, but I recently saw this in a real application:
IVR (interactive voice response system) calls a TSQL procedure which calls a CLR stored procedure which makes a web service call to a middle tier application server which makes an API call to a 3270 terminal emulation program which does a lookup against a remote CICS screen for customer information and returns the info (eventually) to the IVR via the same path. :blink:
They called me because response time is very inconsistent, and decided it must be a SQL Server problem.
July 15, 2011 at 9:59 am
Michael Valentine Jones (7/15/2011)
They called me because response time is very inconsistent, and decided it must be a SQL Server problem.
I think I'd have to congratulate whoever wrote that: "Sometimes it doesn't fail."
July 15, 2011 at 11:37 am
Mike Dougherty-384281 (7/15/2011)
Michael Valentine Jones (7/15/2011)
They called me because response time is very inconsistent, and decided it must be a SQL Server problem.I think I'd have to congratulate whoever wrote that: "Sometimes it doesn't fail."
The ringling brothers circus theme is playing in my head.. Rube Goldberg device? With all those moving parts how could they even remotely pin it on ANY of the components. However, I will say this, sometimes given the tools available at the time something was written it might have been the only real choice, obviously in retrospect, with newer tools, not so much.. I have code that I wrote for SQL 2000 that I would write very differently in SQL 2005+..
One of the companies I worked for acquired a product that effectively turned ISAM data into DB2 and more-or-less hid that from the application which was a CICS app. The upside was that the app didn't have to be rewritten and you could query the data directly in DB2..
CEWII
July 16, 2011 at 5:45 pm
Not using DRI and instead relying on applications to ensure business rules feels to me like not wearing underwear: you might be able to get away with it most of the time but it's extra embarrassing when you are caught with your pants down
In OLTP there is no valid reason to skip DRI it is the implementation of one of the most important Algebra of the 20th century "If A references B then B must exist" Chris Dates.
When DRI is skipped in third party application data access is generally created with ORM tool and in some cases data expert may be needed to clean up the relational model of the application.
And DRI is also not new Microsoft implementation was in 1999 the 2000 edition.
Kind regards,
Gift Peddie
Viewing 15 posts - 16 through 30 (of 35 total)
You must be logged in to reply to this topic. Login to reply