November 15, 2007 at 8:14 am
At this point there would be the dual problems of fixing the data while still running the current applications, and maybe having to make up new processes, because some things don't always flow in the right order.
It could be done over, say, 2-3 years, but I think the belief is that it's cheaper to just buy new hardware, which is dirt cheap these days, and throw horsepower at the problem, or fix the data later if there is a problem, since the people who would do that also do other jobs that would still be required.
November 15, 2007 at 8:34 am
There shouldn't be data to fix if the application is truly handling referential integrity. The mere fact that this is a concern tells me that you have the illusion that the application is correctly handling your data integrity. Like Brian said, if your application is working as you say, adding the FKs shouldn't cause you any problems. 😀
Also, FKs help with analysis, as they define relationships between data. Without them, you are stuck having to read through code to find out how your data is related. I hope your DBAs are fairly Java savvy. :Whistling:
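As a minimal illustration (table and column names here are hypothetical, not from the system under discussion), declaring the relationship lets the engine itself reject orphan rows, with no application code involved:

```sql
-- Hypothetical schema: every order must reference an existing customer.
CREATE TABLE Customers (
    CustomerID INT NOT NULL PRIMARY KEY,
    Name       NVARCHAR(100) NOT NULL
);

CREATE TABLE Orders (
    OrderID    INT NOT NULL PRIMARY KEY,
    CustomerID INT NOT NULL
        CONSTRAINT FK_Orders_Customers
        REFERENCES Customers (CustomerID)
);

-- Fails with a constraint violation: customer 999 does not exist,
-- no matter which application (or ad hoc query) attempts the insert.
INSERT INTO Orders (OrderID, CustomerID) VALUES (1, 999);
```

The declared constraint also documents the relationship in the catalog itself, so anyone analyzing the schema can find it without reading the Java.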
November 16, 2007 at 4:14 am
We have something similar implemented, where database integrity is not being maintained by relationships between tables; instead they have decided to use triggers to try to address some of the issues.
Trust me, the trouble is not worth it. I think there is no substitute for good old PK-FK relationships.
November 16, 2007 at 6:23 am
jayanth j kurup (11/16/2007)
We have something similar implemented, where database integrity is not being maintained by relationships between tables; instead they have decided to use triggers to try to address some of the issues. Trust me, the trouble is not worth it. I think there is no substitute for good old PK-FK relationships.
I can see the use of triggers when there is business logic to be maintained or you have to cross databases. But otherwise you're exactly right: there's no substitute for the proper use of FKs.
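For comparison, here is a sketch of what trigger-based "referential integrity" typically looks like (hypothetical Orders/Customers tables, not anyone's actual system). Note how much code it takes to cover just the child-side insert/update case; a declared FK does this in one line and also guards deletes and updates on the parent:

```sql
-- Hypothetical trigger-based check on the child table only.
-- It misses cases a real FK handles for free: deletes or key updates
-- on the parent, and bulk loads run with triggers disabled.
CREATE TRIGGER trg_Orders_CheckCustomer
ON Orders
AFTER INSERT, UPDATE
AS
BEGIN
    IF EXISTS (SELECT 1
               FROM inserted i
               LEFT JOIN Customers c ON c.CustomerID = i.CustomerID
               WHERE c.CustomerID IS NULL)
    BEGIN
        RAISERROR ('Order references a missing customer.', 16, 1);
        ROLLBACK TRANSACTION;
    END
END;
```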
K. Brian Kelley
@kbriankelley
November 16, 2007 at 6:46 am
Thanks, guys... I will keep you updated on my fight for 'Foreign Keys'.
November 16, 2007 at 1:01 pm
I'll have to disagree here. I think that database design is black and white. There is a correct way to design proper, normalized relational data structures. You may at some point make exceptions to normalization or FK constraints for performance reasons, but they should be just that: exceptions, not grey areas. I'm not sure of your job position, but if you are a database designer, architect, or administrator, question everything that your developers do 'because they like it that way'. If you are going to use a database to house unrelated, unnormalized data, why not just use spreadsheets on fast disks?
The problem is that this attitude will force the existence of those spreadsheets/access databases/word files full of data that should be in the database but can't be entered because it has to be both perfect and complete first.
More constraints equals better data quality, at the price of system usability. Fewer constraints equals more usability, at the price of system integrity. As DBAs, almost everyone will say, "Well, that's obvious - integrity is so much more important!" And then you find that in the real world the users are busily working around your perfectly architected solution because they can't enter their imperfect, real-world data.
Ultimately, it's a balancing act. You need enough foreign keys, data constraints, etc. to keep the data reasonably clean, and enough flexibility to allow the entry of dirty, incomplete, and occasionally contradictory real-world data. Where that line is drawn differs from company to company and application to application. Trying to draw the line way off to one side or the other leads to users disliking IT as "not getting it" and "their stuff doesn't do what we need it to", and to the proliferation of all these side "projects" that undermine company-wide data integrity.
It's kind of like passwords. The stronger and better and more secure the password is, the more likely it is to be written on a sticky note and pasted on the monitor for everyone to see.
Steve G.
November 16, 2007 at 1:24 pm
I respectfully disagree. Properly designed, constraints do not impact usability. In the case of dirty data, if the data is ambiguous, how do you establish relationships based on it? The short answer is you don't, really. You make some logical guesses that can be completely wrong. Example? See MySpace and the purging of the "sex offenders."
K. Brian Kelley
@kbriankelley
November 16, 2007 at 1:30 pm
The problem is that this attitude will force the existence of those spreadsheets/access databases/word files full of data that should be in the database but can't be entered because it has to be both perfect and complete first.
More constraints equals better data quality, at the price of system usability. Fewer constraints equals more usability, at the price of system integrity. As DBAs, almost everyone will say, "Well, that's obvious - integrity is so much more important!" And then you find that in the real world the users are busily working around your perfectly architected solution because they can't enter their imperfect, real-world data.
Ultimately, it's a balancing act. You need enough foreign keys, data constraints, etc. to keep the data reasonably clean, and enough flexibility to allow the entry of dirty, incomplete, and occasionally contradictory real-world data. Where that line is drawn differs from company to company and application to application. Trying to draw the line way off to one side or the other leads to users disliking IT as "not getting it" and "their stuff doesn't do what we need it to", and to the proliferation of all these side "projects" that undermine company-wide data integrity.
It's kind of like passwords. The stronger and better and more secure the password is, the more likely it is to be written on a sticky note and pasted on the monitor for everyone to see.
Steve G.
I'll agree as long as your database is nothing more than a convenient, centralized place for users to put stuff, but if you are developing application systems for resale or to be used as an enterprise-wide solution, and you care about performance and integrity, you'll want to put some serious time into designing your data structures, not only for scalability and performance but also so they are relationally sound.
November 16, 2007 at 4:29 pm
I respectfully disagree. (et al.)
Y'all are certainly welcome to. And as long as you're working in a nice, sterile environment, you'll have a lot of success. But let's take a more real-world example: tracking the homeless who come into or are referred to shelters. You have someone come in the door. At this point you must enter information. You get a name that is probably wrong. You may or may not get an SSN, but it's not only almost certainly wrong, it's probably the same one that 5 or 6 other people gave you (FYI, SSNs that "work" are commonly shared by dozens of homeless people). You may never get the correct SSN (as it may not exist). Some of them will have an address (may be right, may be wrong), but many will not. Their "address" may be 'under a bridge'. Basically, your user needs to enter the information that they have, regardless of its correctness.
Yes, there are lots of opportunities for appropriate FK and constraints in this type of application - but it would be really easy to go from appropriate to unusable trying to enforce only "correct" data entry. A good example of a bad constraint for this application would be having a non-null, unique constraint on the SSN.
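In SQL Server terms, a workable middle ground for a column like SSN might look something like this sketch (table and constraint names are hypothetical): keep the column nullable and non-unique, and enforce only what is actually safe to enforce, such as the format:

```sql
-- Hypothetical: SSN is optional, may legitimately repeat across rows,
-- and is only loosely validated. A NOT NULL UNIQUE constraint here
-- would make the intake form unusable.
CREATE TABLE Clients (
    ClientID INT IDENTITY (1, 1) PRIMARY KEY,
    FullName NVARCHAR(200) NOT NULL,
    SSN      CHAR(11) NULL
        CONSTRAINT CK_Clients_SSN_Format
        CHECK (SSN IS NULL
               OR SSN LIKE '[0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]'),
    Address  NVARCHAR(400) NULL   -- free text; 'under a bridge' is valid data
);
```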
The typical attitude that 'the more you tighten up a system the better it is' is one of the primary sources of the "glass house" mentality and a direct cause of users defecting from corporate systems to homegrown spreadsheets and Access applications. There's a balance between correctness and usability, just as there is with security, and it's easy to err on one side or the other.
Steve G.
November 16, 2007 at 5:58 pm
aureolin (11/16/2007)
I respectfully disagree. (et al.)
Y'all are certainly welcome to. And as long as you're working in a nice, sterile environment, you'll have a lot of success. But let's take a more real-world example: tracking the homeless who come into or are referred to shelters. You have someone come in the door. At this point you must enter information. You get a name that is probably wrong. You may or may not get an SSN, but it's not only almost certainly wrong, it's probably the same one that 5 or 6 other people gave you (FYI, SSNs that "work" are commonly shared by dozens of homeless people). You may never get the correct SSN (as it may not exist). Some of them will have an address (may be right, may be wrong), but many will not. Their "address" may be 'under a bridge'. Basically, your user needs to enter the information that they have, regardless of its correctness.
Yes, there are lots of opportunities for appropriate FK and constraints in this type of application - but it would be really easy to go from appropriate to unusable trying to enforce only "correct" data entry. A good example of a bad constraint for this application would be having a non-null, unique constraint on the SSN.
The typical attitude that 'the more you tighten up a system the better it is' is one of the primary sources of the "glass house" mentality and a direct cause of users defecting from corporate systems to homegrown spreadsheets and Access applications. There's a balance between correctness and usability, just as there is with security, and it's easy to err on one side or the other.
And from a design perspective, you've told me that your data doesn't have a usable primary key. That means you go with a surrogate key, such as an IDENTITY column. SSN is a known bad choice for enforcing uniqueness, because even in cases where people are on the up and up you can sometimes get conflicts. So if I have a foreign key to this table, it would be to that surrogate key, which means you can enforce referential integrity. Would you want to do this? Absolutely.
And one place it can be helpful is a system where you enter an address. The system automatically looks up similar addresses, which are stored in a different table, using Full-Text Search or some type of fuzzy matching, and helps you zero in on addresses that are given differently but are really the same in practice. You handle the referential integrity with a join table between person and address, and that arrangement suddenly gives you something else: if you have multiple hits against an address from person records that are really, really similar, it allows someone to evaluate and potentially merge the records, thereby streamlining your database.
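A T-SQL sketch of the arrangement just described, with all names hypothetical: the IDENTITY surrogate key carries referential integrity even though no natural key is reliable, and the join table makes the merge-candidate query possible:

```sql
-- Hypothetical: no natural key is trustworthy, so an IDENTITY
-- surrogate key anchors all referential integrity.
CREATE TABLE Person (
    PersonID INT IDENTITY (1, 1) PRIMARY KEY,
    FullName NVARCHAR(200) NOT NULL,
    SSN      CHAR(11) NULL           -- optional, unverified, may repeat
);

CREATE TABLE Address (
    AddressID INT IDENTITY (1, 1) PRIMARY KEY,
    Line1     NVARCHAR(400) NOT NULL
);

-- The join table carries the person-to-address relationship,
-- with FKs to both surrogate keys.
CREATE TABLE PersonAddress (
    PersonID  INT NOT NULL REFERENCES Person (PersonID),
    AddressID INT NOT NULL REFERENCES Address (AddressID),
    PRIMARY KEY (PersonID, AddressID)
);

-- Addresses linked to multiple person records: candidates for the
-- evaluate-and-merge pass described above.
SELECT pa.AddressID, COUNT(*) AS PersonCount
FROM PersonAddress pa
GROUP BY pa.AddressID
HAVING COUNT(*) > 1;
```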
K. Brian Kelley
@kbriankelley
November 18, 2007 at 8:03 pm
Foreign keys rule in development databases.
It's in production where I have always run into problems with them. When a table has a foreign key, you can't pull the table from a backup database and then rename the tables to instantly restore a table that someone killed. You also can't truncate a table with foreign keys on it without cleaning up all the attached tables first (which comes in handy if you run out of disk space: you can truncate non-keyed tables and get the customer back up instantly).
I guess I have had too many production outages to ever use a foreign key in a production database. Although, if you have them, there is always the option of pulling up the table in the diagramming tool and deleting them (trust me, this is the fastest way when you are in a hurry).
Keep an empty SCHEMA database that has all the tables and keys in it, but never move those keys into production unless you have a clear plan for what to do if there is a problem.
Paul Ross
November 18, 2007 at 8:38 pm
pdross2000,
You are correct that you cannot truncate a table referenced by a foreign key, and that it may make spot-restoring tables a bit more difficult, but you are placing the blame in the wrong spot. Foreign keys don't cause you problems with the TRUNCATE command; they are a safety net telling you 'wait a minute!'. If you are in the habit of truncating tables, I think that is a great reason to leave your FKs in place. A FK means that the data may have related data in another table. Making you go through the extra step of dropping/re-adding the FK to make TRUNCATE work just means that you are going to be absolutely sure that this data should be deleted (hence ensuring the integrity of your data... integrity constraint 🙂 ). As for the spot restores, referential and data integrity are completely different concerns from backup/restore. You should not design your data integrity rules around how easy you want it to be to spot-restore data. Leaving your FKs in place only means that you have to write a couple of JOINs when restoring your data. To me, no big deal.
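The drop/truncate/re-add sequence looks something like this (hypothetical table and constraint names). The re-add is where the safety net shows itself: adding a FK checks existing data by default, so it fails until the child table has been cleaned up too:

```sql
-- TRUNCATE TABLE Customers fails while FK_Orders_Customers references it,
-- so the workaround is a deliberate three-step operation.
ALTER TABLE Orders DROP CONSTRAINT FK_Orders_Customers;

TRUNCATE TABLE Customers;

-- ALTER TABLE ... ADD CONSTRAINT validates existing rows by default
-- (WITH CHECK), so any orphans left in Orders surface immediately here.
ALTER TABLE Orders
    ADD CONSTRAINT FK_Orders_Customers
    FOREIGN KEY (CustomerID) REFERENCES Customers (CustomerID);
```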
November 19, 2007 at 5:33 am
Another consideration is the source of the data -- the source of our DB data is a mainframe (non-relational) application from which we download biweekly. In that situation, foreign key violations would simply make our data load fail. Consequently, we take the data as it is, and rely on various audit queries to detect violations after the data has been loaded.
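An audit query of the sort described can be a simple anti-join (the table and column names below are hypothetical, not the actual mainframe schema): keep only the child rows with no matching parent, i.e. exactly the rows a foreign key would have rejected at load time.

```sql
-- Orders rows whose CustomerID has no match in Customers:
-- the referential-integrity violations in the loaded data.
SELECT o.OrderID, o.CustomerID
FROM Orders o
LEFT JOIN Customers c ON c.CustomerID = o.CustomerID
WHERE c.CustomerID IS NULL;
```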
November 19, 2007 at 6:10 am
Jim Russell (11/19/2007)
Another consideration is the source of the data -- the source of our DB data is a mainframe (non-relational) application from which we download biweekly. In that situation, foreign key violations would simply make our data load fail. Consequently, we take the data as it is, and rely on various audit queries to detect violations after the data has been loaded.
What I've seen done in this case is to follow the three-step process: Extract, Transform, and Load (ETL). The transform stage cleans up the data, making sure it conforms and is as clean as possible, before it is loaded into the warehouse.
K. Brian Kelley
@kbriankelley
November 19, 2007 at 7:26 am
I've inherited a system in which all the foreign keys are disabled and triggers are being used to enforce referential integrity. The nice part about this setup is that errors can be handled more gracefully in a trigger and returned to the application in a more useful form. Having the foreign keys in place, but disabled, makes it easier to maintain an ERD. Not quite sure about the downside of this setup yet since I haven't been able to profile a performance comparison of triggers vs. foreign keys with this application. Anyone have any thoughts?
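For reference, a sketch of what that disabled-FK arrangement looks like in T-SQL (the constraint name is hypothetical): the key stays in the catalog, so it still appears in diagrams, but it no longer validates anything. One caution worth noting: a disabled constraint is also marked not-trusted, so the optimizer cannot use it to simplify query plans.

```sql
-- Disable an existing FK: it remains in the catalog (and in ERDs)
-- but no longer validates inserts or updates; a trigger would have
-- to do the actual checking.
ALTER TABLE Orders NOCHECK CONSTRAINT FK_Orders_Customers;

-- Inspect the state: a disabled FK shows is_disabled = 1 and
-- is_not_trusted = 1 in the catalog view.
SELECT name, is_disabled, is_not_trusted
FROM sys.foreign_keys
WHERE name = 'FK_Orders_Customers';
```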
Viewing 15 posts - 31 through 45 (of 87 total)