November 7, 2002 at 2:39 pm
Situation:
You have some applications, all running on separate SQL Server databases (all on the same server). These applications have some tables that store similar (but not always identical) data to each other; for instance, each may store Employee information in an Employee table.
You want to centralize the storage of all this common data into centralized tables, and at the same time ensure that constraints are enforced to all application databases, so what do you do:
a) remove the tables with shared data from your application databases and install them in a new COMMON database with all common shared tables. In each application database, implement your constraints as triggers firing to the common database to ensure integrity.
or...
b) Create a single database with the shared tables, application data tables, all together and implement your constraints and all application rules within a single, centralized database. This would mean converting all your application databases and application code into a centralized system, and implementing common data through a set of common tables with constraints to all your application data tables within this single database.
Basically, is it better to go with a large, single, centralized data store with all shared and application data tables, or go with numerous application databases tied through triggers to a separate, common database.
I welcome any and all opinions and/or expertise. Thanks so much!
Jay
November 7, 2002 at 2:47 pm
I preferred the shared approach. Common data can be best handled that way while still allowing you to generate table names for specific applications that make since. For instance an app for scheduling and one for process would not mix very well into a common table named schedule but the table name makes perfect sense for each. Plus, you will have to come up with a method to identify each table and how it relates to a particular app, so the names have to become more descriptive and include the app name. On top of all that it is not hard at at all to work with multiple databases and the readability will be much easier.
These of course are my opinion.
November 7, 2002 at 2:56 pm
Actually, we are in a situation now in which we have multiple databases, however, I was under the impression that one could not enforce a CONSTRAINT across multiple databases. Although I see your point about readability, it is not a major concern at this point. My major concerns are the following:
a) What are the performance ramifications, if any, of having to use triggers to enforce common data integrity across multiple databases since a CONSTRAINT cannot be used.
b) If I use a separate database to store common information, such as employees, office, etc that are used in all application databases, will stored procedures in the application databases that access that common data be able to cache the other database's data as efficiently as when all the data resided in a single database
I do appreciate your comments and suggestions. Thanks.
November 7, 2002 at 3:04 pm
quote:
a) What are the performance ramifications, if any, of having to use triggers to enforce common data integrity across multiple databases since a CONSTRAINT cannot be used.
When you state constriants can you be a bit more specific. If you mean PK to FK then right Trigger is your only option but then you have to weigh what the trigger does to the table in the common database how it affects the others and how to deal with that. If it is a constraint like 1 db can only have users who are an Admin then these can be added to the business logic of your SP's and processed without hitting the final table.
quote:
b) If I use a separate database to store common information, such as employees, office, etc that are used in all application databases, will stored procedures in the application databases that access that common data be able to cache the other database's data as efficiently as when all the data resided in a single database.
The execution plans are stored as they are with local objects involved and indexes in the other database are considered. I have had no troubles here.
November 7, 2002 at 3:12 pm
Point b) has been answered. Thank you.
As for point a) , yes, I was referring to a PK_ to FK_ relationship. I would have to use a trigger on the referencing table to ensure that the value existed in the referenced table (obviously, I'm trying to accomplish what a FK constraint would do for me if everything were in a single database.)
I am not sure what you mean by
quote:
If it is a constraint like 1 db can only have users who are an Admin then these can be added to the business logic of your SP's and processed without hitting the final table.
All I am trying to ascertain here is whether it is worth the performance hit (regarding the trade-off between the FK constraint and adapted "trigger constraint") to have commonly shared data reside in tables in another database.
November 7, 2002 at 3:52 pm
Just ignore that peice. I was thinking thru all the constraints and that is usually a trigger thing. I use references to control tables where they have a set range 1-100 is particular to one database, 101-200 is another. To enforce intergrity many people use PK-FK relationships or a trigger to validate. I use a Check Constraint instead to save reads to those tables for performance. For some reason I was getting it mixed in while I was typing and thinking (kinda like walking and chewing gum, ain't good at it at times).
November 8, 2002 at 7:31 am
I completely understand the walking and chewing gum dilemma. So, to sum up then, you believe it is better to have commonly shared data stored in a separate database, and have triggers enforce FK relationships to tables in other databases that use those common values.
In other words, a table of employees, with employee code, name, department, etc. would be stored in one database. That employee code would be referenced (conceptually, of course since an actual FK constraint would not be possible) by an employee code in a payroll database, a sales database, an orders database, and so on. Correct?
November 8, 2002 at 7:42 am
Sounds right to me.
December 16, 2002 at 5:24 am
quote:
In other words, a table of employees, with employee code, name, department, etc. would be stored in one database. That employee code would be referenced (conceptually, of course since an actual FK constraint would not be possible) by an employee code in a payroll database, a sales database, an orders database, and so on. Correct?
Excellent topic. I have lived with this problem for a few years now and solution is not yet implemented. But here is my situation:
Our company started porting separate appplications for Accounting, Sales, Production, ... all the stuff a serius company depends on, from Access database to SQL Server. Once databses were on the same server, clients requested that shared data should be unified and that they should be required to enter the same data only once. Since lookup tables in different applications basically had the same data, but not the same table structure or data details (some details are requred for some app, and not for some other). We started by creating triggers to synchronize relevant data. For example: a new client was entered in Accounting with all attributes (details) requred for accountants. Trigger copied the data relevant for Production (clientID, name, address, ...) to production.dbo.client table. In production they only needed to add their specific data. Syncronization goes the other way around too, including delete. At minimum the clientID must be the same to know which client we are talking about when invoices are transferred from production to accounting.
The number of "integrated" databases increased and so did trigger complexity which has reached a level of unmanagebility.
Enough about the problems, let's try solution ideas. There was a talk about cross database foreign keys before SQL Server 2000 came out. To our disapointment nothing happened. I really can't imagine a serius business application set without sharing data between separate databases. Doesn't anybody do that?
So I had to figure out something else. Common data (I'll call it look-up tables, since I don't have a better translation) should be separated from all databases in one "Look-up database". A client table should contain as much detail (attributes, fields) to cover all the applications' needs. A stripped copy of the client table should be in each database. By "stripped" I mean it should contain at least a primary key that is used for FKs. Here we solved the integrity problem. All updates to common data (look-up tables) should be done using a separate application, which has well managed security since for example client data should be mainained by a responsible person. Imagine sending money to the wrong client, because somebody (anybody) changed their bank account number. Maybe you would notice it when the client asked fot the money allready paid (or so you thought). Triggers would take care of replicating the codes (PK) from Look-up database to local copies for data integrity in each database. Each stripped look-up table would also have a view that is linked to the subset of detail needed in this applicaton. That view would be used instead of the look-up table everywhere in the application except for FK (example: ComboBox, Report).
This idea is very flexible about schema changes in each database. There is no need for complex triggers to maintain referential integrity. There is also cross database referential integrity. Example: if we try to delete one customer code that is not used in one database, trigger would try deleting it in Look-up databse and consequently in all underlying databases. If this code is used in just one database, FK violation would cause rollback.
How do you like this one? 😉
Best regards, Jani
PS: were you wondering about cascading triggers?
December 19, 2002 at 2:59 pm
I talked to a very talented Microsoft Sales rep. who designed extensive applications using SQL Server applications. We have three applications which all share common person/employee data tables and lookup tables. He suggested I put all of the application tables into one database. That typically when one separates databases is more for high-transaction/OLTP systems vs. reporting/datamart/archival systems.
He informed me that the overhead of having additional tables in the database would be minimal compared to the problems keeping the database tables updated correctly. Also, we allow our clients to seamlessly use an Access db backend, which has no triggers to maintain RI; therefore, a code solution would have to be maintained to keep the data files in sync.
The benefits include sending one update script to allow our clients to be updated to the latest data model, much less complexity than dealing w/ separate databases, and a data model solution that can be fairly similar for Access, SQL Server and Oracle. The main problem is customers that want to operate two individual systems on the same server (aka. keep their employee data separate). This is resolved by allowing the application to connect to a locally configurable datbase connnection.
Good luck and hope this helps.
December 19, 2002 at 3:33 pm
I still have to point out the one major flaw that always is overlooked with a single DB solution. Many times you may have some related data but much is sometimes not. You need to consider how they relate, who will maintain them and what specifically happens when data is lost. If you have all your tables in one DB then unrelated material will be lost and all user applications will be affected by the need to redo.
But if you have key fields that under no circumstance are generated by the DB then things are not as glum as you picture it. Also, in many DBs with related data you don't want to delete data so RI is only a verification of existance and active not a removal.
Where I work we have a DB generated id and an Employee ID we use to login to all our systems. If we used the DB generated one and several people were lost then how do we find out what ID to link with what. In our case we use the Employee ID in all related items to have a constant non-changing value for the employee which even if the record is lost that employee id is what will be keyed back in. So even if all the other DBs which reference that value no longer have it to reference we now have a way to know whos data is missing and what to put back in.
Multiple DBs can be a task to setup and administer. But they offer (if done properly) many plus such as loss to only one application with regards to data, seperation of data so users apps cannot have adverse effects on others data, more logical references based on app (which of course is for the admin to plan this), table with similar functions can have the same name even if structure is different and still others which I am working on a overview and article with regards to handling these, that I will post if I ever get off my lazy butt to finish.
December 20, 2002 at 7:18 am
I agree that the single database may be easier to set up and maintain, however, Antares686 does address some points regarding multiple database architecture which I have found very compelling in recent weeks. Much of the basis of your argument regards the maintenance of code and database administration. It may seem that having a single database is theoretically easier to maintain, however, consider the following point.
I have found that maintenance of application databases, and indeed application code, relies very heavily on how cleanly and logically business entities are represented. While this may sound like a high-horse statement, it really isn't. Antares686 is dead on when he brings up the advantage of being able have 2 identically named tables; one in a common database and another in an application database, that store separate (and also related) data.
Perhaps it is my object-oriented persuasion, but if I have a shared database called common, and a payroll database called payroll, with an employee table in each, in my SQL, if I refer to common.employee or payroll.employee, I immediately know there is a difference, and that difference is that one stores common, or shared data, the other stores application-specific data.
If I had both employee tables (or worse, one giant employee table) in a single database, the semantics of referring to each entity would be messy (SELECT emp_common.firstname, emp_payroll.ssn FROM emp_common...). The database tables themselves no longer reflect true entities, but abstractions, or parts of entities. I believe one of the great side-effects of object-oriented methodology / theory is that its phrasing is more English-like. The more English-like the semantics, the easier it is for junior programmers or DBAs new to the system to pick up on how everything fits together. The easier a person can do this, the easier a system is to maintain.
I hope a few a these points were clear enough. Thanks for the input. I'm enjoying this thread thoroughly. 🙂
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply