Design question - FK column to more than one table

Question

Design question - FK column to more than one table

William Plourde

SSCrazy

Points: 2035
More actions
March 1, 2012 at 7:38 am

#256891

I've seen this before and I've had to do this again lately and I can't think of an alternative strategy. Basically I have a column that contains a foreign key to more than one table. Because it may be a key to one of N tables the column cannot have a foreign key and we have to add a code column so we know which table the fk is for.
Are there any common design patterns or alternative strategies for doing this ?
Example:
MainTable
ID INT NN PK
OtherTableCode VARCHAR NN
OtherTableID INT NN **cannot have a fk here **
SomeOtherTableA
ID INT NN PK
Col1
Col2
...
SomeOtherTableB
ID INT NN PK
Col1
Col2
...
MainTableRows
1 TabA 100
2 TabB 100
SomeOtherTableA
99 xxx xxx
100 xxx xxx
101 xxx xxx
SomeOtherTableB
99 xxx xxx
100 xxx xxx
101 xxx xxx

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply

Orlando Colamatteo SSC Guru Points: 182293 More actions · Answer 1

What you've described wanting to do is to assign two meanings to one column...not good. I will assume that TableA represents an entity, independent of the entity in TableB, otherwise they would be merged into one table. Is that an accurate assumption? If they are truly independent, and the entity in MainTable has need to reference one or the other as a property of itself then they should be referenced by MainTable separately. If you have a business rule that says that only one or the other should be referenced by MainTable, but never both at the same time, then you can implement a table-level CHECK CONSTRAINT to ensure that either MainTable.TableAId or MainTable.TableBId remain NULL at all times, i.e. that not both are populated.

Without knowing more about your data or situation, it's tough to elaborate further, but suffice it to say that having one column (ManiTable.OtherTableID) potentially mean two different things is violating a normalization principle that I would strongly urge you away from doing.

A basic example of what I described with a check constraint:

USE tempdb

GO

IF EXISTS ( SELECT *

FROM sys.objects

WHERE object_id = OBJECT_ID(N'dbo.MainTable')

AND type IN (N'U') )

DROP TABLE dbo.MainTable

IF EXISTS ( SELECT *

FROM sys.objects

WHERE object_id = OBJECT_ID(N'dbo.TableA')

AND type IN (N'U') )

DROP TABLE dbo.TableA

IF EXISTS ( SELECT *

FROM sys.objects

WHERE object_id = OBJECT_ID(N'dbo.TableB')

AND type IN (N'U') )

DROP TABLE dbo.TableB

GO

CREATE TABLE dbo.TableA

(

TableAId INT IDENTITY(1, 1)

PRIMARY KEY

)

CREATE TABLE dbo.TableB

(

TableBId INT IDENTITY(1, 1)

PRIMARY KEY

)

CREATE TABLE dbo.MainTable

(

MainTableId INT IDENTITY(1, 1)

CONSTRAINT [pk_dbo.MainTable] PRIMARY KEY,

TableAId INT NULL

FOREIGN KEY REFERENCES dbo.TableA (TableAId),

TableBId INT NULL

FOREIGN KEY REFERENCES dbo.TableB (TableBId)

) ;

ALTER TABLE dbo.MainTable ADD CONSTRAINT [ck_] CHECK (TableAId IS NULL OR TableBId IS NULL) ;

GO

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato

William Plourde SSCrazy Points: 2035 More actions · Answer 2

Yes, you are correct in everything you wrote. TableA and TableB are separate entities. Yes, this is a bad practice because it violates normalization rules. I would very much like to do something different but I'm having trouble figuring out an alternate strategy.

To give you a concrete example. Assume this is a part of a home grown auditing system. The MainTable would have an enforceable FK to a lookup table(Id, Name) that lists all the tables in the database. The multi-use column would then have the id of the 'other' tables' modified row.

Using a separate column for each possible table in the database would not work because we would have to add a column to every table in the database each time we added a table.

There *should* be a different way to do this but I'm having trouble figuring it out. Basically I want to implement a centralized auditing system or subset of tables to accomplish this.

Orlando Colamatteo SSC Guru Points: 182293 More actions · Answer 3

William Plourde (3/5/2012)
Yes, you are correct in everything you wrote. TableA and TableB are separate entities. Yes, this is a bad practice because it violates normalization rules. I would very much like to do something different but I'm having trouble figuring out an alternate strategy.

As long as you know 😀

Using a separate column for each possible table in the database would not work because we would have to add a column to every table in the database each time we added a table.

Why every table? Did you mean we would have to add a column to MainTable each time we added a new table to the database? The only thing I can see wrong with that is you exhausting the limit on the number of columns in a SQL Server table (1024) with the number of tables in your DB, a real concern in many databases. All but one of your columns will be NULL this model so the Sparse Columns feature may become helpful to you in implementing MainTable.

There *should* be a different way to do this but I'm having trouble figuring it out. Basically I want to implement a centralized auditing system or subset of tables to accomplish this.

If you want "centralized" you may need to part with the idea of maintaining referential integrity, i.e. say goodbye to FKs 🙂

If you do go centralized then might I recommend piggy-backing on the SQL Server metadata and storing the table's object_id from sys.tables instead of creating and maintaining your own lookup table.

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato

JQAllen SSChasing Mays Points: 617 More actions · Answer 4

To follow up on the last post. Try the following, where the object_id column contains the object_id from sys.objects for the table being referenced.

IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[Audit].[MainTable]') AND type in (N'U'))

DROP TABLE [Audit].[MainTable]

GO

IF EXISTS (SELECT * FROM sys.schemas WHERE name = N'Audit')

DROP SCHEMA [Audit]

GO

CREATE SCHEMA [Audit] AUTHORIZATION [dbo]

GO

--SELECT * FROM MAINTABLE

CREATE TABLE Audit.MainTable(

MainTableId int IDENTITY(1,1) not null

CONSTRAINT [PK_MainTable] PRIMARY KEY,

object_id int NOT NULL,

Id int NOT NULL)

ALTER TABLE Audit.MainTable ADD CONSTRAINT [AK_MainTable] UNIQUE NONCLUSTERED

(object_id ASC, Id ASC)

I don't particularly like this either. Is your main table gong to record the key value for each (table, row) that is updated/inserted? If so, this would not account for tables that have composite primary keys.

[font="System"]Jay Quincy Allen, Managing Partner, Relational Models LLC[/font]

JQAllen SSChasing Mays Points: 617 More actions · Answer 5

You may want to take a look at Microsoft Project REAL.

http://technet.microsoft.com/en-us/library/cc966416.aspx

You could add something like an ExecutionId to every table in your database. When a row is inserted or updated, the ExecutionId value is an FK back to an Audit.Execution table. The Audit.Execution table could be your centralized point to record whatever it is you want to store as audit info.

[font="System"]Jay Quincy Allen, Managing Partner, Relational Models LLC[/font]

Michael L John One Orange Chip Points: 27079 More actions · Answer 6

To possibly go off track, when you say this is for the purposes of auditing, you do realize that maintaining this relationship defeats the purpose auditing?

For starters, the auditing system should be completly independent of the system being audited. By creating these relationships, this violates this principle.

From a practical standpoint, what happens when you need to archive the data being audited, and need to delete these rows? You can't. The relationships will prevent that.

There is far more to auditing than can be written in a forumn post, i suggest that you possibly try to do some more research before you embark on what sounds like a lot of work.

Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/

William Plourde SSCrazy Points: 2035 More actions · Answer 7

Thank you all for the information. FWIW this system already exists. Without going into detail we store every version of every row for our primary entities. We also use this data to export any changes back to vendor systems that we integrate with. Except for the unenforceable foreign key it actually works rather well, but I am still exploring the issue in hopes that sometime in the future I can improve it.

I have a gut feeling that I may be able to switch to enforceable keys if I flip some of my table relationships but like I've said earlier the exact solution is still a little fuzzy.

Also, this is healthcare data so it's not as crazy as it sounds to store multiple versions of a data row in a table. For example, if a person moves 10 times we would have all 10 addresses associated with the person and we would know which is the most recent.

Stephanie Giovannini SSCertifiable Points: 7422 More actions · Answer 8

William Plourde (3/5/2012)
Without going into detail we store every version of every row for our primary entities. We also use this data to export any changes back to vendor systems that we integrate with.

Since this data is used for integration, it's not an auditing system, though it may also serve that purpose. You are describing a generalized history table.

I've worked on systems that require history tables. I create one history table per main table. Depending on requirements, the history table may reference the main table via foreign key.

But, your application wants a single table with all the history in it. Create a view to represent "all history."

CREATE TABLE App_A

(

app_a_id INT IDENTITY PRIMARY KEY,

some_name VARCHAR(20)

)

CREATE TABLE App_A_History

(

app_a_id INT NOT NULL REFERENCES App_A(app_a_id),

updated_date DATETIME2 DEFAULT SYSDATETIME() NOT NULL,

some_name VARCHAR(20),

PRIMARY KEY (app_a_id, updated_date)

)

CREATE TABLE App_B

(

app_b_id INT IDENTITY PRIMARY KEY,

some_data NVARCHAR(40)

)

CREATE TABLE App_B_History

(

app_b_id INT NOT NULL REFERENCES App_B(app_b_id),

updated_date DATETIME2 DEFAULT SYSDATETIME() NOT NULL,

some_data NVARCHAR(40)

PRIMARY KEY (app_b_id, updated_date)

)

GO

CREATE VIEW All_History

AS

SELECT app_a_id AS Id, updated_date, 'A' AS table_code

FROM App_A_History

UNION ALL

SELECT app_b_id AS Id, updated_date, 'B' AS table_code

FROM App_B_History

This may not be practical in your existing system, but if you can conceptualize how it should work, you can slowly move in that direction.

JQAllen SSChasing Mays Points: 617 More actions · Answer 9

I would strongly advise "against" using history or audit tables for several reasons. You will encounter performance issues, as well as complicate your ability to query your data. What you are describing is a requirement for "temporal modeling".

http://www.adamasystems.com/temporaldata.html

TemporalData.com

BiTemporalData.com

[font="System"]Jay Quincy Allen, Managing Partner, Relational Models LLC[/font]

William Plourde SSCrazy Points: 2035 More actions · Answer 10

We also have one history table for each base table that we require versioned data on. Our base table(s) generally only have a few Ids with all or most of the data in the corresponding version table for each base table.

This gives us our versioned data. But we ALSO have an audit trail so that when a row changes we store an entry. This way we have a centralized list of data that has changed in the system that we can then use to determine what to send back to the connected systems for updating.

William Plourde SSCrazy Points: 2035 More actions · Answer 11

oops, sorry for the duplicate response. Pesky page 2 link. doh !

We do have a similar table arrangement. We have a history table for each base table. The base table and the history table for each logical entity are FK'd. We do have primary keys, foreign keys, covered indexes etc so our querying is a little cumbersome at times but generally performs very well.

This give us our versioned data over time.

We ALSO have an audit trail so we know which rows for which entities have to be exported to external systems. For example if a person moves. We create a new address version and we write a pointer to the audit system with the address version row id so we know what to send out of the system.

Orlando Colamatteo SSC Guru Points: 182293 More actions · Answer 12

You may be re-inventing the wheel a bit. Depending on how unique your situation is you may be able to leverage some built-in features of SQL Server. Have you looked into Change Data Capture and Change Tracking?

http://msdn.microsoft.com/en-us/library/cc280519.aspx

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato

JQAllen SSChasing Mays Points: 617 More actions · Answer 13

Change Data Capture and Change Tracking are useful for auditing, but these features are not the same thing as 'temporal modeling'. Among other things, they have no support for valid time or temporal referential integrity. Also, you must use the change data capture functions, further complicating your stored procedure development. Temporal Data is a topic under serious review by the major RDBMS vendors, including Microsoft, Oracle and IBM. Future versions of their products may include some of the new SQL2011 standards for temporal data manipulation, but temporal relational modeling is by far the best solution for these types of requirements to date, both in terms of performance and ease of maintenance.

[font="System"]Jay Quincy Allen, Managing Partner, Relational Models LLC[/font]

Orlando Colamatteo SSC Guru Points: 182293 More actions · Answer 14

JQAllen (3/5/2012)
Change Data Capture and Change Tracking are useful for auditing, but these features are not the same thing as 'temporal modeling'. Among other things, they have no support for valid time or temporal referential integrity. Also, you must use the change data capture functions, further complicating your stored procedure development. Temporal Data is a topic under serious review by the major RDBMS vendors, including Microsoft, Oracle and IBM. Future versions of their products may include some of the new SQL2011 standards for temporal data manipulation, but temporal relational modeling is by far the best solution for these types of requirements to date, both in terms of performance and ease of maintenance.

All fair points. Maybe I missed it, but I haven't heard anything just yet that has told me for certain we're talking about a temporal requirement. I only heard that changes need to be captured and eventually pushed to external sources, nothing about interacting with a specific system state as it existed at a previous point in time.

edit: minor clarification on point in time

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato