March 25, 2009 at 11:46 am
Hi folks,
Some context : We have an old system (Cobol), that we are rewriting to a .NET Application. During this revamp, we will touch the SalesSummary/InvoiceDetails tables, and we need to decide the format to use.
We already have the tables built in the database (for reporting), but they are in a denormalized manner, and I would like to change this.
For a quick example, let me illustrate the way InvoiceDetails is built:
Create table InvoiceDetails (
InvoiceDetailID int identity(1,1),
MfrCode varchar(5) not null,
CustomerNum varchar(5) not null,
Amount money)
(This is only a sample to understand my questioning)
As a young DBA who wants to "Change the world" as the Cobol developpers say, I was going to rewrite the tables to get them in a normalized fashion. Here's what I was planning to do:
Create table InvoiceDetails (
InvoiceDetailID int identity(1,1),
MfrID int not null,
CustomerID int not null,
Amount money)
(You get the point, I just want the ID to refer to the PK of the tables, and add foreign keys to the tables)
This, in a relational database, looks great, since we can change the name of the Manufacturer, and it will still be reflected in the InvoiceDetails Table, but I'm not teaching you anything there.
So, after a few hours of meeting with the (excuse the appelation) "Old" Developpers, :cool:, we came to the point when they said we needed to keep 7 years of history for the invoice, which is fine. But we need to keep the data INTACT. That means, if the MfrCode changes, and we need to reprint the invoice, we need to have the MfrCode that was entered 7 years back. So, it cannot be linked by the MfrID (Well, I do not know how to, if it's possible).
So, they proposed to change the table to fit this need :
Create table InvoiceDetails (
InvoiceDetailID int identity(1,1),
MfrID int not null,
MfrCode varchar(5) not null, --Copied from the Manufacturer table when the invoice is created
CustomerID int not null,
CustomerNum varchar(5) not null, --Copied from the Customer table when the invoice is created
Amount money)
I do not really like this format, since I think it can lead to some major issues of data "Corruption", or data querying. I think we will come to a time when we will not know where to query the "MfrCode" from, and maybe just get it from the InvoiceDetails table, when the "Most accurate" MfrCode is in the Manufacturer table.
I'd like to have some input as what to do, this is only a small representation, as we have to build the database for the whole system, and I'm asking myself what to do when the data needs to be saved "AS IS", for invoice reprinting.
If some Architect Gurus could give me a hint on that, I'd really appreciate.
Thanks in advance,
Cheers,
J-F
March 26, 2009 at 2:54 am
You could adopt your "old" developers approach but it's not what I'd do. To me, keeping the
history of your MfrCode in your Invoice table is wrong as you have identified. I'd create a
table to track the changes to MfrCode which would look something like:
MfrIdintnot null,
MfrCodevarchar(5)not null,
ValidFromdatetimenot null,
ValidTodatetimenot null
You would use VaidFrom and ValidTo to track the historical changes. Presumably your InvoiceDetails
table has a datetime column which could be used to obtain the MfrCode in use at the date of the Invoice
by joining to the new table on MfrId, MfrCode and where the InvoiceDate falls between ValidFrom and ValidTo.
Rows in the table might look like:
1 ABD 20090101 20090220
1 ACD 20090220 20090326
1 XYZ 20090326 99991231
I always use '99991231' in ValidTo to identify the current row and I'm assuming there is no time portion
in the Invoice date, i.e. an InvoiceDate, if stored in a datetime column, would look
like '20090326 00:00:00.000'. When a change is made, the current row, identified by
ValidTo = '99991231', is updated to the current date and a new row is inserted with ValidFrom
set to the current date and ValidTo set to '99991231'.
You would then query it along the lines of
SELECT......
FROM InvoiceDetailsI
INNER JOIN
MfrCodeM
ONM.MfrId= I.MfrId
WHEREM.ValidFrom>= I.InvoiceDate
ANDM.ValidTo< I.InvoiceDate
The only decision you have to make is how to stamp the records with the two datetime values
when a change is made. In other words, if I change a code today does it remain valid for today or
does it change as of the start of today. If it's the latter, which is probably more common,
then the above would apply. If it's the former then the rows would look something like
1 ABD 20090101 20090220
1 ACD 20090221 20090326
1 XYZ 20090327 99991231
and the WHERE clause would change to
SELECT......
FROMInvoiceDetailsI
INNER JOIN
MfrCodeHistoryM
ONM.MfrId= I.MfrId
WHEREM.ValidFrom>= I.InvoiceDate
ANDM.ValidTo<= I.InvoiceDate
You now haven't tied the history of your MfrCode to the history retention period of your
Invoices which is much more desirable.
Hope that gives you some ideas.
Mike
March 26, 2009 at 7:15 am
Thanks Mike, these are some really interesting thoughts for some new development. The thing is, we still have to keep the "old" data, that is denormalized in the tables, for 7 years of history purpose. So I will not be able to implement a table with a dateFrom and dateTo for now. It might be interesting to keep that in mind for new development that might need that kind of operation.
I'm starting to be a little confused there though, so many ways to do things, I had 3 "possible" solutions to get it working, and now comes a 4th one. I wish you would have glued to one of the concepts I had mentioned, even though I love to see other people's opinion on this!
Thanks for your input Mike,
Anyone else has another way to do things?
Cheers,
J-F
March 27, 2009 at 2:53 am
J-F,
You're welcome. I've obviously not entirely understood your problem. You should do as you say with InvoiceDetails table and put the MfrId and CustomerId in there. Then, as I've suggested, you can create a table to track changes in MfrCode and another table to track changes to CustomerNum (didn't mention that first time, sorry) which is an interesting attribute to have to track changes on. The two tables will have their respective Ids in them so that they can be linked to the relevant tables.
You can then migrate your existing data into the new tables and there you have it. Or at least that's as I see it. It all hinges on having an InvoiceDate so that you can migrate the data but I assume there must be one somewhere otherwise how do you know when an Invoice is 7 years old?
I have actually seen designs where someone followed your idea of putting the Id and the MfrCode and CustomerNum on the InvoiceDetails table and he also had implemened my history table approach as well. Seems a bit strange to me but he did it because he didn't want to join tables as he was concerned about performance. Thing was his tables weren't very big but it worked and he was happy.
Anway, apologies if I'm being stupid and not understanding the problem. Wouldn't be the first time!
Mike
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply