August 20, 2013 at 3:42 am
Hi mb67!
No there is no PK violation because a new InvoiceID is generated for the previously deleted record. So we end up with two records, one with a negative validity period (ValidFrom > ValidTo) and one valid record, each with a different InvoiceID
Regards,
Herman
August 20, 2013 at 4:25 am
Thanks to all you guys on the update on dimensional modelling.
Since I also attended Kimballs courses I am aware of the 'problem'. Maybe I should have called my solution a timestamped accumulated snapshot, but SCD's are much better known and its implementation using MS SQL is widely available.
Of course a Type 1 SCF does'nt make sense (thanks Bruce), so why define it as a Type 2 SCF? I think everybody understood the point of the story, and that was what I wanted to reach.
And since I am not the first one to use it, it even a topic on Kimballs forum http://forum.kimballgroup.com/t1588-slowly-changing-facts I decided to use this name not realizing it would start this discussion :Whistling:
So I borrowed some useful dimensional design principles to efficiently implement transaction history (thanks mb67).
I hope you like my solution, although its theoretically maybe not correct, even if Invoices are dimenions, I am mixing facts with dimensions, it solves a lot of practical problems for me.
Regards,
Herman
August 20, 2013 at 7:18 am
Your PK is defined on InvoiceIDO, not InvoiceID.
ALTER TABLE SCF_Invoices ADD CONSTRAINT PK_SCF_Invoices PRIMARY KEY CLUSTERED
( ValidFrom ASC,
InvoiceIDO ASC
)
August 20, 2013 at 8:05 am
Slowly changing fact is also a term used by David Marco (who I believe is/was affiliated with the Inmon side of things) in his book Building The Metadata Repository (Wiley, 2000); so it has definitely been around for quite some time. It's all a "I say TO-MAY-TOE, you say TO-MAH-TOE" thing 🙂
August 20, 2013 at 1:25 pm
Herman van Midden (8/20/2013)
[...]I borrowed some useful dimensional design principles to efficiently implement transaction history.I hope you like my solution, although its theoretically maybe not correct, even if Invoices are dimenions, I am mixing facts with dimensions, it solves a lot of practical problems for me.
That's what it's all about. I just try to avoid the "type-2" naming thing, even for dimensions.
August 21, 2013 at 2:27 am
Hi mb97!
Yes, you are right. I mixed up some things and simply assumed I did it correct. So I had to do my homework again. The correct indexes are as follows:
The primary key:
ALTER TABLE SCF_Invoices ADD CONSTRAINT PK_SCF_Invoices PRIMARY KEY CLUSTERED
( ValidFrom ASC,
InvoiceID ASC
)
And the index to aid the MERGE statement
CREATE NONCLUSTERED INDEX [IX_IsCurrent_InvoiceIDO_InclData] ON [dbo].[SCF_Invoices]
( [InvoiceIDO] ASC
)
INCLUDE ([IsCurrent],[ValidFrom],[ValidTo],[InvoiceID],[CustomerID],[DueDate],[IsOpen],[BucketID],[OrigAmnt],[RemAmnt])
WHERE ([IsCurrent]=(1))
Thanks for checking my code! I will try to correct the article and sample code.
Regards,
Herman
August 21, 2013 at 7:24 am
Indexing SCDs drives me nuts, that's why I was looking at it so carefully- I still haven't seen a good general solution for it, especially if you need to access non-current versions by comparing to a fact effective date. I sometimes think it might be better to index on the SCD end date instead of the start date, but that's not clean, either.
-Mike
August 21, 2013 at 3:32 pm
The Microsoft Data Warehouse Toolkit has a good general indexing strategy for type 2 SCDs. It provides an excellent starting point.
August 22, 2013 at 1:54 am
sneumersky: I guess I should try to get a copy of the Microsoft Data Warehouse Toolkit. Thanks for the tip.
Can you give me some hints on the indexing strategy they suggest?
-Herman
Viewing 9 posts - 16 through 23 (of 23 total)
You must be logged in to reply to this topic. Login to reply