Implementing, loading and optimizing a Type 2 Slowly Changing Fact Table

Question

Implementing, loading and optimizing a Type 2 Slowly Changing Fact Table

Viewing 9 posts - 16 through 23 (of 23 total)

You must be logged in to reply to this topic. Login to reply

Herman van Midden SSChasing Mays Points: 643 More actions · Answer 1

Hi mb67!

No there is no PK violation because a new InvoiceID is generated for the previously deleted record. So we end up with two records, one with a negative validity period (ValidFrom > ValidTo) and one valid record, each with a different InvoiceID

Regards,

Herman

Herman van Midden SSChasing Mays Points: 643 More actions · Answer 2

Thanks to all you guys on the update on dimensional modelling.

Since I also attended Kimballs courses I am aware of the 'problem'. Maybe I should have called my solution a timestamped accumulated snapshot, but SCD's are much better known and its implementation using MS SQL is widely available.

Of course a Type 1 SCF does'nt make sense (thanks Bruce), so why define it as a Type 2 SCF? I think everybody understood the point of the story, and that was what I wanted to reach.

And since I am not the first one to use it, it even a topic on Kimballs forum http://forum.kimballgroup.com/t1588-slowly-changing-facts I decided to use this name not realizing it would start this discussion :Whistling:

So I borrowed some useful dimensional design principles to efficiently implement transaction history (thanks mb67).

I hope you like my solution, although its theoretically maybe not correct, even if Invoices are dimenions, I am mixing facts with dimensions, it solves a lot of practical problems for me.

Regards,

Herman

mb97 SSC Enthusiast Points: 114 More actions · Answer 3

Your PK is defined on InvoiceIDO, not InvoiceID.

ALTER TABLE SCF_Invoices ADD CONSTRAINT PK_SCF_Invoices PRIMARY KEY CLUSTERED

( ValidFrom ASC,

InvoiceIDO ASC

)

sneumersky SSCertifiable Points: 7667 More actions · Answer 4

Slowly changing fact is also a term used by David Marco (who I believe is/was affiliated with the Inmon side of things) in his book Building The Metadata Repository (Wiley, 2000); so it has definitely been around for quite some time. It's all a "I say TO-MAY-TOE, you say TO-MAH-TOE" thing 🙂

Bruce W Cassidy SSChampion Points: 12668 More actions · Answer 5

Herman van Midden (8/20/2013)
[...]I borrowed some useful dimensional design principles to efficiently implement transaction history.
I hope you like my solution, although its theoretically maybe not correct, even if Invoices are dimenions, I am mixing facts with dimensions, it solves a lot of practical problems for me.

That's what it's all about. I just try to avoid the "type-2" naming thing, even for dimensions.

Herman van Midden SSChasing Mays Points: 643 More actions · Answer 6

Hi mb97!

Yes, you are right. I mixed up some things and simply assumed I did it correct. So I had to do my homework again. The correct indexes are as follows:

The primary key:

ALTER TABLE SCF_Invoices ADD CONSTRAINT PK_SCF_Invoices PRIMARY KEY CLUSTERED

( ValidFrom ASC,

InvoiceID ASC

)

And the index to aid the MERGE statement

CREATE NONCLUSTERED INDEX [IX_IsCurrent_InvoiceIDO_InclData] ON [dbo].[SCF_Invoices]

( [InvoiceIDO] ASC

)

INCLUDE ([IsCurrent],[ValidFrom],[ValidTo],[InvoiceID],[CustomerID],[DueDate],[IsOpen],[BucketID],[OrigAmnt],[RemAmnt])

WHERE ([IsCurrent]=(1))

Thanks for checking my code! I will try to correct the article and sample code.

Regards,

Herman

mb97 SSC Enthusiast Points: 114 More actions · Answer 7

Indexing SCDs drives me nuts, that's why I was looking at it so carefully- I still haven't seen a good general solution for it, especially if you need to access non-current versions by comparing to a fact effective date. I sometimes think it might be better to index on the SCD end date instead of the start date, but that's not clean, either.

-Mike

sneumersky SSCertifiable Points: 7667 More actions · Answer 8

The Microsoft Data Warehouse Toolkit has a good general indexing strategy for type 2 SCDs. It provides an excellent starting point.

Herman van Midden SSChasing Mays Points: 643 More actions · Answer 9

sneumersky: I guess I should try to get a copy of the Microsoft Data Warehouse Toolkit. Thanks for the tip.

Can you give me some hints on the indexing strategy they suggest?

-Herman