August 18, 2013 at 10:58 pm
Comments posted to this topic are about the item Implementing, loading and optimizing a Type 2 Slowly Changing Fact Table
August 19, 2013 at 2:55 am
Firsly I got the following message when trying to create the table SCF_Invoices
Msg 102, Level 15, State 1, Line 2
Incorrect syntax near '('."
Which I then amended the script and remove the bracket by the InvoiceID
I ten ran the script again and got the below error message:
Msg 1776, Level 16, State 0, Line 1
There are no primary or candidate keys in the referenced table 'DI_Customer' that match the referencing column list in the foreign key 'FK_SCF_Invoices_Customer'.
Msg 1750, Level 16, State 0, Line 1
Could not create constraint. See previous errors.
August 19, 2013 at 3:19 am
Hi!
I am sorry but it seems I made an error in the code inserted in the article.
I forgot to define PRIMARY KEYS for the DI_Customer and DI_Bucket tables on the CustomerID and BucketID columns respectively.
The attached sample code is correct and has been tested. Please download the "SCF_Invoices corrected.zip" file and run the "Create SQL_Invoices.sql" script. A description can be found in the README file.
Regards,
Herman
August 19, 2013 at 3:38 am
Hi,
No problem, where do I find this attached zip file?
Thanks
August 19, 2013 at 4:10 am
You can find the zip file at the bottom of the article under Resources.
August 19, 2013 at 5:34 am
The Dimensional Modeling Methodology does not know Type 2 Slowly Changing Fact Table. The Dimensional Modeling Methodology does know Slowly Changing Dimensions.
August 19, 2013 at 5:46 am
Renato10: I know, but I am not the first to use this term when applying SCD logic on a fact table. Have a look at http://brentgreenwood.blogspot.com/2011/12/slowly-changing-facts.html for example.
The solution presented works well for me and my customer. I would be happy to learn if there are better solutions.
Regards,
Herman
August 19, 2013 at 6:03 am
"ALTER TABLE SCF_Invoices ADD CONSTRAINT PK_SCF_Invoices PRIMARY KEY CLUSTERED
( ValidFrom ASC,
InvoiceIDO ASC
);
Since data is added for consecutive days, this PRIMARY KEY also will ensure new records are always appended at the end of the table."
With such index construction the statement above is not necessary true. It has dependency from InvoiceIDO and can lead to the situations where record will not be at the end of the table. In this case you may have some performance penalties.
August 19, 2013 at 6:24 am
So in summary woould you say that load performance was improved by a large reduction of records in the target fact table, or by the addition of indexes?
August 19, 2013 at 9:23 am
Folks using the Kimball Methodology would refer to this concept as an accumulating snapshot fact table. For more information on accumulating snapshot fact tables see the following articles:
http://www.kimballgroup.com/2010/12/01/design-tip-130-accumulating-snapshots-for-complex-workflows/
August 19, 2013 at 10:23 am
Nick,
The main improvement was achieved by reducing the total number of records. In addition, the indexes improved load and query speed, while not substantially increasing storage speed.
In my first attempts, using an ordinary fact table, index maintenance noticeably decreased loading speed when I tried to optimize the process.
Regards,
Herman
August 19, 2013 at 12:50 pm
Nick,
The main improvement was definitely achieved by reducing the number of records.
But also the indexes have a considerable influence. With the correct index loading speed was reduced by half.
However, every new index requires index maintenance operation, which increases processing cost with 20% or so. So the amount of indexes should be kept as low as possible.
Regards,
Herman
August 19, 2013 at 2:17 pm
Since you don't delete from your Invoice "SCF" table, doesn't the PK give you a violation when you do reloads from the same day?
August 19, 2013 at 5:27 pm
Dimensional modelling was something invented by Ralph Kimball (although I'm sure Margy Ross had a great deal of input as well). They came up with the naming standards for different types of dimensions. On to be sure, there are also different types of fact tables (snapshot, transactional, aggregate/summary, etc), but I don't believe they ever referred to a "type-2 fact table". Interesting idea; although it basically seems like a variety of snapshot fact table, with different results over time.
Ralph Kimball has a nice intro to the three basic types of dimensions here: http://www.kimballgroup.com/2008/08/21/slowly-changing-dimensions/
Margy Ross has another article on the same site, which explores additional types of changing dimensions, usually by adding snow-flakes. As an aside, she downplays the numbering and instead names the different types.
A similar article introducing facts is here: http://www.kimballgroup.com/2008/11/05/fact-tables/
You can see that "type-2" doesn't really apply well to fact tables -- because there is no "type-1", etc. Instead, all you are really doing is creating what looks to me like a combination of transaction fact and snapshot fact. I'm not saying that's a bad idea -- in fact, I think I've done similar things myself in the past. Calling it a "type-2" though... well, you'll get all of the dimensional modellers popping on to tell you that it's a type-2 dimension, not fact. 😀
Oh, I ran into that same bug in SQL Server with merge output and relational integrity, and ended up solving it in much the same way too.
August 19, 2013 at 6:02 pm
In a way, it's all semantics. Whether SCF or SCD really depends on usage, nothing more. In the stuff that I work with, there is no question that invoices are dimensions- the invoices are attributes of what you might call higher level facts (without going into a deep explanation), and so invoices are like the middle layer of a snowflake design. That doesn't mean that is their function in your case, though. It seems like you're just borrowing usefully from dimensional design principles to efficiently implement transaction history- nothing wrong with that.
Viewing 15 posts - 1 through 15 (of 23 total)
You must be logged in to reply to this topic. Login to reply