November 16, 2018 at 9:34 am
Hi,
There mainly 2 patterns for incrementally loading data for facts.
1. Load new records to staging
2. Truncate any of these rows from Fact Sales
3. Insert records from staging
This is good as it saves having to update records that have changed.
The other pattern :
1. Load to staging
2. Do a MERGE into fact to upload New/Changed records
Of course it relies on being able to get new/updated rows from the source into staging.
I've mostly got experience of the latter. My question is when would pattern 1 be suitable over the merge?
November 16, 2018 at 10:02 am
The first pattern (truncate & load) is suitable in cases where you either have:
November 16, 2018 at 10:33 am
leehbi - Friday, November 16, 2018 9:34 AMHi,
There mainly 2 patterns for incrementally loading data for facts.
1. Load new records to staging
2. Truncate any of these rows from Fact Sales
3. Insert records from stagingThis is good as it saves having to update records that have changed.
The other pattern :
1. Load to staging
2. Do a MERGE into fact to upload New/Changed records
Of course it relies on being able to get new/updated rows from the source into staging.
I've mostly got experience of the latter. My question is when would pattern 1 be suitable over the merge?
A truncate and full load is useful if you need to re-initialise the data warehouse. Maybe some changes have been made that the merge process don't deal with or some columns have been added that won't be merged. On the data-warehouse I maintain we run a full-load (truncate most of the tables) on a Sunday when there is not much usage. All the other dates an incremental load is run.
November 19, 2018 at 2:04 am
Thanks for comments - partitioning is the way to go with snapshots.
November 19, 2018 at 7:21 am
leehbi - Monday, November 19, 2018 2:04 AMThanks for comments - partitioning is the way to go with snapshots.
Careful now... While partitioning is a great way to move temporal sections of data in and out of a table, it won't determine if rows have been updated. It will also (in many cases) actually slow down queries compared to a monolithic structure.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply