January 11, 2018 at 2:35 am
Hi All
I'm looking for some advice on getting data into our STAGING Data Warehouse Database. At the moment I am using a slowly changing dimension to take data from LIVE to STAGING
and then another SCD from STAGING to Data Warehouse. This currently takes approximately 20 minutes.
Recently I have changed this to an SSIS Transfer Database Objects table copy job which now takes 3 minutes.
Does anyone have any recommendations on best practices for copying data from a LIVE environment to a STAGING environment.
The tables are pretty small so I don't think there is any need for incremental loads as opposed to a complete table copy.
Any advice would be great.
Thanks
Mattie
January 11, 2018 at 3:35 am
Not sure if I'm misunderstanding your terminology, but staging tables should be a copy of the source from which they come (ie. no transformations).
Once you have all of your new data in your staging tables, then you import the data into your SCD tables.
You only need to use SCD tables if you intend on storing any changes to dimensional attributes.
January 11, 2018 at 5:01 am
Paulo de Jesus - Thursday, January 11, 2018 3:35 AMNot sure if I'm misunderstanding your terminology, but staging tables should be a copy of the source from which they come (ie. no transformations).
Once you have all of your new data in your staging tables, then you import the data into your SCD tables.
You only need to use SCD tables if you intend on storing any changes to dimensional attributes.
This is not correct. Unmodified source data should be held in 'extract' tables. This data is then loaded into staging via a process which may include data cleansing, transformations and calculations.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
January 11, 2018 at 5:10 am
Are you referring to your dimTablename or fctTablename (dimensional model) designed tables as staging?
I have always considered the "Extract" tables as part of the staging layer. Yes transformations can then be done on the data but when it initially comes in, it is as close to the original source as possible. From staging the data goes into your actual dimensions and fact tables.
Edit:
I have read my response again and I can see how it can be misleading. The "no transformations" bit should make reference only to what you referred to as the "Extract" tables. Other tables in the staging layer can contain transformations etc.
@Mattie:
If there is no need to keep any history of changes or you just need to have the latest set of data as per the source system then a table copy should be fine.
January 11, 2018 at 9:31 am
Paulo de Jesus - Thursday, January 11, 2018 3:35 AMNot sure if I'm misunderstanding your terminology, but staging tables should be a copy of the source from which they come (ie. no transformations).
Once you have all of your new data in your staging tables, then you import the data into your SCD tables.
You only need to use SCD tables if you intend on storing any changes to dimensional attributes.
Hi Paulo
Yes they are indeed a copy of the source tables with no transformations but I was using an SCD to "replicate" the data.
I think from the response below I will performa table copy instead of SCDs.
Thanks
Mattie
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply