September 22, 2016 at 9:47 am
Hi all,
I have a quick question about data flow process best practice as follows:
I am pulling data from various data source using SQL code (SP & SSIS PKG) to populate a DW
(Dimension and Fact tables).
The way it is done is as follows:
I created a schema called "source" to hold incoming data from various data sources as is, another schema called "Stage" to hold the massaged data (after applying business rules) then a third schema called "DW" to hold dimension and fact tables in a final format for reporting.
So having 3 schemas in one database? is this the right thing to do or should I have a 3 separate database completely?
Thanks for your advise.
September 22, 2016 at 10:01 am
I would, at least, put the DW schema on its own database. That way you can backup just that data. I'm not sure how long do you keep the data on the other schemas, but I'm certain that you don't need to retain all history.
September 22, 2016 at 10:13 am
oneteabag (9/22/2016)
Hi all,I have a quick question about data flow process best practice as follows:
I am pulling data from various data source using SQL code (SP & SSIS PKG) to populate a DW
(Dimension and Fact tables).
The way it is done is as follows:
I created a schema called "source" to hold incoming data from various data sources as is, another schema called "Stage" to hold the massaged data (after applying business rules) then a third schema called "DW" to hold dimension and fact tables in a final format for reporting.
So having 3 schemas in one database? is this the right thing to do or should I have a 3 separate database completely?
Thanks for your advise.
If it were me, I'd create three databases. Then, if you should need to scale-up the solution at some stage in the future, you have an easy path (putting the DBs on separate instances).
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
September 22, 2016 at 10:50 am
So if I elect to create 3 different databases. The first DB which is holding source data, should I create different schemas inside this database referencing the different data sources?
September 22, 2016 at 10:55 am
oneteabag (9/22/2016)
So if I elect to create 3 different databases. The first DB which is holding source data, should I create different schemas inside this database referencing the different data sources?
I'm not sure that there is a best practice for this, but I would do that, yes.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
September 22, 2016 at 12:24 pm
So having 3 schemas in one database? is this the right thing to do or should I have a 3 separate database completely?
It depends.
I have seen this done both ways - landing, staging and DW tables all in one DB in different schemas and in different databases. There are pros/cons to each with respect to how easy it is to back stuff up, assign permissions, lock stuff down, performance, etc. I personally like having all this stuff in one DB with different schemas - maintaining 1 DB is easier than three: less files to maintain, less permissions to assign, easier to backup, etc. That's my personal preference.
-- Itzik Ben-Gan 2001
September 22, 2016 at 2:11 pm
I am more into having 3 different databases, I think it is a lot cleaner and easier to organize and maintain. I have 300+ table in one DB right now.
September 22, 2016 at 4:36 pm
Having 3 different databases has other advantages than those already mentioned. The BIG ones are that your "IMPORT" and "STAGING" databases likely don't need to be backed up because all of the data there can easily be recreated from import files and the data is, at best, transient in nature to begin with. That also means that both databases can be set to the SIMPLE recovery model and that means that you can also take advantage of high performance/low resource usage Minimal Logging methods especially when Trace Flag 610 is enabled.
Please see the following white paper for more on Minimal Logging and the use of Trace Flag 610. The rub here is that I have no clue if any of that works in SSIS because I normally do all of this type of work in T-SQL.
https://technet.microsoft.com/en-us/library/dd425070(v=sql.100).aspx
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply