December 1, 2013 at 11:35 am
Since I'm new to this and data warehousing, I'm not sure what is the best way to do what I'm wanting. It just seems best and more impressive to move my different packages into one package.
What I'm doing is pulling information from various sources into a Data Mart. My Data Mart has a set of 6 dimensional tables and one fact table. Each data flow task is in its own package (currently).
So the way I'm thinking of organizing this is this:
1) In Control Flow, get two sequence containers. The first container will contain all the dimension tables' ETL processes or Data flows.
2) The second container would have the fact table.
I don't know if this is a good design or if I will run into problems, but logically it seems I should make sure my dimensions are loaded first and then my fact table. So why not put them all in the same package?
I know that the info doesn't have to be placed in the dimensions first, but i don't really know how to do that yet. I'm just familiar.
Thanks!
December 1, 2013 at 5:59 pm
I don't use SSIS but, speaking from an old dude's perspective, I'd likely make the loading of each table a separate package for reusability and then have a control package (dunno if it's possible... like I said, I don't use SSIS) that defined the order of execution. That could also allow parallel execution if you needed it and the system could tolerate it. Makes for a hellova performance boost when it works right.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 1, 2013 at 6:21 pm
That really sounds like a good point. I'd still like to get some feedback from someone who uses SSIS.
But like you said from a re-usability standpoint, these dimension tables will be re-used in other data marts. It is probably not wise to presume we will load each dimension before loading all the different fact tables that use them.
December 1, 2013 at 6:53 pm
Jeff Moden (12/1/2013)
I don't use SSIS but, speaking from an old dude's perspective, I'd likely make the loading of each table a separate package for reusability and then have a control package (dunno if it's possible... like I said, I don't use SSIS) that defined the order of execution. That could also allow parallel execution if you needed it and the system could tolerate it. Makes for a hellova performance boost when it works right.
Yes, master and child packages are possible. There is an execute package task.
From the sound of things that would be my approach here, especially since the child packages are already built.
December 1, 2013 at 7:05 pm
Nevyn (12/1/2013)
Yes, master and child packages are possible. There is an execute package task.
Thanks for the confirmation on that, Nevyn.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 2, 2013 at 1:37 am
Jeff Moden (12/1/2013)
I don't use SSIS but, speaking from an old dude's perspective, I'd likely make the loading of each table a separate package for reusability and then have a control package (dunno if it's possible... like I said, I don't use SSIS) that defined the order of execution. That could also allow parallel execution if you needed it and the system could tolerate it. Makes for a hellova performance boost when it works right.
Confirmed that this is completely feasible and sounds like a good idea to me.
Bear in mind that you'll have to tune the level of parallelism you introduce & keep an eye on locks to get the best solution.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
December 2, 2013 at 8:31 am
Phil Parkin (12/2/2013)
Jeff Moden (12/1/2013)
I don't use SSIS but, speaking from an old dude's perspective, I'd likely make the loading of each table a separate package for reusability and then have a control package (dunno if it's possible... like I said, I don't use SSIS) that defined the order of execution. That could also allow parallel execution if you needed it and the system could tolerate it. Makes for a hellova performance boost when it works right.Confirmed that this is completely feasible and sounds like a good idea to me.
Bear in mind that you'll have to tune the level of parallelism you introduce & keep an eye on locks to get the best solution.
When I've done similar in T-SQL, I generally load to a "staging" table that looks and smells exactly the same as the original. When it's done, I simply repoint a synonym and there's virtually no contention or locking issues. Of course, I've also had the luxury of having enough disk space to do such a thing (I don't believe that disk space is as cheap as some would think but I do press for it for such things).
--Jeff Moden
Change is inevitable... Change for the better is not.
December 4, 2013 at 8:18 am
Thanks guys! I really appreciate this feedback! I'll do some more studying on this. Greatly appreciated.
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply