I’m kicking off a new series of blog posts discussing the topic of parent-child architectures in SQL Server Integration Services. The links to the posts in this series are below.
I still remember the first SSIS package I ever deployed to a production environment. It was the mid-2000s, and I was working on a large data migration project which would take my then-employer, an acute care hospital, from an old UNIX-based system into a modern SQL Server-based OLTP back end. The entire solution, which pushed around a few hundred million rows of data, was completely contained in a single SSIS package. And this thing was HUGE. When I say huge, I mean that the package metadata alone was 5mb in size. I had a bunch of duplicate code in there, and when I opened or modified the package, it took sometimes a minute or more to go through the validation for the dozens of different tasks and data flows. In hindsight, I can admit that it was not a well-designed architecture.
Fast forward about a decade. Having learned some lessons – the hard way – about ETL architecture, I’ve relied on a completely different way of thinking. Rather than relying on a few, do-everything SSIS packages, I prefer to break out tasks into smaller units of work. In using more packages that each do just one thing, I’ve discovered that this architecture is:
- Easier to understand
- Simpler to debug
- Much easier to distribute tasks to multiple developers
- In some cases, better performing
As part of my role as an independent consultant, I also do quite a bit of training, and in those training sessions the topic of parent-child ETL architecture comes up quite often. How many packages should I have? Should we have lots of small SSIS packages, or fewer, larger packages? This is also a topic on which I find a lot of questions on SQL Server discussion forums as well.
To share my experience on this topic, I’m starting a new series of post discussing parent-child architectures in SSIS. As part of this series, I will cover:
- Overview of parent-child architecture in SSIS
- Parent-child architecture in SSIS 2005 and 2008
- Parent-child architecture in SSIS 2012
- Passing configuration values from parent to child package
- Passing values from child packages back to the parent
- Error handling and logging in parent-child structures
- Parent-child architectures in an ETL framework
I’m looking forward to writing this series over the next few months. As always I look forward to your feedback.