Checkpoints in SSIS are a great tool and they definitely have their place. The downside to them is that they are not so good when you start talking about parallel processes or containers. What follows is a little bit of information and words of caution toward some situations where you may want to use this handy feature. Let’s begin with parallel processes.
A while back Devin Knight wrote up a blog on this subject and I just wanted to pass along some new information that I have found and expand upon what he said. Here is a link to that blog, definitely worth looking at.
Let’s take the situation pictured below where we have two parallel set of processes. The second task on the left will fail and we would think that one of a couple situations would happen. These are the things that I personally would think could happen.
- The entire package would stop running
- The entire right set of tasks would finish
- The currently running task on the right would finish and when restarted the next task would run
This is what the package looks like when it is running and our failure occurs. Notice that the right task still shows as running.
This still leaves option 1 open as if the whole package fails we would expect to see a snapshot of what was going on when the failure occurred. But then a few seconds later our data flow finishes and we see the package as pictured below.
This has therefore eliminated option number 1. The package continued to run on the right side. Notice also though that the second task on the right did NOT run, so option 2 is also not valid. So we are left with option number 3. Logically I would expect that when I restart this package that my fail task would run and now Succeed Again should run. But this blog would be kinda useless if that were the case now wouldn’t it.
Notice that our data flow is running a second time now. The fail task has been fixed and is running, as it should. But we will get duplicate data in our destination table now. This is a HUGE issue if you don’t plan for something like this. So just know that if you have tasks running in parallel if a task fails while another task is running AT THE SAME TIME OF THE FAILURE, the non-failing task will run again.
OK, let’s discuss sequence containers for a moment. There is an interesting issue that happens here, and we will do this with a couple less screen shots even thought I know you would love to see my glorious skills with the snipping tool in Windows. Let’s take the example that you have a set of tasks in a sequence container. Maybe there is a set of tasks that you want to rerun if there is a failure. For instance, there is a sequence container after a send mail task and a data flow. Inside this container you have an execute SQL task that will truncate a table and then load a table followed by a task to do some logging. So, if the data flow fails, for something that almost NEVER happens: bad data, when the package is restarted, using checkpoints, we want to truncate the table before reloading it. I would think, “Hey, let’s just toss all three of these tasks into a sequence container and just put the checkpoint on the container.” Well Brad, that is a lovely idea!
Enter the situation seen below. We run some tasks and there is a failure. Keep in mind the checkpoint is on the sequence container so that we can restart and run ALL three tasks over again to ensure we don’t get duplicate data. Here is what the package would look like at the failure:
Alrighty, it failed! That was lucky, it was exactly what I needed to see for this blog. It’s almost like I planned that. Just tossed this line in to see if you were still paying attention. Let’s restart the package and rerun the sequence container right…
BOOM! That did not work as I thought it would. So what’s the lesson here? Don’t use a checkpoint on a container. As you can imagine this doesn’t work real well on For Loops or For Each Loops. Consequently, if you put the checkpoint on the execute SQL task inside the sequence container (not a for loop or for each loop), the checkpoint will work correctly and restart at the SQL task. This is because the sequence container is simply a tool organization in this case.
To recap:
- Don’t expect checkpoints to work properly in parallel.
- If a parallel task is RUNNING when another task fails, the package will rerun the non-failing task in addition to the failed task.
- Don’t use checkpoints on containers if you want them to work correctly
- If you put a checkpoint on a sequence container the failed task will be skipped the next time you run the package.
- I think that covers everything