March 8, 2022 at 12:40 pm
Hi,
We have a requirement to disable Data Load Tasks at runtime, this is a large package and if one task fails, we want to be able to re-run by programmatically disabling all successful load tasks at runtime. This is all controlled by a config table, Hence following a successful load we will update the config table.
I've got a script to do this successfully see below.
public void Main()
{
// TODO: Add your code here
try
{
Microsoft.SqlServer.Dts.Runtime.Application app;
string PackageLocation = @"C:\Package1.dtsx" ;
Package Package = new Package();
app = new Microsoft.SqlServer.Dts.Runtime.Application() ;
Package.EnableConfigurations = true;
Package = app.LoadPackage(PackageLocation,null);
DataTable dt = new DataTable();
OleDbDataAdapter adapter = new OleDbDataAdapter();
adapter.Fill(dt, Dts.Variables["User::taskStatus"].Value);
Executable e = (Executable)Package;
foreach (DataRow row in dt.Rows)
{
string tasks = row[0].ToString();
RecurseExecutables(e,tasks);
}
}
catch(Exception ex)
{
MessageBox.Show(ex.ToString());
}
Dts.TaskResult = (int)ScriptResults.Success;
}
public static void RecurseExecutables(Executable e, string taskName)
{
//Console.WriteLine("Component : " + e.ToString());
//Console.WriteLine(e.GetType());
if (e is TaskHost)
{
TaskHost p = (TaskHost)e;
if (p.Name == taskName)
{
p.Properties["Disable"].SetValue(p, true);
//test task state.
MessageBox.Show(p.Name + " Disabled:= " +p.Properties["Disable"].GetValue(p).ToString());
}
}
if (e is Package)
{
Package p = (Package)e;
// MessageBox.Show(p.Name);
foreach (Executable pe in p.Executables)
{
RecurseExecutables(pe,taskName);
}
}
if (e is Sequence)
{
Sequence s = (Sequence)e;
// MessageBox.Show(s.Name);
foreach (Executable pe in s.Executables)
{
RecurseExecutables(pe,taskName);
}
}
if (e is ForLoop)
{
ForLoop l = (ForLoop)e;
//MessageBox.Show(l.Name);
foreach(Executable lp in l.Executables)
{
RecurseExecutables(lp,taskName);
}
}
}
My problem is that I have to reference the package as string from the file system "string PackageLocation = @"C:\Package1.dtsx" ";
I want to be able to reference this in the Integration Services Catalogue, Is this even possible.
Any Help , information would be much appreciated.
Regards Simon
March 8, 2022 at 1:07 pm
As the package is held in encrypted form in SSISDB, I expect this to be quite difficult to achieve.
I'm not quite understanding the sequence of events.
Does the code you have posted actually modify the design of the packages (in terms of what is enabled) before running them?
Is this code located in some sort of 'master' package? When is this package executed?
How is the Config table updated? I don't see references to it in what you have posted.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
March 8, 2022 at 1:31 pm
Hi Phil,
So under normal circumstances this wouldn't get run at all,
the package consists of 64 data loads Tasks (3 party data).
Following a successful load of each of these tasks, we do and update to a config table with the latest load ID, the Load ID of the current load. If there are no errors great, every table in config would have the latest load ID.
If there is an error, half way through, we don't want to have to clear down all the tables that were successfully loaded, and re-load the whole thing from scratch.
So prior to this load we have a SQLtask queries config for anything where the load ID =the Current Load ID (Successfully Loaded),
these TaskNames get loaded in to a resultSet Variable and passed into the script task.
we then set all the tasks in this result set to disabled prior to doing the load.
We've gone down this root as its loaded once a month and its extremely business critical data, If the load is an hour or couple of hours late, there could be repercussions.
So to answer your question, this is a script task in the current package, hence it modifies itself.
March 8, 2022 at 2:43 pm
My opinion, I wouldn't be modifying the SSIS package to enable and disable portions of it as things complete (success or failure). I would have multiple SSIS packages to handle each unit of work. That way if a package fails, you can just re-run that individual package. I would then set up SQL jobs to handle running the packages and have one "master" job that is used to fire off the other jobs. When any package run job completes, it writes it to a table to indicate the success or failure. The master job would read from that table to determine which ones were successful and which ones failed and re-run if appropriate.
You can still move data from multiple tables if it makes sense in the one package, but I HATE having unrelated operations happening in a single package.
There are a few advantages to this approach, but to name a few:
One - you don't need a fancy application or scripting parsing through your SSIS package to determine what to run; just some simple TSQL.
Two - it simplifies your SSIS packages. GIANT SSIS packages can be hard to maintain and support and can be difficult to troubleshoot and can be slow to test and verify they are working. In the future if you add a small table to the process that should take 10 seconds to complete, your testing cycle is going to be hours still as you would need to run the whole package. Having smaller, independant SSIS packages are easier to test, debug, and do data verification on.
Three - future proofing and documentation is easier. If you have a giant SSIS package and your app vendor decides to change 1 table into 2 or 2 tables into 1 (for example) or to even retire a table as it was used for a feature they removed, you have to update a giant ssis package and HOPE you caught everywhere it was used. You check the SSIS documentation you did (you do document your SSIS packages, right?) and find that table XYZ is used in that package, but trying to narrow down all of the places where it is used can be challenging. If you have small SSIS packages, they are easier to modify and have a lot less risk.
Four - Support. I know I mentioned support earlier, but if you hired me to help support your ETL and I see 100's or 1000's of objects in one SSIS package, I am going to take days, if not weeks, trying to decode the whole thing to determine all of the independent objects and break down what is happening in there. If you have instead 100 small SSIS packages, it takes minutes, or maybe an hour for a slightly more complex but still small one, to decode and debug what is going on if it fails. Plus, when a job fails and it is running 1 GIANT SSIS package, you either need to put in complex logic (like you did) to handle running only parts of the code OR you need to modify it to run a small chunk. With smaller packages you can have them run asynchronously if your data supports that. With one giant one, you can have them run asynchronously, but it can be a pain to maintain if the package is failing due to the asynchronous packages (deadlock for example). If a deadlock occurs and it is one giant package, you then need to modify the entire package to correct that problem. If a deadlock occurs with multiple small ones, you just need to re-order them.
Fifth - memory management. Smaller packages are a lot easier to control how much memory you expect they will need and as such you can control better the order things run to ensure you don't overload the server. Running one huge package, if you need to tune it for memory management, you need to modify the entire package. If you have smaller packages, you can simply change the order they are being run OR have some run and then wait for them to complete before starting the next without needing to re-deploy the SSIS package and thus faster code review and fewer (possibly no) source control steps required! I would MUCH rather review a TSQL script for changing the order some jobs run in than try to read through a complex SSIS package.
Lastly - related to the above but code review. If the SSIS package is small, I can probably sift through the XML changes in git to review what was changed and do a pretty good guess as to if the change description and the changes match up. If it is a complex SSIS package, I may need to load that up in visual studio to review what they claim they changed and verify it matches.
The TL;DR version is I HATE giant SSIS packages with a passion and would much rather have a lot of small ones that are easier to support, debug, test, retire, and re-run if needed.
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
March 8, 2022 at 3:09 pm
I second everything Brian Gale has posted here. Excellent advice.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
March 8, 2022 at 4:26 pm
SSIS comes with something called CHECKPOINT. This allows a process to be restarted from the point of failure. I think you should be using this.
March 9, 2022 at 8:41 am
Jonathan,
Thanks for that, however these tasks are run in a for loop container,
so unfortunately it would work in this case.
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply