Introduction
In a data warehousing/BI solution, many times we need to develop SSIS packages with a parent-child relationship. When we use parent-child relationship packages with a parent package variable configured in the child package for logging events, we get logging information in our expected location as well as some other location.
In this article,we will discuss why we get the logging information in some other location when we use a parent package variable configuration. We will also discuss one approach to get logging information in the expected location only.
Let's create an SSIS solution with 1 master package and 2 child packages. Using these packages we will insert a data into "Employee.xlsx" and "Client.xlsx" from the employee and client tables respectively.
Below is our master package.
In this package,we have used 2 execute package tasks, which will execute the Employee_Load.dtsx package and the Client_Load.dtsx package. If we look in the connection managers, we see three connections. Two connections are for the package connection and one is for capturing the log events.
Below you can see the variables we have used in our master package.
These variables are:
- DBConnectionString : Data Source=SG-IND-2691\SQL2008;Initial Catalog=TEST;Provider=SQLNCLI10.1;Integrated Security=SSPI;Auto Translate=False;
- ExcelFileDestination : C:\Projects\SSIS_Test_Project\SSIS_Test_Project\Excel Files\
- LogFolderLocation : C:\Projects\SSIS_Test_Project\SSIS_Test_Project\Log Folder\
- PackageLocation: C:\Projects\SSIS_Test_Project\SSIS_Test_Project\
All these variables are configured using XML files.
Now we will use the SSIS enabled logging feature to log all event entries when master package executes. This is how we have configured logging into master package:
If we look under the configuration column (last column) we have specified an SSIS_Log connection manger.
This SSIS_Log connection manger is configured using "LogFolderLocation" and we have appended the string "Master_Pkg_Log" with current date to the log folder at runtime. When this package will run, we will get a log file for this package under the location: C:\Projects\SSIS_Test_Project\SSIS_Test_Project\Log Folder\Master_Pkg_Log_20121018.
So far we have seen the master package. Now let's have a look on child package "Employee_Load.dtsx".
As we see, this is very simple package that is just inserting data from the Employee table into an Excel file. In this package,we have used 3 connection managers: one is for the database connection, one is for log data, and one is for the Excel destination.
Now lets have a look on the variables that we have used for this package and see how they are configured.
As we see,we have configured all variables of this package using a parent package variable. That means when this package executes, all these variables will have values from the parent package.
Now let's have a look at how we have configured logging for this package (the same way as the master package).
If we see under the configuration column (last column), we have specified the SSIS_Log connection manger. This SSIS_Log connection manger is configured using the LogFolderLocation variable and we have appended the string Employee_Pkg_Log with the current date to the log folder at runtime. When this package runs, we will get the log for this package under the location: C:\Projects\SSIS_Test_Project\SSIS_Test_Project\Log Folder\Employee_Pkg_Log_20121018.
In the same way, we will create the other child package, Client_Load.dtsx, which will insert data into the Client.xlsx file from the Client table.
All other things such as logging, variables, and configuration are same as Employee_Load.dtsx.
When we execute the master package,we get log files created under two location :
- 1st location : C:\Projects\SSIS_Test_Project\SSIS_Test_Project\Log Folder (expected)
- 2nd location : C:\Projects\SSIS_Test_Project (not expected)
You can see the results below.
1st location :
2nd Location :
Why this is happening? Why do we have log files under two different locations?
The log files under the 1st location are fine because this is where we want to have log files. That means we have two question here.
- Why do we have log files under the 2nd location (C:\Projects\SSIS_Test_Project/)
- If "Employee_Pkg_Log_20121018" and "Client_Pkg_Log_20121018" log files are under 2nd location then why we dont have "Master_Pkg_Log_20121018" log file under this 2nd location ?
The answer for these questions is the order of events fired in the SSIS package. When we use the parent package variable configuration, the events are fired in the following way.
- Logging starts
- Package is Validated
- Parent Package Varialbe Configurations are loaded
- Expression are applied
- Package Runs
In all other other cases, the configuration is the first event. As our master package does have the parent package variable configuration, the configuration event executes first in this package. Hence the SSIS_Log Connection manager gets the value for LogFolderLocation from the config file and therefore the log for the master package gets created in our dedicated log folder only i.e under 1st location : C:\Projects\SSIS_Test_Project\SSIS_Test_Project\Log Folder\ and not under 2nd Location
This explains question No.2. Now about question No.1.
When the child package Employee_Load.dtsx runs, as we have parent package variable configuration in this package, the logging event executes first and then the configuration event. Hence the SSIS_Log Connection manager does not get the value for LogFolderLocation. Below is the expression that we have used for SSIS_Log Connection manager.
@[User::LogFolderLocation] + "Employee_Pkg_Log_" + (DT_WSTR,4)YEAR(GETDATE()) + RIGHT("0"+(DT_WSTR, 2) MONTH(GETDATE()) ,2) + RIGHT("0"+(DT_WSTR, 2) DAY(GETDATE()) ,2)
In this case, the SSIS_Log Connection manager will have only the value for Employee_Pkg_Log_20121018 and therefore the log file with this name gets created under the solution, i.e. under the 2nd location.(C:\Projects\SSIS_Test_Project/)
Once the configuration event gets executed after the loging event, then it also creates the log file under the dedicated log folder, i.e under the 1st Location: C:\Projects\SSIS_Test_Project\SSIS_Test_Project\Log Folder
Hence we have log files for child pakages under two locations.
The approach to have log files under the expected location only is to use environment variables. We can create an environment variable and set its value to LogFolderLocation path, i.e. C:\Projects\SSIS_Test_Project\SSIS_Test_Project\Log Folder\. Then we can configure the LogFolderLocation variable in child packages using this environment variable. Now run the master package, we will have the log for our child packages in our expected folder only.