Problem
When we need to load data from Excel sheets to a SQL Server table, we typically use a data flow task. This task doesn’t have the flexibility that we often need and requires a strict orchestration of components to be able to work consistently. Transformations, derived columns, etc. are usually required. Mappings, sheet name, columns, and data types need to be static with little room for changes.
Solution
A simple C# script with a few read-only variables simplifies the load process and it makes it very easy to load a few columns or all columns, even a range of cells from any Excel file to a SQL Server table.
Prerequisite: You already created an SSIS package, and your Excel file has a header for columns (each column has a name).
We'll walk through the scenario and script below.
Step 1: Find the Excel file that needs to be loaded. Let’s say you have an Excel file named “D:\TestFiles\TestFile1.xlsx” with a sheet named “Client Extract”, here's the content of your file:
We want to load the data from the above sheet into a SQL Server table.
Step 2: Create a SQL destination table to store the data. We need to create a table to store the data from the Excel sheet. We are going to load data for the following columns only: [Processed Date], [ID], [Registered Name], [Address]. The column, [FileLoaded], is going to be used to store the actual filename. Note that we are skipping some spreadsheet columns. Let's name our SQL table "[dbo].[TI_ClientUpdates]". Here is the script to create the table:
CREATE TABLE [dbo].[TI_ClientUpdates]( [TI_ClientUpdates_ID] [int] IDENTITY(1,1) NOT NULL, [Processed Date] [varchar](max) NULL, [ID] [varchar](max) NULL, [Registered Name] [varchar](max) NULL, [Address] [varchar](max) NULL, [FileLoaded] [varchar](300) NULL CONSTRAINT [PK_ClientUpdates] PRIMARY KEY CLUSTERED ( [TI_ClientUpdates_ID] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY] ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY] GO
Step 3: Add this task to your SSIS package:
The SCR-Load data from Excel sheet task is going to do all the work for you to transfer data from the Excel sheet to the table from step 2, [dbo].[TI_ClientUpdates].
Step 4: Create these variables in the SSIS package and map them to package parameters so their values can be changed without having to redeploy the package. You can also reference the package parameters directly (in that case you won’t need any variable). Each variable, type, value, expression, and description is given below.
Step 5: Add the variables to the task, SCR-Load data from Excel sheet. These are listed in the image below.
Step 6: Add this code to your script:
#region Namespaces using System; using System.Data; using Microsoft.SqlServer.Dts.Runtime; using System.Windows.Forms; using System.IO; using System.Data.OleDb; using System.Data.SqlClient; #endregion namespace ST_9421de9351b24aebb22fbbc255140eba { /// <summary> /// ScriptMain is the entry point class of the script. Do not change the name, attributes, /// or parent of this class. /// </summary> [Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute] public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase { #region Help: Using Integration Services variables and parameters in a script /* To use a variable in this script, first ensure that the variable has been added to * either the list contained in the ReadOnlyVariables property or the list contained in * the ReadWriteVariables property of this script task, according to whether or not your * code needs to write to the variable. To add the variable, save this script, close this instance of * Visual Studio, and update the ReadOnlyVariables and * ReadWriteVariables properties in the Script Transformation Editor window. * To use a parameter in this script, follow the same steps. Parameters are always read-only. * * Example of reading from a variable: * DateTime startTime = (DateTime) Dts.Variables["System::StartTime"].Value; * * Example of writing to a variable: * Dts.Variables["User::myStringVariable"].Value = "new value"; * * Example of reading from a package parameter: * int batchId = (int) Dts.Variables["$Package::batchId"].Value; * * Example of reading from a project parameter: * int batchId = (int) Dts.Variables["$Project::batchId"].Value; * * Example of reading from a sensitive project parameter: * int batchId = (int) Dts.Variables["$Project::batchId"].GetSensitiveValue(); * */ #endregion #region Help: Firing Integration Services events from a script /* This script task can fire events for logging purposes. * * Example of firing an error event: * Dts.Events.FireError(18, "Process Values", "Bad value", "", 0); * * Example of firing an information event: * Dts.Events.FireInformation(3, "Process Values", "Processing has started", "", 0, ref fireAgain) * * Example of firing a warning event: * Dts.Events.FireWarning(14, "Process Values", "No values received for input", "", 0); * */ #endregion #region Help: Using Integration Services connection managers in a script /* Some types of connection managers can be used in this script task. See the topic * "Working with Connection Managers Programatically" for details. * * Example of using an ADO.Net connection manager: * object rawConnection = Dts.Connections["Sales DB"].AcquireConnection(Dts.Transaction); * SqlConnection myADONETConnection = (SqlConnection)rawConnection; * //Use the connection in some code here, then release the connection * Dts.Connections["Sales DB"].ReleaseConnection(rawConnection); * * Example of using a File connection manager * object rawConnection = Dts.Connections["Prices.zip"].AcquireConnection(Dts.Transaction); * string filePath = (string)rawConnection; * //Use the connection in some code here, then release the connection * Dts.Connections["Prices.zip"].ReleaseConnection(rawConnection); * */ #endregion /// <summary> /// This method is called when this script task executes in the control flow. /// Before returning from this method, set the value of Dts.TaskResult to indicate success or failure. /// To open Help, press F1. /// </summary> public void Main() { try { // TODO: Add your code here string FolderPath = Dts.Variables["User::FolderPath"].Value.ToString(); string TableName = Dts.Variables["User::TableName"].Value.ToString(); string SchemaName = Dts.Variables["User::SchemaName"].Value.ToString(); string StartingColumn = Dts.Variables["User::StartingColumn"].Value.ToString(); string EndingColumn = Dts.Variables["User::EndingColumn"].Value.ToString(); string StartReadingFromRow = Dts.Variables["User::StartReadingFromRow"].Value.ToString(); string TabName = Dts.Variables["User::TabName"].Value.ToString(); var directory = new DirectoryInfo(FolderPath); FileInfo[] files = directory.GetFiles(); //Declare and initilize variables string fileFullPath = ""; //Get one Book(Excel file at a time) foreach (FileInfo file in files) { fileFullPath = FolderPath + "\\" + file.Name; //Create Excel Connection string ConStr; string HDR; HDR = "YES"; ConStr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileFullPath + ";Extended Properties=\"Excel 12.0;HDR=" + HDR + ";IMEX=0\""; OleDbConnection cnn = new OleDbConnection(ConStr); cnn.Open(); DataTable dtSheet = cnn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null); string sheetname; sheetname = ""; foreach (DataRow drSheet in dtSheet.Rows) { if (drSheet["TABLE_NAME"].ToString() == TabName) { sheetname = drSheet["TABLE_NAME"].ToString(); string strExcel = string.Format("select * from [{0}]", sheetname); OleDbDataAdapter adp = new OleDbDataAdapter(strExcel, cnn); DataTable dt = new DataTable(); adp.Fill(dt); cnn.Close(); //Prepare Header columns list so we can run against Database to get matching columns for a table. string ExcelHeaderColumn = ""; string SQLQueryToGetMatchingColumn = ""; for (int i = 0; i < dt.Columns.Count; i++) { if (i != dt.Columns.Count - 1) ExcelHeaderColumn += "'" + dt.Columns.ColumnName + "'" + ","; else ExcelHeaderColumn += "'" + dt.Columns.ColumnName + "'"; } SQLQueryToGetMatchingColumn = "select STUFF((Select ',['+Column_Name+']' from Information_schema.Columns where Table_Name='" + TableName + "' and Table_SChema='" + SchemaName + "'" + "and Column_Name in (" + @ExcelHeaderColumn + ") for xml path('')),1,1,'') AS ColumnList"; //MessageBox.Show(SQLQueryToGetMatchingColumn); //MessageBox.Show(ExcelHeaderColumn); //USE ADO.NET Connection SqlConnection myADONETConnection = new SqlConnection(); myADONETConnection = (SqlConnection)(Dts.Connections["ADO_PROD"].AcquireConnection(Dts.Transaction) as SqlConnection); //Get Matching Column List from SQL Server string SQLColumnList = ""; SqlCommand cmd = myADONETConnection.CreateCommand(); cmd.CommandText = SQLQueryToGetMatchingColumn; SQLColumnList = (string)cmd.ExecuteScalar(); //Load Data from DataTable to SQL Server Table. using (SqlBulkCopy BC = new SqlBulkCopy(myADONETConnection)) { BC.DestinationTableName = SchemaName + "." + TableName; foreach (var column in dt.Columns) if (SQLColumnList.Contains(column.ToString()) == true) BC.ColumnMappings.Add(column.ToString(), column.ToString()); BC.WriteToServer(dt); } cmd.CommandType = CommandType.Text; cmd.CommandText = @"UPDATE "+ SchemaName + "." + TableName+" SET FileLoaded=@param1"; cmd.Parameters.AddWithValue("@param1", fileFullPath); cmd.ExecuteNonQuery(); } } } Dts.TaskResult = (int)ScriptResults.Success; } catch { Dts.TaskResult = (int)ScriptResults.Failure; } } #region ScriptResults declaration /// <summary> /// This enum provides a convenient shorthand within the scope of this class for setting the /// result of the script. /// /// This code was generated automatically. /// </summary> enum ScriptResults { Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success, Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure }; #endregion } }
Please note the only columns that will be read from the Excel sheet are the ones where the same column name exists in the SQL table. If the Excel column does not exist in the SQL table, it’s going to be skipped. If you need to read a new column from the Excel file, then all you must do is add the same column name to the SQL table.
Here's the before and after table data when running the SSIS package:
Before:
We can see the SSIS package ran successfully:
After:
I used this blog as a starting point for my project: https://www.techbrothersit.com/2016/03/how-to-read-excel-sheet-data-after.html