Using SQL Server as a backup for critical business data provides an essential safety net against loss and enables business users to more easily connect that data with features like reporting, analytics, and more. Biml is an XML dialect that can be used to create Microsoft SQL Server BI objects, like SSIS packages. Pairing Custom SSIS Components with Biml empowers you to easily build SSIS packages with access to data from sources outside of standard SSIS connectivity, like Salesforce. Key benefits include:
- Built-in metadata discovery — Many custom SSIS components expose metadata just like working with SQL Server, even dynamically generating schema for schema-less data sources
- Dynamic SSIS task generation — Use code nuggets in Biml to build SSIS tasks by iterating over discovered metadata
- Read from and write to external sources — Native source and destination components make external data look just like a database
This article demonstrates how to use Biml with custom SSIS components to dynamically build SSIS tasks (one for each Salesforce entity) to replicate Salesforce data to a Microsoft SQL Server database. We step through the Biml file one section at a time but have included the complete Biml file at the end of the article. While this article uses CData SSIS Components, the principles for task generation apply to any custom SSIS component.
Getting Started
In order to use Biml in an SSIS Project in Visual Studio, install BimlExpress. Once you install BimlExpress, open Visual Studio, create a new Integration Services project, and add a new Biml file.
Building the Biml File
With Biml, you can write scripting to dynamically generate SSIS projects, packages, and tasks. To see the Biml file for an existing project (and gain insights on using Biml with any custom SSIS tasks in your project), simply create your tasks and then right-click the project and select Convert SSIS Packages to Biml.
C# Code
- Use directives
<#@ .. #>
to import necessary namespaces and the assembly for the CData SSIS Components for Salesforce.<#@ template language="C#" hostspecific="true"#> <#@ import namespace="System.Data"#> <#@ import namespace="System.IO"#> <#@ import namespace="System.Collections"#> <#@ import namespace="System.Data.CData.Salesforce"#> <#@ assembly name="C:\Program Files\CData\CData SSIS Components for Salesforce 2018\lib\CData.SSIS2017.Salesforce.dll"#>
- In a new control nugget
<# ... #>
, write code to retrieve the metadata for the external data source. When using Biml, it is often common practice to work with metadata stored in a database. In the case of CData components, you can simply write ADO.NET code to retrieve the metadata dynamically. First, create variables for values that will be used throughout the Biml script, including a connection string for Salesforce and structures to store the Salesforce metadata.var salesforceConnectionString = "User=username;Password=password;SecurityToken=Your_Security_Token;"; var replicationServer = "SERVER"; var replicationCatalog = "CATALOG"; var replicationUserID = "sqluser"; var replicationPassword = "sqlpassword"; List<string> allEntityNames = new List<string>(); Hashtable entitySchema = new Hashtable();
- In the same control nugget used to defined variables, use ADO.NET code to programmatically query the Salesforce entities (tables) and fields (columns).
using (SalesforceConnection connection = new SalesforceConnection(salesforceConnectionString)) { connection.Open(); var entities = connection.GetSchema("Tables").Rows; foreach (DataRow entity in entities) { allEntityNames.Add(entity["TABLE_NAME"].ToString()); } foreach (string entity in allEntityNames){ var columns = connection.GetSchema("Columns", new string [] {entity}).Rows; entitySchema.Add(entity,columns); } }
Class Nugget
In our Biml script to create the replication tasks, there are several places where repeated XML elements are created dynamically (mostly for columns in SSIS tasks). Instead of repeating the code, add a class nugget <#+ ... #>
and create a helper class with methods to consolidate repeated code (full code at the end of the article).
- Add public static variables to determine which type of XML element to create.
public static int OUTPUT_WITH_ERROR = 0; public static int EXTERNAL = 1; public static int OUTPUT = 2; public static int DATAOVERRIDE_COLUMN = 4;
- Add a public method to build a SQL statement for use in the ExecuteSQL task used to drop existing tables and create a new table for the replicated data.
// Dynamically builds a DROP TABLE and CREATE statement // for each entity (table) in Salesforce using the table name and metadata. public static string GetDeleteAndCreateStatement(string tableName, DataRowCollection columns) { ... }
- Add a public method to build the collection of column-based XML elements.
// Dynamically build various column-based XML elements // for each entity (table) in Salesforce based on the column // metadata and the parent element public static string GetColumnDefs(DataRowCollection columns, int columnType){ ... }
Biml Script
Now that you have the table metadata and a Helper class to reduce repeated code, write the Biml script to dynamically create your replication packages.
- Start by adding a
CustomSsisConnection
element for the custom SSIS tasks. Note that theObjectData
attribute must be XML encoded. A typical connecting string looks similar to the following (note the use of thesalesforceConnectionString
variable for the ConnectionString property:<SalesforceConnectionManager> <Property Name="ConnectionString"><#=salesforceConnectionString#></Property> </SalesforceConnectionManager>
After configuring the connection to the custom SSIS task, configure a connection to the replication database. The completed
Connections
element looks like the following (note the use of text nuggets<#= ... #>
to add variables for connection string values):<Connections> <CustomSsisConnection Name="CData Salesforce Connection Manager" CreationName = "CDATA_SALESFORCE" ObjectData = "<SalesforceConnectionManager> <Property Name="ConnectionString"> <#=salesforceConnectionString#></Property> </SalesforceConnectionManager>" /> <Connection Name="Destination" ConnectionString="Data Source=<#=replicationServer#>;User ID=<#=replicationUserID#>;Password=<#=replicationPassword#>;Initial Catalog=<#=replicationCatalog#>;Provider=SQLNCLI11.1;"/> </Connections>
- With the
Connections
element configured, you are ready to build our replication package. In the package, create anExecuteSQL
task and aDataflow
task for each table to be replicated. To build each set of tasks, use awhile
loop in a control nugget to iterate through the entity (table) names:int entityCounter = 0; while(entityCounter < allEntityNames.Count){ var tableName = allEntityNames[entityCounter].ToString(); DataRowCollection columns = ((DataRowCollection)entitySchema[tableName]);
- ExecuteSQL TaskIn the
ExecuteSQL
task, execute a SQL query to drop any existing tables that have the same name as our Salesforce entity (table) and create a new table based on the metadata discovered using the CData SSIS Component. To create the query dynamically, use theHelper.GetDeleteAndCreateStatement()
helper function. - Dataflow TaskWithin the
Dataflow
task use aCustomComponent
as the source component and anOleDbDestination
as the destination.CustomComponent
ElementTheCustomComponent
element uses the CData SSIS Source component to retrieve Salesforce data. Start by configuring the component to use with the CData component.<CustomComponent Name="CData Salesforce Source" ComponentTypeName="CData.SSIS.Salesforce.SalesforceSource" Version="18" ContactInfo="support@cdata.com" UsesDispositions="true"> ... </CustomComponent>
DataflowOverrides
andOutputPaths
ElementsThe next step after configuring the connection is to add
Columns
elements to theOutputPath
child element of theDataflowOverrides
element. To do so, call theHelper.GetColumnDefs()
helper function. Use the sameHelper
class to add columns to theOutputColumns
andExternalColumns
child elements of the variousOutputPaths
elements. The definitions created provide information about the input, output, and error information for the SSIS component.<DataflowOverrides> <OutputPath OutputPathName="CData Salesforce Source Output"> <Columns> <#=HelperClass.GetColumnDefs(columns,HelperClass.DATAOVERRIDE_COLUMN) #> </Columns> </OutputPath> </DataflowOverrides> ... <OutputPaths> <OutputPath Name="CData Salesforce Source Output"> <OutputColumns> <#=HelperClass.GetColumnDefs(columns,HelperClass.OUTPUT_WITH_ERROR) #> </OutputColumns> <ExternalColumns> <#=HelperClass.GetColumnDefs(columns,HelperClass.EXTERNAL) #> </ExternalColumns> </OutputPath> <OutputPath Name="CData Salesforce Source Error Output" IsErrorOutput="true"> <OutputColumns> <#=HelperClass.GetColumnDefs(columns,HelperClass.OUTPUT) # </OutputColumns> </OutputPath> </OutputPaths>
CustomProperties
ElementCustom components typically have their own custom configuration interface, with a series of required
CustomProperties
:<CustomProperties> <CustomProperty Name="SQLStatement" DataType="Null" UITypeEditor="Microsoft.DataTransformationServices.Controls.ModalMultilineStringEditor, Microsoft.DataTransformationServices.Controls, Version= 10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91" SupportsExpression="true"></CustomProperty> <CustomProperty Name="AccessMode" DataType="Int32" TypeConverter="CData.SSIS.Salesforce.AccessModeToStringConverter">0</CustomProperty> <CustomProperty Name="TableOrView" DataType="String" UITypeEditor="Microsoft.DataTransformationServices.Controls.ModalMultilineStringEditor, Microsoft.DataTransformationServices.Controls, Version= 10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91" SupportsExpression="true">[<#=tableName#>]</CustomProperty> <CustomProperty Name="ExecStoredProcedure" DataType="Boolean">false</CustomProperty> </CustomProperties>
Connections
ElementThe last element to add to the
CustomComponent
element is aConnections
element, attaching the previously defined connection to the task:<Connections> <Connection Name="Salesforce 2018 Connection" ConnectionName="CData Salesforce Connection Manager" /> </Connections>
OleDbDestination
ElementThe final piece of the Dataflow task is theOleDbDestination
element. Attach the previously defined OleDbConnection to the element, set theInputPath
andExternalTableOutput
:<OleDbDestination Name="OLE DB Destination" ConnectionName="Destination" CheckConstraints="false"> <InputPath OutputPathName="CData Salesforce Source.CData Salesforce Source Output" /> <ExternalTableOutput Table="[<#=tableName#>]" /> </OleDbDestination>
- ExecuteSQL TaskIn the
- Use a control nugget to increment the counter used to iterate over the collection of entity (table) names. Do this within the
Tasks
element, after the end of theDataflow
element:... </Dataflow> <# entityCounter++;}#> </Tasks> </Package> </Packages> </Biml>
Build the SSIS Project
Once the Biml file is written, right-click on the Biml file in Server Explorer and select Generate SSIS Packages. At this point, Visual Studio and BimlExpress will translate the Biml file into SSIS package(s), ready to be run.
Run the package to begin replicating your Salesforce data to a SQL Server database (or any other destination you choose).
Complete Biml File
<#@ template language="C#" hostspecific="true"#> <#@ import namespace="System.Data"#> <#@ import namespace="System.IO"#> <#@ import namespace="System.Collections"#> <#@ import namespace="System.Data.CData.Salesforce"#> <#@ assembly name="C:\Program Files\CData\CData SSIS Components for Salesforce 2018\lib\CData.SSIS2017.Salesforce.dll"#> <# var salesforceConnectionString = ""User=username;Password=password;SecurityToken=Your_Security_Token;"; var replicationServer = "JDG"; var replicationCatalog = "BIML"; var replicationUserID = "sqltest"; var replicationPassword = "sqltest"; List<string> allEntityNames = new List<string>(); Hashtable entitySchema = new Hashtable(); using (SalesforceConnection connection = new SalesforceConnection(salesforceConnectionString)) { connection.Open(); var entities = connection.GetSchema("Tables").Rows; foreach (DataRow entity in entities) { allEntityNames.Add(entity["TABLE_NAME"].ToString()); } foreach (string entity in allEntityNames){ var columns = connection.GetSchema("Columns", new string [] {entity}).Rows; entitySchema.Add(entity,columns); } }#> <Biml xmlns="http://schemas.varigence.com/biml.xsd"> <Connections> <CustomSsisConnection Name="CData Salesforce Connection Manager" CreationName="CDATA_SALESFORCE" ObjectData="<SalesforceConnectionManager><Property Name="ConnectionString"><#=salesforceConnectionString#></Property></SalesforceConnectionManager>"/> <Connection Name="Destination" ConnectionString="Data Source=<#=replicationServer#>;User ID=<#=replicationUserID#>;Password=<#=replicationPassword#>;Initial Catalog=<#=replicationCatalog#>;Provider=SQLNCLI11.1;"/> </Connections> <Packages> <Package Name="Replicate Salesforce Package" Language="None" ConstraintMode="LinearOnCompletion" ProtectionLevel="EncryptSensitiveWithUserKey"> <Tasks> <# int entityCounter = 0; while(entityCounter < allEntityNames.Count){ var tableName = allEntityNames[entityCounter].ToString(); if (tableName.Equals("IdpEventLog")) break; DataRowCollection columns = ((DataRowCollection)entitySchema[tableName]);#> <ExecuteSQL Name="Create <#=tableName#> Replication Table" ConnectionName="Destination"> <DirectInput> <#=HelperClass.GetDeleteAndCreateStatement(tableName,columns)#> </DirectInput> </ExecuteSQL> <Dataflow Name="Replicate <#=tableName#>"> <Transformations> <CustomComponent Name="CData Salesforce Source" ComponentTypeName="CData.SSIS.Salesforce.SalesforceSource" Version="18" ContactInfo="support@cdata.com" UsesDispositions="true"> <DataflowOverrides> <OutputPath OutputPathName="CData Salesforce Source Output"> <Columns> <#=HelperClass.GetColumnDefs(columns,HelperClass.DATAOVERRIDE_COLUMN) #> </Columns> </OutputPath> </DataflowOverrides> <CustomProperties> <CustomProperty Name="SQLStatement" DataType="Null" UITypeEditor="Microsoft.DataTransformationServices.Controls.ModalMultilineStringEditor, Microsoft.DataTransformationServices.Controls, Version= 10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91" SupportsExpression="true"></CustomProperty> <CustomProperty Name="AccessMode" DataType="Int32" TypeConverter="CData.SSIS.Salesforce.AccessModeToStringConverter">0</CustomProperty> <CustomProperty Name="TableOrView" DataType="String" UITypeEditor="Microsoft.DataTransformationServices.Controls.ModalMultilineStringEditor, Microsoft.DataTransformationServices.Controls, Version= 10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91" SupportsExpression="true">[<#=tableName#>]</CustomProperty> <CustomProperty Name="ExecStoredProcedure" DataType="Boolean">false</CustomProperty> </CustomProperties> <OutputPaths> <OutputPath Name="CData Salesforce Source Output"> <OutputColumns> <#=HelperClass.GetColumnDefs(columns,HelperClass.OUTPUT_WITH_ERROR) #> </OutputColumns> <ExternalColumns> <#=HelperClass.GetColumnDefs(columns,HelperClass.EXTERNAL) #> </ExternalColumns> </OutputPath> <OutputPath Name="CData Salesforce Source Error Output" IsErrorOutput="true"> <OutputColumns> <#=HelperClass.GetColumnDefs(columns,HelperClass.OUTPUT) #> </OutputColumns> </OutputPath> </OutputPaths> <Connections> <Connection Name="Salesforce 2018 Connection" ConnectionName="CData Salesforce Connection Manager" /> </Connections> </CustomComponent> <OleDbDestination Name="OLE DB Destination" ConnectionName="Destination" CheckConstraints="false"> <InputPath OutputPathName="CData Salesforce Source.CData Salesforce Source Output" /> <ExternalTableOutput Table="[<#=tableName#>]" /> </OleDbDestination> </Transformations> </Dataflow> <# entityCounter++;}#> </Tasks> </Package> </Packages> </Biml> <#+ public static class HelperClass { public static int OUTPUT_WITH_ERROR = 0; public static int EXTERNAL = 1; public static int OUTPUT = 2; public static int DATAOVERRIDE_COLUMN = 4; public static string GetDeleteAndCreateStatement(string tableName, DataRowCollection columns) { var dropAndCreateStatement = "IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[{0}]') AND type IN (N'U'))\r\n" + "DROP TABLE [{0}];\r\n" + "CREATE TABLE [{0}]\r\n" + "(\r\n" + "{1}\r\n" + ")\r\n" + "ON \"default\";"; string columnDefs = ""; foreach (DataRow column in columns){ string columnDef = " [{0}] {1}"; string dataType = column["DATA_TYPE"].ToString(); if (dataType.ToLower().StartsWith("bool")) { dataType = "bit"; } else if (dataType.ToLower().Equals("real")) { dataType = "float"; } else if (dataType.ToLower().Contains("varchar")) { var columnLength = column["CHARACTER_MAXIMUM_LENGTH"]; dataType = "nvarchar(" + ((int)columnLength > 4000 ? "MAX" : columnLength) + ")"; } columnDefs += String.Format(columnDef,column["COLUMN_NAME"],dataType) + ",\r\n"; } columnDefs = columnDefs.Remove(columnDefs.LastIndexOf(",\r\n"),",\r\n".Length); return String.Format(dropAndCreateStatement,tableName,columnDefs); } public static string GetColumnDefs(DataRowCollection columns, int columnType){ var columnDefTemplate = ""; var columnElements = ""; if (columnType == DATAOVERRIDE_COLUMN) { columnDefTemplate = " <Column ErrorRowDisposition=\"FailComponent\" TruncationRowDisposition=\"FailComponent\" ColumnName=\"{0}\" />\r\n"; foreach(DataRow column in columns) { var columnName = column["COLUMN_NAME"]; columnElements += String.Format(columnDefTemplate,columnName); } return columnElements; } if (columnType == OUTPUT_WITH_ERROR) columnDefTemplate = " <OutputColumn Name=\"{0}\" {1} ExternalMetadataColumnName=\"{0}\" ErrorRowDisposition=\"FailComponent\" TruncationRowDisposition=\"FailComponent\" />\r\n"; else if (columnType == EXTERNAL) columnDefTemplate = " <ExternalColumn Name=\"{0}\" {1} />\r\n"; else if (columnType == OUTPUT) columnDefTemplate = " <OutputColumn Name=\"{0}\" {1} />\r\n"; foreach(DataRow column in columns){ var columnName = column["COLUMN_NAME"]; var dataTypeRaw = column["DATA_TYPE"].ToString().ToLower(); var typeAndRelatedInfo = ""; if (dataTypeRaw.Equals("bool")) { typeAndRelatedInfo = "DataType=\"Boolean\""; } else if (dataTypeRaw.Equals("date")) { typeAndRelatedInfo = "DataType=\"Date\" SsisDataTypeOverride=\"DT_DBDATE\""; } else if (dataTypeRaw.Equals("datetime")) { typeAndRelatedInfo = "DataType=\"DateTime\""; } else if (dataTypeRaw.Equals("real")) { typeAndRelatedInfo = ((int)column["NumericPrecision"] > 0 ? "Precision=\"18\" " : " ") + ((int)column["NumericScale"] > 0 ? "Scale=\"15\" " : " ") + "DataType=\"Decimal\""; } else if (dataTypeRaw.Equals("varchar")) { var columnLength = column["CHARACTER_MAXIMUM_LENGTH"]; if ((int)columnLength > 4000) { typeAndRelatedInfo = "DataType=\"String\""; } else { typeAndRelatedInfo = "Length=\"" + columnLength + "\" DataType=\"String\" CodePage=\"1252\""; } } columnElements += String.Format(columnDefTemplate,columnName,typeAndRelatedInfo); } return columnElements; } } #>