Extensible Markup Language (XML) is a robust method of transferring data between systems, but it can be confusing and difficult to implement for the beginner. The SSIS "XML Source" data flow component makes it trivial to read XML data into an SSIS package, and is an excellent way for an XML newbie to hit the ground running.
Attached is the source code for a simple SSIS package that reads an XML file and sequentially displays individual XML records in a message-box before inserting each record into a SQL Server staging table. Once the XML data is safely stored in a staging table it becomes ordinary SQL Server data that can be processed in familiar ways.
Package Setup
- Add the attached XmlSourceDemo.dtsx file to an SSIS project as an existing package.
- Use the attached create_people_xml.sql file to create a database table named people_xml in SQL Server Management Studio.
- Edit the OLE_DB_Connection connection manager to point to the database containing the people_xml table. Give this connection read/write permissions.
- Copy the "people.xml" file to your hard drive.
- Double-click the "Read XML File" data flow task to enter the "Data Flow" tab.
- Double-click the "XML Source" component.
- Browse to the "people.xml" file in the "XML location" text box.
- Click the "Generate XSD" button and save the "people.xsd" file to the same subdirectory as the "people.xml" file.
- Browse to the "people.xsd" file in the "XSD location" text box.
- Save and run the package.
Package Overview
Below is a screenshot of the package prior to execution.
Note the SSIS package global variables defined in the left panel. These variables allow SSIS components to pass values and objects on to subsequent components.
Below is the XML file that the package will read.
Here is a screen shot of the package displaying a record from the XML file during execution. When the "OK" button on the message box is clicked, the record will be inserted into the staging table.
This is the content of the staging table after the package executes.
Double-click on the "Read XML File" data flow task to see the XML Source component, where the magic occurs.
Double-clicking on the "XML Source" component brings up the Connection Manager, where a connection to the XML file and can be configured, after which an XML Schema Definition (XSD) file can be generated and configured.
This is what the generated XSD file looks like.
Clicking on the "Columns" node automatically displays the XML fields defined in the XSD file generated and configured in the Connection Manager.
Close the XML Source Editor and double-click the Recordset Destination component to configure the recordset object variable property to the xmlRecords global variable.
Click on the "Input Columns" tab and place checkmarks in all the checkboxes to pass all the fields to the next component.
Click the 'OK' button to close the Recordset Destination editor, then click the 'Control Flow' tab to return to the SSIS package main screen. Double-click the 'Loop through and shred XML records' foreach loop container and click on the 'Collection' node.
Configure the 'ADO object source variable' to point to the xmlRecords global variable, then click on the 'Variable Mappings' node.
Configure the Variable and Index fields as shown above, click the 'OK' button to close the foreach loop container editor, double-click on the 'Display XML record' script task, then click on the 'Script' node.
Add the 'ReadOnlyVariables' as shown above. This will allow the foreach loop container to pass values to the script task. Next click on the 'Design Script' button to display the scipt task editor.
The script task reads the variable values into a string and displays the string in a message box. Close the script editor, click the 'OK' button on the script task editor, then double-click the 'Insert XML record into staging table' execute SQL task.
The SQL statement below will insert the XML record into the staging table using the paramaters represented by the question marks, as shown below.
The parameters of the SQL statement are mapped as shown below.
Conclusion
The purpose of this exercise was to offer an initiation into capturing XML data and inserting it into a SQL Server staging table. Remove or disable the script task that displays the XML record and you have a viable means of importing XML data.
Once you have XML data in your staging table, you can do anything you want with it.