Introduction
OPENXML provides an easy way to use an XML document as a data-source for your procedures. OPENXML allows the data in XML document to be treated just like the columns and rows of your database table i.e., xml tags as columns and the value as rows. Data can be inserted / updated very quickly and efficiently without multiple trips to the database.
For example, if 100 records are to inserted / updated, then the traditional SQL
method is using 100 insert / update statements. This means that 100 trips are
made to the database, which results in degradation of performance. Using XML, these 100 trips can be reduced to 1 trip. This increases the performance of your application.
In this article, we would look at the basic syntax of OpenXML and end with a
simple example.
Basic Syntax
OPENXML (idoc int [in],rowpattern nvarchar[in],[flags byte[in]]) [WITH
SchemaDeclaration | TableName)]
Arguments
Idoc is the document handle of
the internal representation of an XML document.
This handle is obtained by calling the system stored procedure
sp_xml_preparedocument, which is
discussed later.
Rowpattern is the XPath query used to identify the nodes to be processed as
rows.
Flags indicates the mapping between the XML data and the relational rowset.
(optional parameter)
0 - Defaults to attribute-centric mapping.
1 - Attribute-centric mapping. (Combined with XML_ELEMENTS)
2 - Element-centric mapping. (Combined with XML_ATTRIBUTES)
8 - Combined with XML_ATTRIBUTES or XML_ELEMENTS
SchemaDeclaration
is
the schema definition of the form:
ColName ColType [ColPattern |
MetaProperty][, ColName ColType [ColPattern | MetaProperty]...]
Eg: WITH (EMPLOYEENAME
VARCHAR(30), EMPLOYEESALARY
INTEGER)
TableName is
the table name that can be given, instead of Schema Declaration, if a
table exists.
The WITH clause
provides a table format using either SchemaDeclaration or specifying an
existing TableName. If the optional WITH clause is not specified, the
results are returned in an edge table format. Edge tables represent the
fine-grained XML document structure (e.g. element/attribute names, the document
hierarchy, the namespaces, PIs etc.) in a single table.
System Stored Procedures for OpenXML
SQL Server provides system stored procedures that are used in conjunction with OPENXML:
- sp_xml_preparedocument
- sp_xml_removedocument
To write queries using
OPENXML, you must first create an internal representation of the XML document by
calling sp_xml_preparedocument. It is similar to LoadXML() function
provided in System.XML namespace. The stored procedure returns a handle to the
internal representation of the XML document. This handle is then passed to
OPENXML, which provides tabular view of the document based on Xpath query.
The internal representation
of an XML document can be removed from memory by calling
sp_xml_removedocument system stored procedure.
About
sp_xml_preparedocument
Steps which occurs when this
procedure is executed:
- Reads the XML text provided as input.
- Parses the text using the XML parser.
- Provides the parsed document, which is in tree form containing various nodes (elements, attributes, text, comments, and so on) in the XML document.
- It returns a handle that can be used to access the newly created internal representation of the XML document. This handle is valid for until the connection is reset, or until the execution of sp_xml_removedocument.
Note: A parsed document
is stored in the internal cache of SQL Server 2000. The MSXML parser uses
one-eighth the total memory available for SQL Server. To avoid running out of
memory, run sp_xml_removedocument to free up the memory.
The syntax is:
sp_xml_preparedocument hdoc OUTPUT [,
xmltext] [, xpath_namespaces]
hdoc
is the handle to the newly created document.(Integer value)
[xmltext] is the original XML document. The default
value is NULL, in which case an internal representation of an empty XML document
is created.
[xpath_namespaces] Specifies the namespace
declarations that are used in row and column XPath expressions in OPENXML.
Returns:
0 (success) or >0 (failure)
About
sp_xml_removedocument
Removes the internal
representation of the XML document specified by the document handle.
The syntax is:
sp_xml_removedocument hdoc
hdoc
is the handle to the
newly created document.(Integer value)
Returns: 0
(success) or >0 (failure)
Basic structure of stored-procedure(using OPENXML)
CREATE PROCEDURE dbo.TestOpenXML ( @strXML VARCHAR(2000) ) AS DECLARE @XMLDocPointer INT EXEC sp_xml_preparedocument @XMLDocPointer OUTPUT, @strXML BEGIN TRANSACTION ------ Any DML can be used here -------(see examples) COMMIT EXEC sp_xml_removedocument @XMLDocPointer RETURN
Examples
Consider two tables: Employees & Departments.
Employees: EmployeeID, EmployeeName, EmployeeSalary, DeptID.
Departments: DeptID, DeptName.
(EmployeeID and DeptID are Identity fileds.)
The XML file will follow the structure.
(XML Type 1)
<RECORD>
<EMPLOYEE>
<EMPLOYEENAME>Test Name</EMPLOYEENAME>
<EMPLOYEESALARY>1000</EMPLOYEESALARY>
<DEPTID>10</DEPTID>
</EMPLOYEE>
.
.
</RECORD>
If 100 employees have been recruited, u can generate a XML
file having the following structure with 100 <EMPLOYEE> tags, the query should
be:
INSERT INTO Employees(EmployeeName, EmployeeSalary, DeptID)
SELECT EMPLOYEENAME, EMPLOYEESALARY, DEPTID
FROM OPENXML(@XMLDocPointer,'/RECORD/EMPLOYEE',2)
WITH (EMPLOYEENAME VARCHAR(30), EMPLOYEESALARY INTEGER, DEPTID INTEGER)
If 100 employees are transferred from Production Department to Stores Department, the query should be:
UPDATE Employees SET DeptID = xmlTable.DEPTID FROM OPENXML(@XMLDocPointer,'/RECORD/EMPLOYEE',2) WITH (EMPLOYEEID INTEGER, DEPTID INTEGER) XmlTable WHERE XmlTable.EMPLOYEEID = Employees.EmployeeID
Note: use of alias for XML document is required to avoid confusion.
If the XML file contains attributes, there is minor change to the syntax.
(XML Type 2)
<RECORD>
<EMPLOYEE DEPTID = ‘10’ NAME = ‘Test’ SALARY = ‘10000’/>
<EMPLOYEE DEPTID = ‘10’ NAME = ‘Test’ SALARY = ‘10000’/>
<EMPLOYEE DEPTID = ‘10’ NAME = ‘Test’ SALARY = ‘10000’/>
.
.
</RECORD>
INSERT INTO Employees(EmployeeName, EmployeeSalary, DeptID)
SELECT NAME, SALARY, DEPTID
FROM OPENXML(@XMLDocPointer,'/RECORD/EMPLOYEE',2)
WITH (NAME VARCHAR(30) ‘@NAME’,
SALARY INT ‘@SALARY’,
DEPTID INT '@DEPTID')
Another scenario for attributes (accessing the parent node)
(XML Type 3)
<RECORD>
<DEPT ID=’10’>
<EMPLOYEE NAME = ‘Test101’ SALARY = ‘10000’/>
<EMPLOYEE NAME = ‘Test102’ SALARY = ‘10000’/>
</DEPT>
<DEPT ID=’11’>
<EMPLOYEE NAME = ‘Test111’ SALARY = ‘10000’/>
<EMPLOYEE NAME = ‘Test112’ SALARY = ‘10000’/>
<EMPLOYEE NAME = ‘Test113’ SALARY = ‘10000’/>
<EMPLOYEE NAME = ‘Test114’ SALARY = ‘10000’/>
</DEPT>
.
.
</RECORD>
INSERT INTO Employees (EmployeeName, EmployeeSalary, DeptID) SELECT NAME, SALARY, DEPTID FROM OPENXML (@XMLDocPointer,'/RECORD/DEPT/EMPLOYEE',2) WITH (NAME VARCHAR(30) ‘@NAME’, SALARY INT ‘@SALARY’, DEPTID INT '../@ID')
Finally, your store procedure should look like (using XML Type 2):
CREATE PROCEDURE dbo.TestOpenXML
( @strXML VARCHAR(2000)
)
AS
DECLARE @XMLDocPointer INT
EXEC sp_xml_preparedocument @XMLDocPointer OUTPUT, @strXML
BEGIN TRANSACTION
INSERT INTO Employees(EmployeeName, EmployeeSalary, DeptID)
SELECT NAME, SALARY, DEPTID
FROM OPENXML(@XMLDocPointer,'/RECORD/EMPLOYEE',2)
WITH (NAME VARCHAR(30) ‘@NAME’, SALARY INT ‘@SALARY’, DEPTID INT '@DEPTID')
COMMIT
EXEC sp_xml_removedocument @XMLDocPointer
RETURN
Conclusion
With SQL Server 2000, there are many ways you can get XML representation of relational data. One simple way is using the FOR XML clause with SELECT statement. A reverse mechanism, which allows turning XML document into tabular, relational rowset format is provided by T-SQL keyword named OPENXML.
The steps for using OPENXML can be simplified as:
- Call sp_xml_preparedocument. (system stored procedure which loads XML document string into memory, parses it and returns a handle).
- Use XPath querries for extracting required information from the XML through the handle.
- Call sp_xml_removedocument. (system stored procedure which frees up the memory allocated to the internal representation of the XML document).
In this article, we studied the OPENXML syntax and options available with it, with simpe examples. Future articles we will discuss other XML features offered by SQL Server 2000.