Performance Hit: XML Explicit vs. XML Auto

  • I was researching the performance of large XML queries and came across this article:

    http://searchsqlserver.techtarget.com/tip/0,289483,sid87_gci1265579,00.html

    It's a great article detailing the performance hit of using FOR XML AUTO with large result sets.

    I was wondering if there is less of a performance hit using FOR XML EXPLICIT. As I understand it, the server process reprocesses the result to arrive at a schema when using XML AUTO. Has anyone tested this? If not, I'll test it out and post me results if anyone is interested.

    Thanks,

    Paul

  • I looked into how long it would take to test it out and it was easier than I thought. If anyone cares, here is my test data code and query results (I blatantly copied the test data code from the article in my previous post):

    USE perftest

    GO

    if exists ( select * from INFORMATION_SCHEMA.TABLES where table_name = 'Employees' )

    drop table Employees

    GO

    CREATE TABLE Employees (

    id INT IDENTITY(1,1) NOT NULL,

    FirstName VARCHAR(100) NOT NULL,

    LastName VARCHAR(100) NOT NULL,

    Children SMALLINT NULL,

    HireDate DATETIME NOT null,

    BirthDate DATETIME NULL,

    Comments VARCHAR(1000) null

    )

    GO

    DECLARE @i INT,

    @Name VARCHAR(1000),

    @rand1 SMALLINT,

    @rand2 TINYINT

    --This is exactly 1000 characters long:

    SET @name = 'Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Nam quis eros. Nunc dictum. Morbi vel ante. Vestibulum sed nunc ac sem tincidunt suscipit. Quisque mattis elementum ligula. Donec ac ante eu purus molestie dapibus. Morbi tincidunt, urna eget cursus hendrerit, urna arcu blandit velit, ut feugiat tellus mi eget est. Maecenas massa ligula, lacinia ac, luctus ac, scelerisque a, tortor. Duis nisi dui, sagittis sed, venenatis vel, viverra imperdiet, arcu. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Donec iaculis, ante id venenatis aliquet, quam lacus dictum urna, ac interdum quam nunc in sem. Nam orci. Fusce libero orci, elementum at, vehicula eu, fringilla eu, est. Sed arcu magna, tristique hendrerit, lacinia at, sagittis vitae, magna. Curabitur sollicitudin hendrerit sapien. Fusce in tortor.

    Nunc imperdiet sem vel lacus. Sed sed nulla. Praesent venenatis, justo nec mattis convallis, sem dolor auctor sem, eget eleifend mauris lectus id sed.

    '

    SET @i = 1

    WHILE @i < 100000

    BEGIN

    SELECT @rand1 = CONVERT(SMALLINT, SUBSTRING(CONVERT(VARCHAR(20),

    CONVERT(DECIMAL(20,9),RAND(@i))),6,3))

    SELECT @rand2 = CONVERT(TINYINT, SUBSTRING(CONVERT(varchar(20),

    CONVERT(DECIMAL(20,9),RAND(@i*10))),7,2))

    INSERT INTO [Employees] (

    [FirstName],

    [LastName],

    [Children],

    [HireDate],

    [BirthDate],

    [Comments]

    ) SELECT SUBSTRING(@Name, @rand1, @rand2),

    REVERSE(SUBSTRING(@Name, @rand1, @rand2)), @rand2,

    DATEADD(mm,@rand2*(-1),GETDATE()),

    DATEADD(mm,@rand1*(-1),GETDATE()),

    SUBSTRING(@Name, @rand2, @rand1)

    SET @i = @i + 1

    END

    GO

    DECLARE @beg DATETIME

    SET @beg = GETDATE()

    SELECT top 10000 * FROM Employees FOR XML AUTO

    SELECT DATEDIFF(ms,@beg, GETDATE()) AS ExecutionTime

    GO

    DECLARE @beg DATETIME

    SET @beg = GETDATE()

    SELECT

    1 AS Tag,

    NULL AS Parent,

    NULL as 'employees!1!',

    NULL as 'employee!2!id',

    NULL as 'employee!2!firstname',

    NULL as 'employee!2!lastname',

    NULL as 'employee!2!children',

    NULL as 'employee!2!hiredate',

    NULL as 'employee!2!birthdate',

    NULL as 'employee!2!comments'

    UNION ALL

    SELECT TOP 10000

    2 AS Tag,

    1 AS Parent,

    NULL,

    [id],

    [FirstName],

    [LastName],

    [Children],

    [HireDate],

    [BirthDate],

    [Comments]

    from Employees

    FOR XML EXPLICIT

    SELECT DATEDIFF(ms,@beg, GETDATE()) AS ExecutionTime

    go

    Based on 10,000 records the results were:

    FOR XML AUTO: 1396

    FOR XML EXPLICIT: 1336

    Straight Query: 953

    Based on 50,000 records the results were:

    FOR XML AUTO: 7953

    FOR XML EXPLICIT: 7016

    Straight Query: 4390

    This is over a fairly congested network, but the results were consistent. Looks like the network is not as much a bottleneck as the XML computations for large result sets. I was originally researching this to find out if returning results to a reporting services server over the network would be faster in XML format. Obviously, it will not be.

    Paul Perrick

  • If you want to leverage XML in reporting services, you should probably be looking at pre-processing whatever you had in mind, and storing it in XML data columns. the FOR XML is a lot closer conceptual to a SELECT INTO statement than a SELECT (since it is so radically changing the structure.).

    data stored in XML columns performs fairly well, assuming you actually create XML indexes on the column. However, "fairly performant" isn't "as good as a straight SQL data table".

    Finally - keep in mind that there's going to be a built-in penalty just due to the fact that you're storing so much more stuff (since you define the data structure INSIDE the data itself.)

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply