February 8, 2008 at 6:02 pm
I was researching the performance of large XML queries and came across this article:
http://searchsqlserver.techtarget.com/tip/0,289483,sid87_gci1265579,00.html
It's a great article detailing the performance hit of using FOR XML AUTO with large result sets.
I was wondering if there is less of a performance hit when using FOR XML EXPLICIT. As I understand it, the server re-processes the result set to derive the element structure when using FOR XML AUTO. Has anyone tested this? If not, I'll test it out and post my results if anyone is interested.
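For context, here's roughly how I understand the difference between the two modes (just a sketch against the Employees table from that article's test script; the XML fragments in the comments are illustrative, not captured output):
-- FOR XML AUTO: the server names elements after the table/column aliases for you
SELECT TOP 5 id, FirstName
FROM Employees AS employee
FOR XML AUTO
-- returns something like: <employee id="1" FirstName="Lorem ipsum" />
-- FOR XML EXPLICIT: you describe the hierarchy yourself via Tag/Parent columns
SELECT TOP 5
1 AS Tag,
NULL AS Parent,
[id] AS 'employee!1!id',
[FirstName] AS 'employee!1!firstname'
FROM Employees
FOR XML EXPLICIT
-- returns something like: <employee id="1" firstname="Lorem ipsum" />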
Thanks,
Paul
February 8, 2008 at 6:55 pm
I looked into how long it would take to test it out, and it was easier than I thought. In case anyone cares, here are my test data code and query results (I blatantly copied the test data code from the article linked in my previous post):
USE perftest
GO
if exists ( select * from INFORMATION_SCHEMA.TABLES where table_name = 'Employees' )
drop table Employees
GO
CREATE TABLE Employees (
id INT IDENTITY(1,1) NOT NULL,
FirstName VARCHAR(100) NOT NULL,
LastName VARCHAR(100) NOT NULL,
Children SMALLINT NULL,
HireDate DATETIME NOT null,
BirthDate DATETIME NULL,
Comments VARCHAR(1000) null
)
GO
DECLARE @i INT,
@Name VARCHAR(1000),
@rand1 SMALLINT,
@rand2 TINYINT
--This is exactly 1000 characters long:
SET @name = 'Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Nam quis eros. Nunc dictum. Morbi vel ante. Vestibulum sed nunc ac sem tincidunt suscipit. Quisque mattis elementum ligula. Donec ac ante eu purus molestie dapibus. Morbi tincidunt, urna eget cursus hendrerit, urna arcu blandit velit, ut feugiat tellus mi eget est. Maecenas massa ligula, lacinia ac, luctus ac, scelerisque a, tortor. Duis nisi dui, sagittis sed, venenatis vel, viverra imperdiet, arcu. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Donec iaculis, ante id venenatis aliquet, quam lacus dictum urna, ac interdum quam nunc in sem. Nam orci. Fusce libero orci, elementum at, vehicula eu, fringilla eu, est. Sed arcu magna, tristique hendrerit, lacinia at, sagittis vitae, magna. Curabitur sollicitudin hendrerit sapien. Fusce in tortor.
Nunc imperdiet sem vel lacus. Sed sed nulla. Praesent venenatis, justo nec mattis convallis, sem dolor auctor sem, eget eleifend mauris lectus id sed.
'
SET @i = 1
WHILE @i < 100000
BEGIN
	-- Derive two pseudo-random offsets/lengths from the loop counter
	SELECT @rand1 = CONVERT(SMALLINT, SUBSTRING(CONVERT(VARCHAR(20),
		CONVERT(DECIMAL(20,9), RAND(@i))), 6, 3))
	SELECT @rand2 = CONVERT(TINYINT, SUBSTRING(CONVERT(VARCHAR(20),
		CONVERT(DECIMAL(20,9), RAND(@i * 10))), 7, 2))
	-- Slice the Lorem ipsum text at those offsets to fake the row data
	INSERT INTO [Employees] (
		[FirstName],
		[LastName],
		[Children],
		[HireDate],
		[BirthDate],
		[Comments]
	)
	SELECT SUBSTRING(@Name, @rand1, @rand2),
		REVERSE(SUBSTRING(@Name, @rand1, @rand2)),
		@rand2,
		DATEADD(mm, @rand2 * (-1), GETDATE()),
		DATEADD(mm, @rand1 * (-1), GETDATE()),
		SUBSTRING(@Name, @rand2, @rand1)
	SET @i = @i + 1
END
GO
DECLARE @beg DATETIME
SET @beg = GETDATE()
-- Time the FOR XML AUTO version (10,000 rows)
SELECT TOP 10000 * FROM Employees FOR XML AUTO
SELECT DATEDIFF(ms,@beg, GETDATE()) AS ExecutionTime
GO
DECLARE @beg DATETIME
SET @beg = GETDATE()
-- Tag 1: defines the root <employees> element and the universal-table column layout
SELECT
1 AS Tag,
NULL AS Parent,
NULL as 'employees!1!',
NULL as 'employee!2!id',
NULL as 'employee!2!firstname',
NULL as 'employee!2!lastname',
NULL as 'employee!2!children',
NULL as 'employee!2!hiredate',
NULL as 'employee!2!birthdate',
NULL as 'employee!2!comments'
UNION ALL
-- Tag 2: one <employee> element per Employees row
SELECT TOP 10000
2 AS Tag,
1 AS Parent,
NULL,
[id],
[FirstName],
[LastName],
[Children],
[HireDate],
[BirthDate],
[Comments]
from Employees
FOR XML EXPLICIT
SELECT DATEDIFF(ms,@beg, GETDATE()) AS ExecutionTime
GO
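The "Straight Query" baseline is the same SELECT with the FOR XML clause dropped, timed the same way; roughly:
DECLARE @beg DATETIME
SET @beg = GETDATE()
-- Same 10,000 rows, no XML shaping, as a baseline
SELECT TOP 10000 * FROM Employees
SELECT DATEDIFF(ms, @beg, GETDATE()) AS ExecutionTime
GO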
Based on 10,000 records the results (in milliseconds) were:
FOR XML AUTO: 1396
FOR XML EXPLICIT: 1336
Straight Query: 953
Based on 50,000 records the results (in milliseconds) were:
FOR XML AUTO: 7953
FOR XML EXPLICIT: 7016
Straight Query: 4390
This is over a fairly congested network, but the results were consistent. It looks like the network is not as much of a bottleneck as the XML processing for large result sets. I was originally researching this to find out whether returning results to a Reporting Services server over the network would be faster in XML format. Obviously, it will not be.
Paul Perrick
February 8, 2008 at 10:09 pm
If you want to leverage XML in Reporting Services, you should probably look at pre-processing whatever you had in mind and storing it in XML data columns. FOR XML is conceptually a lot closer to a SELECT INTO statement than to a plain SELECT, since it so radically changes the structure of the result.
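Something like this is what I'm getting at (the table and column names here are made up for the sake of the example):
-- A staging table with an XML column to hold the pre-built document
CREATE TABLE EmployeeXml (
id INT IDENTITY(1,1) PRIMARY KEY,
payload XML NOT NULL
)
GO
-- Build the XML once at load time instead of on every report run
INSERT INTO EmployeeXml (payload)
SELECT (SELECT TOP 10000 * FROM Employees FOR XML AUTO, TYPE, ROOT('employees'))
GO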
Data stored in XML columns performs fairly well, assuming you actually create XML indexes on the column. However, "fairly performant" isn't "as good as a straight SQL data table".
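For reference, the indexes I mean look roughly like this (a primary XML index requires the table to have a clustered primary key, which the example above has):
-- Primary XML index: shreds the XML into an internal node table the optimizer can use
CREATE PRIMARY XML INDEX PXML_EmployeeXml_payload
ON EmployeeXml (payload)
GO
-- Optional secondary index to help path-based (XQuery) lookups
CREATE XML INDEX IXML_EmployeeXml_payload_path
ON EmployeeXml (payload)
USING XML INDEX PXML_EmployeeXml_payload
FOR PATH
GO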
Finally - keep in mind that there's going to be a built-in storage penalty simply because you're storing so much more, since you define the data structure INSIDE the data itself.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part... unless you're my manager... or a director and above... or a really loud-spoken end-user... All right - what was my emergency again?