April 26, 2013 at 2:13 pm
I'm looking for an easy way to either convert xml to csv. I've tried using the SSIS xml source package way and I'm getting all kinds of truncation errors.
Thanks in advance.
April 26, 2013 at 5:52 pm
Here is a generic example.
You could adjust it for your needs.
declare @x xml;
--== Grab some sample xml ==--
set @x = (select * from sysprocesses for xml auto,elements,type,binary base64);
--== Get a list of headers for the CSV ==--
-- The assumption has been made that we want each top level node as one row in the CSV
-- If this is not correct, modify the FROM clause accordingly
select
stuff(
convert(varchar(max),
nd.query('
for $n in *
return concat(",""",local-name($n),"""")
')
),1,1,''
) as data
from @x.nodes('(*)[1]') as x(nd)
--== The UNION ALL is optional, but it makes sense to me to have the headers with the data ==--
-- this does that
union all
--== Now list the data as CSV for each top level node ==--
-- same assumption as above for the header, adjust as required
select
stuff(
convert(varchar(max),
nd.query('
for $n in *
return
concat(",""",string($n),"""")
')
),1,1,''
) as data
from @x.nodes('*') as x(nd);
Which produces output like this:
"spid" ,"kpid" ,"blocked" ,"waittype" ,"waittime" ,"lastwaittype" ,"waitresource" ,"dbid" ,"uid" ,"cpu" ,"physical_io" ,"memusage" ,"login_time" ,"last_batch" ,"ecid" ,"open_tran" ,"status" ,"sid" ,"hostname" ,"program_name" ,"hostprocess" ,"cmd" ,"nt_domain" ,"nt_username" ,"net_address" ,"net_library" ,"loginame" ,"context_info" ,"sql_handle" ,"stmt_start" ,"stmt_end" ,"request_id"
"1" ,"2412" ,"0" ,"AIM=" ,"49" ,"LOGMGR_QUEUE " ," " ,"0" ,"1" ,"78" ,"0" ,"0" ,"2013-04-25T09:30:24.007" ,"2013-04-25T09:30:24.007" ,"0" ,"0" ,"background " ,"AQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=" ," " ," " ," " ,"LOG WRITER " ," " ," " ," " ," " ,"sa " ,"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=" ,"AAAAAAAAAAAAAAAAAAAAAAAAAAA=" ,"0" ,"0" ,"0"
"2" ,"3120" ,"0" ,"AKE=" ,"141055089" ,"KSOURCE_WAKEUP " ," " ,"1" ,"1" ,"0" ,"0" ,"5" ,"2013-04-25T09:30:24.027" ,"2013-04-25T09:30:24.027" ,"0" ,"0" ,"background " ,"AQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=" ," " ," " ," " ,"SIGNAL HANDLER " ," " ," " ," " ," " ,"sa " ,"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=" ,"AAAAAAAAAAAAAAAAAAAAAAAAAAA=" ,"0" ,"0" ,"0"
"3" ,"3124" ,"0" ,"Axw=" ,"25" ,"DIRTY_PAGE_POLL " ," " ,"0" ,"1" ,"46" ,"0" ,"0" ,"2013-04-25T09:30:24.030" ,"2013-04-25T09:30:24.030" ,"0" ,"0" ,"background " ,"AQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=" ," " ," " ," " ,"RECOVERY WRITER " ," " ," " ," " ," " ,"sa " ,"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=" ,"AAAAAAAAAAAAAAAAAAAAAAAAAAA=" ,"0" ,"0" ,"0"
"4" ,"2500" ,"0" ,"AIQ=" ,"3079" ,"REQUEST_FOR_DEADLOCK_SEARCH " ," " ,"0" ,"1" ,"15" ,"0" ,"0" ,"2013-04-25T09:30:24.030" ,"2013-04-25T09:30:24.030" ,"0" ,"0" ,"background " ,"AQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=" ," " ," " ," " ,"LOCK MONITOR " ," " ," " ," " ," " ,"sa " ,"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=" ,"AAAAAAAAAAAAAAAAAAAAAAAAAAA=" ,"0" ,"0" ,"0"
This is not production code, just something I banged together, there are known issues, such as it not handling double quotes or commas in the data, but it's kind of interesting to see how easy it can be.
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
April 26, 2013 at 8:07 pm
Here's an XML parsing procedure I use all the time. It shreds the XML into a table. Below that I modified the output of the second version to produce a CSV.
CREATE PROCEDURE dbo.ParseXML
@strXML AS XML
,@rootnode NVARCHAR(255)
AS
BEGIN
/*
EXEC dbo.ParseXML
'<AccountDetailsRsp AccountNum="1" AccountStatus="AccountStatus1">
<PlayerInfo PlayerID="1" FirstName="FirstName1" LastName="LastName1">
<AddressList>
<PlayerAddress>
<Address AddressType="primary" City="City1" State="State1" Zip="Zip1"/>
<FutureUse>Example Text1</FutureUse>
<Phone PhoneNumber="PhoneNumber1" PhoneType="Type1" />
<Phone PhoneNumber="PhoneNumber2" PhoneType="Type2" />
</PlayerAddress>
<PlayerAddress>
<Address AddressType="billing" City="City1" State="State1" Zip="Zip1"/>
<FutureUse>Example Text2</FutureUse>
<Phone PhoneNumber="PhoneNumber1" PhoneType="Type1" />
<Phone PhoneNumber="PhoneNumber2" PhoneType="Type2" />
</PlayerAddress>
</AddressList>
</PlayerInfo>
</AccountDetailsRsp>'
,'AccountDetailsRsp'
EXEC dbo.ParseXML
'<items>
<item id="0001" type="Donut">
<name>Cake</name>
<ppu>0.55</ppu>
<batter id="1001">Regular</batter>
<batter id="1002">Chocolate</batter>
<batter id="1003">Blueberry</batter>
<topping id="5001">None</topping>
<topping id="5002">Glazed</topping>
<topping id="5005">Sugar</topping>
<topping id="5006">Sprinkles</topping>
<topping id="5003">Chocolate</topping>
<topping id="5004">Maple</topping>
</item>
</items>'
,'items'
*/
SET NOCOUNT ON
DECLARE
@strText AS NVARCHAR(MAX)
,@idoc INT
,@id INT
,@parentid INT
IF OBJECT_ID('tempdb..#ChildList') IS NOT NULL
DROP TABLE #ChildList
CREATE TABLE #ChildList (
[RowNum] INT IDENTITY(1,1) NOT NULL,
[parentid] INT NULL,
[id] INT NULL,
PRIMARY KEY (RowNum),
UNIQUE (RowNum))
IF OBJECT_ID('tempdb..#NodeList') IS NOT NULL
DROP TABLE #NodeList
CREATE TABLE #NodeList (
[RowNum] INT NOT NULL,
[id] INT NULL,
[parentid] INT NULL,
[nodetype] INT NULL,
[localname] NVARCHAR(MAX) NULL,
[text] NVARCHAR(MAX) NULL,
PRIMARY KEY (RowNum),
UNIQUE (RowNum))
SET @id = 1
SET @parentid = NULL
/* Get rid of tabs and extra spaces */
SET @strText = CAST(@strXML AS NVARCHAR(MAX))
SET @strText =
REPLACE(
REPLACE(
REPLACE(
REPLACE(
@strText
,' ',' '+CHAR(7))
,CHAR(7)+' ','')
,CHAR(7),'')
,CHAR(9),' ')
SET @strXML = CONVERT(XML,@strText)
/* Validate the XML */
EXEC sp_xml_preparedocument @idoc OUTPUT, @strXML
/* Parse the XML data */
;WITH cte
AS (
SELECT
CAST(p1.parentid AS INT) AS parentid
,CAST(p1.id AS INT) AS id
FROM
OPENXML (@idoc,@rootnode,2) AS p1
UNION ALL
SELECT
CAST(p2.parentid AS INT) AS parentid
,CAST(p2.id AS INT) AS id
FROM
OPENXML (@idoc,@rootnode,2) AS p2
JOIN
cte
ON CAST(cte.id AS INT) = CAST(p2.ParentID AS INT)
WHERE
CAST(p2.parentid AS INT) = @parentid
)
INSERT INTO #ChildList
SELECT *
FROM cte
INSERT INTO #NodeList
SELECT
#ChildList.RowNum
,xmllist.id
,xmllist.parentid
,xmllist.nodetype
,xmllist.localname
,CAST(xmllist.[text] AS NVARCHAR(MAX)) AS [text]
FROM #ChildList
INNER JOIN
OPENXML (@idoc,@rootnode,2) AS xmllist
ON #ChildList.id = xmllist.id
WHERE
#ChildList.RowNum > 0
/* Display the results */
;WITH RecursiveNodes(RowNum,id,parentid,nodepath,localname,[text],nodetype)
AS (
SELECT
#NodeList.RowNum
,#NodeList.id
,#NodeList.parentid
,CAST('/' + REPLACE(REPLACE(REPLACE(REPLACE(#NodeList.localname,'&',''),'?',''),' ',''),'.','') AS NVARCHAR(255)) AS nodepath
,#NodeList.localname
,CAST(#NodeList.[text] AS NVARCHAR(MAX)) AS [text]
,0 AS nodetype
FROM #ChildList
INNER JOIN
#NodeList
ON #ChildList.id = #NodeList.id
WHERE
#NodeList.parentid IS NULL
AND #ChildList.RowNum > 0
AND #NodeList.RowNum > 0
UNION ALL
SELECT
n.RowNum
,n.id
,n.parentid
,CAST(r.nodepath + '/'+ REPLACE(REPLACE(REPLACE(REPLACE(n.localname,'&',''),'?',''),' ',''),'.','') AS NVARCHAR(255)) AS nodepath
,n.localname
,n.[text]
,n.nodetype
FROM #NodeList AS n
INNER JOIN
RecursiveNodes AS r
ON n.parentid = r.id
WHERE
n.RowNum > 0
AND r.RowNum > 0
AND n.parentid >= 0
)
SELECT
ROW_NUMBER() OVER (ORDER BY Result.RowNum) AS RowNum
,Result.id
,Result.parentid
,Result.nodepath
,Result.nodetype
,Result.nodename
,Result.property
,Result.value
,Result.nodecontents
FROM
(
SELECT
rn.RowNum
,rn.id
,rn.parentid
,rn.nodepath
,(CASE
WHEN rn.nodetype = 0 THEN 'Root'
WHEN rn.nodetype = 1 THEN 'Node'
WHEN rn.nodetype = 2 THEN 'Property'
ELSE 'Data'
END) AS nodetype
,(CASE
WHEN rn.nodetype = 0 THEN rn.localname
WHEN rn.nodetype = 1 THEN rn.localname
WHEN rn.nodetype = 2 THEN (SELECT TOP(1) localname FROM RecursiveNodes WHERE id = rn.parentid)
ELSE NULL
END) AS nodename
,(CASE
WHEN rn.nodetype = 2 THEN rn.localname
ELSE NULL
END) AS property
,(CASE
WHEN rn.nodetype = 2 THEN (SELECT TOP(1) [text] FROM RecursiveNodes WHERE parentid = rn.id)
ELSE NULL
END) AS value
,(CASE
WHEN rn.nodetype = 1 THEN (SELECT TOP(1) [text] FROM RecursiveNodes WHERE parentid = rn.id)
WHEN rn.nodetype = 2 THEN (SELECT TOP(1) [text] FROM RecursiveNodes WHERE parentid = rn.parentid and [text] is not null)
ELSE NULL
END) AS nodecontents
FROM
RecursiveNodes AS rn
WHERE
rn.localname <> '#text'
) AS Result
WHERE
Result.id >= 0
AND (Result.id = 0
OR property IS NOT NULL
OR value IS NOT NULL
OR nodecontents IS NOT NULL)
END
Same procedure except for the output which will produce a CSV table.
CREATE PROCEDURE dbo.ParseXMLtoCSV
@strXML AS XML
,@rootnode NVARCHAR(255)
AS
BEGIN
/*
EXEC dbo.ParseXMLtoCSV
'<AccountDetailsRsp AccountNum="1" AccountStatus="AccountStatus1">
<PlayerInfo PlayerID="1" FirstName="FirstName1" LastName="LastName1">
<AddressList>
<PlayerAddress>
<Address AddressType="primary" City="City1" State="State1" Zip="Zip1"/>
<FutureUse>Example Text1</FutureUse>
<Phone PhoneNumber="PhoneNumber1" PhoneType="Type1" />
<Phone PhoneNumber="PhoneNumber2" PhoneType="Type2" />
</PlayerAddress>
<PlayerAddress>
<Address AddressType="billing" City="City1" State="State1" Zip="Zip1"/>
<FutureUse>Example Text2</FutureUse>
<Phone PhoneNumber="PhoneNumber1" PhoneType="Type1" />
<Phone PhoneNumber="PhoneNumber2" PhoneType="Type2" />
</PlayerAddress>
</AddressList>
</PlayerInfo>
</AccountDetailsRsp>'
,'AccountDetailsRsp'
EXEC dbo.ParseXMLtoCSV
'<items>
<item id="0001" type="Donut">
<name>Cake</name>
<ppu>0.55</ppu>
<batter id="1001">Regular</batter>
<batter id="1002">Chocolate</batter>
<batter id="1003">Blueberry</batter>
<topping id="5001">None</topping>
<topping id="5002">Glazed</topping>
<topping id="5005">Sugar</topping>
<topping id="5006">Sprinkles</topping>
<topping id="5003">Chocolate</topping>
<topping id="5004">Maple</topping>
</item>
</items>'
,'items'
*/
SET NOCOUNT ON
DECLARE
@strText AS NVARCHAR(MAX)
,@idoc INT
,@id INT
,@parentid INT
IF OBJECT_ID('tempdb..#ChildList') IS NOT NULL
DROP TABLE #ChildList
CREATE TABLE #ChildList (
[RowNum] INT IDENTITY(1,1) NOT NULL,
[parentid] INT NULL,
[id] INT NULL,
PRIMARY KEY (RowNum),
UNIQUE (RowNum))
IF OBJECT_ID('tempdb..#NodeList') IS NOT NULL
DROP TABLE #NodeList
CREATE TABLE #NodeList (
[RowNum] INT NOT NULL,
[id] INT NULL,
[parentid] INT NULL,
[nodetype] INT NULL,
[localname] NVARCHAR(MAX) NULL,
[text] NVARCHAR(MAX) NULL,
PRIMARY KEY (RowNum),
UNIQUE (RowNum))
SET @id = 1
SET @parentid = NULL
/* Get rid of tabs and extra spaces */
SET @strText = CAST(@strXML AS NVARCHAR(MAX))
SET @strText =
REPLACE(
REPLACE(
REPLACE(
REPLACE(
@strText
,' ',' '+CHAR(7))
,CHAR(7)+' ','')
,CHAR(7),'')
,CHAR(9),' ')
SET @strXML = CONVERT(XML,@strText)
/* Validate the XML */
EXEC sp_xml_preparedocument @idoc OUTPUT, @strXML
/* Parse the XML data */
;WITH cte
AS (
SELECT
CAST(p1.parentid AS INT) AS parentid
,CAST(p1.id AS INT) AS id
FROM
OPENXML (@idoc,@rootnode,2) AS p1
UNION ALL
SELECT
CAST(p2.parentid AS INT) AS parentid
,CAST(p2.id AS INT) AS id
FROM
OPENXML (@idoc,@rootnode,2) AS p2
JOIN
cte
ON CAST(cte.id AS INT) = CAST(p2.ParentID AS INT)
WHERE
CAST(p2.parentid AS INT) = @parentid
)
INSERT INTO #ChildList
SELECT *
FROM cte
INSERT INTO #NodeList
SELECT
#ChildList.RowNum
,xmllist.id
,xmllist.parentid
,xmllist.nodetype
,xmllist.localname
,CAST(xmllist.[text] AS NVARCHAR(MAX)) AS [text]
FROM #ChildList
INNER JOIN
OPENXML (@idoc,@rootnode,2) AS xmllist
ON #ChildList.id = xmllist.id
WHERE
#ChildList.RowNum > 0
/* Display the results */
;WITH RecursiveNodes(RowNum,id,parentid,nodepath,localname,[text],nodetype)
AS (
SELECT
#NodeList.RowNum
,#NodeList.id
,#NodeList.parentid
,CAST('/' + REPLACE(REPLACE(REPLACE(REPLACE(#NodeList.localname,'&',''),'?',''),' ',''),'.','') AS NVARCHAR(255)) AS nodepath
,#NodeList.localname
,CAST(#NodeList.[text] AS NVARCHAR(MAX)) AS [text]
,0 AS nodetype
FROM #ChildList
INNER JOIN
#NodeList
ON #ChildList.id = #NodeList.id
WHERE
#NodeList.parentid IS NULL
AND #ChildList.RowNum > 0
AND #NodeList.RowNum > 0
UNION ALL
SELECT
n.RowNum
,n.id
,n.parentid
,CAST(r.nodepath + '/'+ REPLACE(REPLACE(REPLACE(REPLACE(n.localname,'&',''),'?',''),' ',''),'.','') AS NVARCHAR(255)) AS nodepath
,n.localname
,n.[text]
,n.nodetype
FROM #NodeList AS n
INNER JOIN
RecursiveNodes AS r
ON n.parentid = r.id
WHERE
n.RowNum > 0
AND r.RowNum > 0
AND n.parentid >= 0
)
SELECT
'"id","parentid","nodepath","nodetype","nodename","property","value","nodecontents"'
AS XMLtoCSV
UNION ALL
SELECT
STUFF(
(SELECT
',"'+CAST(ISNULL(Result.id,'') AS VARCHAR(50))
+'","'+CAST(ISNULL(Result.parentid,'') AS VARCHAR(50))
+'","'+CAST(ISNULL(Result.nodepath,'') AS VARCHAR(50))
+'","'+CAST(ISNULL(Result.nodetype,'') AS VARCHAR(50))
+'","'+CAST(ISNULL(Result.nodename,'') AS VARCHAR(50))
+'","'+CAST(ISNULL(Result.property,'') AS VARCHAR(50))
+'","'+CAST(ISNULL(Result.value,'') AS VARCHAR(50))
+'","'+CAST(ISNULL(Result.nodecontents,'') AS VARCHAR(50))
FOR XML PATH(''))
,1,1,'') AS XMLtoCSV
FROM
(
SELECT
rn.id
,rn.parentid
,rn.nodepath
,(CASE
WHEN rn.nodetype = 0 THEN 'Root'
WHEN rn.nodetype = 1 THEN 'Node'
WHEN rn.nodetype = 2 THEN 'Property'
ELSE 'Data'
END) AS nodetype
,(CASE
WHEN rn.nodetype = 0 THEN rn.localname
WHEN rn.nodetype = 1 THEN rn.localname
WHEN rn.nodetype = 2 THEN (SELECT TOP(1) localname FROM RecursiveNodes WHERE id = rn.parentid)
ELSE NULL
END) AS nodename
,(CASE
WHEN rn.nodetype = 2 THEN rn.localname
ELSE NULL
END) AS property
,(CASE
WHEN rn.nodetype = 2 THEN (SELECT TOP(1) [text] FROM RecursiveNodes WHERE parentid = rn.id)
ELSE NULL
END) AS value
,(CASE
WHEN rn.nodetype = 1 THEN (SELECT TOP(1) [text] FROM RecursiveNodes WHERE parentid = rn.id)
WHEN rn.nodetype = 2 THEN (SELECT TOP(1) [text] FROM RecursiveNodes WHERE parentid = rn.parentid and [text] is not null)
ELSE NULL
END) AS nodecontents
FROM
RecursiveNodes AS rn
WHERE
rn.localname <> '#text'
) AS Result
WHERE
Result.id >= 0
AND (Result.id = 0
OR property IS NOT NULL
OR value IS NOT NULL
OR nodecontents IS NOT NULL)
END
April 27, 2013 at 4:57 am
Thanks!
April 27, 2013 at 4:58 am
Thanks
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply