February 22, 2016 at 6:38 am
Hi,
Trying to get a structured XML extract from a dataset, some of the columns have the "&" sign within the text. I have been looking into this and have been told that the best way to to prefix the field with CDDATA -
ie. <![CDATA[<sender>John & Smith</sender>]]>
The data in question is;
<Rows>
<Row>
<RecordType>SA</RecordType>
<Customer>NOV001</Customer>
<Branch>ME</Branch>
<TrnYear>2014</TrnYear>
<TrnMonth>1</TrnMonth>
</NewBusinessFlag>
<ProductClass>IHSP</ProductClass>
<ProductClassDescription>INHOUSE SPONGES & PUFFS</ProductClassDescription>
</Row>
</Rows>
I am using the following SQLCMD Command to do this;
sqlcmd -S . -d server -E -s"|" -W -i ActualSales.sql | findstr /V /C:"-" /B > ActualSales.csv
I would like it to display as follows with the TEXT to be within CDDATA;
<Rows>
<Row>
<RecordType>SA</RecordType>
<Customer>NOV001</Customer>
<Branch>ME</Branch>
<TrnYear>2014</TrnYear>
<TrnMonth>1</TrnMonth>
</NewBusinessFlag>
<ProductClass>IHSP</ProductClass>
<![CDDATA[<ProductClassDescription>INHOUSE SPONGES & PUFFS</ProductClassDescription>]]>
</Row>
</Rows>
________________________________________________________________________________________________
Regards
Steve
SQL 2008 DBA/DBD - MCTS/MCITP
Please don't trust me, test the solutions I give you before using them.
February 22, 2016 at 7:11 am
I noticed a few things missing from your options when trying to run this locally:
sqlcmd -S ".\sql2012" -d "tempdb" -E -s"|" -w 65535 -y 0 -i "C:\@\ActualSales.sql" -h -1 | findstr /V /C:"-" /B > "C:\@\ActualSales.csv"
Added -w, -y and -h, removed -W.
In the end, what SQL Server is delivering with the & in the ProductClassDescription value is valid XML. Why the requirement to use a CDATA tag? The two are semantically equivalent to any XML parser worth its salt.
edit: fix ampersand for this forum
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
February 22, 2016 at 7:12 am
Steve i know you can easily get the data with the reserved characters escaped out wiht ampersand amp semicolon, would that be an alternative?
it depends on your data, but this escaped it nicely for me:
IF OBJECT_ID('tempdb.[dbo].[#TheData]') IS NOT NULL
DROP TABLE [dbo].[#TheData]
GO
CREATE TABLE [dbo].[#TheData] (
[ID] INT IDENTITY(1,1) NOT NULL,
[RecordType] VARCHAR(30) NULL,
[Customer] VARCHAR(30) NULL,
[Branch] VARCHAR(30) NULL,
[Transactiondate] DATE NULL,
[NewBusinessFlag] BIT NULL,
[ProductClass] VARCHAR(30) NULL,
[ProductClassDescription] VARCHAR(50) NULL)
INSERT INTO #TheData
SELECT 'SA','NOV001','ME','2014-01-01',0,'IHSP','INHOUSE SPONGES & PUFFS' UNION ALL
SELECT 'SA','NOV001','ME','2015-01-01',0,'JHSP','Bed, Bath & Beyond'
SELECT [RecordType],
[Customer],
[Branch],
YEAR([Transactiondate]) AS TrnYear,
MONTH([Transactiondate]) AS TrnMonth,[NewBusinessFlag],[ProductClass],[ProductClassDescription]
FROM #TheData FOR XML PATH,Elements
Lowell
February 22, 2016 at 8:16 am
Hi Orlando,
You mentioned that I have some options missing - however, the options that you have mentioned are for screen width ( -w, -y )
The extract using SQLCMD and the options I have selected work fine - except that I wanted to enclose the text fields within "CDDATA"
The file is being processed externally and that is what they have asked for.... I am just a developer !
Thanks
Steve
Orlando Colamatteo (2/22/2016)
I noticed a few things missing from your options when trying to run this locally:sqlcmd -S ".\sql2012" -d "tempdb" -E -s"|" -w 65535 -y 0 -i "C:\@\ActualSales.sql" -h -1 | findstr /V /C:"-" /B > "C:\@\ActualSales.csv"
Added -w, -y and -h, removed -W.
In the end, what SQL Server is delivering with the & in the ProductClassDescription value is valid XML. Why the requirement to use a CDATA tag? The two are semantically equivalent to any XML parser worth its salt.
edit: fix ampersand for this forum
________________________________________________________________________________________________
Regards
Steve
SQL 2008 DBA/DBD - MCTS/MCITP
Please don't trust me, test the solutions I give you before using them.
February 22, 2016 at 8:30 am
SteveEClarke (2/22/2016)
Hi Orlando,You mentioned that I have some options missing - however, the options that you have mentioned are for screen width ( -w, -y )
Without the -y the xml is truncated in the file, at least on my machine.
The extract using SQLCMD and the options I have selected work fine - except that I wanted to enclose the text fields within "CDDATA"
The file is being processed externally and that is what they have asked for.... I am just a developer !
Thanks
Steve
Unless you do some post-processing or write an XQuery to get the same XML out of the column you already have, except formatting explicitly to get the CDATA, then you have what you have from SQL Server with the escaped ampersand.
To me the potentially shortest path here is to go back to the data consumer. It's not as if what SQL Server is delivering is invalid XML. What they are asking for here amounts to formatting, not substance. Maybe make the case that what you can provide out of the box is valid XML. If they say they need the CDATA just quantify the extra effort and make it happen. You could use some basic string manipulation if the structure is predictable and stable (that's risky though, but quick) or you could write some XQuery with FOR XML EXPLICIT to shape the XML exactly how they want it although it might take some effort to learn those techniques and get it just right.
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
February 22, 2016 at 9:05 am
Thanks for your advise.
I will return back to them - I thought I had missed a trick somewhere - but glad I was covered.
Regards
Steve
________________________________________________________________________________________________
Regards
Steve
SQL 2008 DBA/DBD - MCTS/MCITP
Please don't trust me, test the solutions I give you before using them.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply