August 23, 2016 at 7:47 am
Hello All,
Datatype: XML
With FOR XML a large string is build from tablecontent.
The result is stored as a XML datatype in a table.
Now I want to do some manipulations on the resulting string.
So I convert the XML string to a VARCHAR(MAX) datatype.
But I do not get the 'normal' EOL symbols. What should I do to get the normal EOL symbols (char(13)+char(10)) in the resulting string?
The XML can be a very large 'string'.
Any handy solutions for the EOL symbols ?
Ben
August 23, 2016 at 11:22 am
Yes, use a different approach. Either shred the XML and make your adjustments on the shredded data or use .modify() to modify your XML.
Since you haven't provided sample data or expected results, it's hard to give anything specific.
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
August 23, 2016 at 11:41 am
With FOR XML a large string is build from tablecontent.
The result is stored as a XML datatype in a table.
What is the purpose of this?
If the primary purpose is to do something that involves string analysis/manipulation (or anything where performance is important) then this is not the way to go.
That said, as Drew mentioned, there's not much help to provide without some sample data.
-- Itzik Ben-Gan 2001
August 23, 2016 at 11:45 am
ben.brugman (8/23/2016)
Now I want to do some manipulations on the resulting string.So I convert the XML string to a VARCHAR(MAX) datatype.
Why?
If you want to shred it, why not use XPath on the XML value?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 23, 2016 at 12:32 pm
Thanks all,
This is done from my tablet and therefore I have no code of example handy.
XML
is not the purpose and not a target it is only the means to get to something.
So why FOR XML? The reason for this is that FOR XML concatenates strings faster than any other method I have tried. So it is only used for the speed of concatenation.
What I concatenate is only all fields of a table so that it forms a proper CSV string/file.
After which I write the CSV string to a file.
The rows are seperated by a char(13)/char(10), at least that is the target.
Fields are seperated with a semicolon.
And all fields are converted to strings and are qouted with a textqualifier.
With larger datasets, FOR XML works so much faster than the other concatenates I have tried.
I'll have to look up another thread on the forum which shows this.
Because the actual statement which creates the XML is 'generated', it can handle any table. I do not know if it can handly any size. Up to now I have only handled examples of up to 500 000 chars. About 6000 rows/lines of 20 fields, most of them empty.
I'll try to post the Generated code for creating the string for the data.
After the string with data is created (as an XML type), I add some strings in front which contain the headers and some extra information, like the generation data and some meta. So the end result is a properly formed CSV files with some extra lines added in the front.
If I am on the wrong track, please point me in the right direction.
Thanks allready for your time.
I'll post an example of generated code.
Ben
August 23, 2016 at 12:47 pm
To answer your question about how (not getting involved in why)
declare @x xml;
select @x=(
select ''+b
from
(
select 'hello;friend'+CHAR(13)+CHAR(10)
union all
select 'my;world'+CHAR(13)+CHAR(10)
)a(b)
for xml path(''),type);
select @x.value('(./text())[1]','nvarchar(max)')
the important things are to use ",type" in the FOR XML PATH... and to extract the ".value" of the xml, not just convert.
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
August 23, 2016 at 1:10 pm
mister.magoo (8/23/2016)
To answer your question about how (not getting involved in why)
Thanks mister magoo,
I can not confirm yet if this works, so I have to try that a bit later.
I'll try this and if this works will mark your answer as a solution.
But this might take a few days. (Sorry).
For the others, I'll come up with an example which was already on this forum.
Again thank you,
Ben
August 23, 2016 at 2:24 pm
mister.magoo (8/23/2016)
To answer your question about how (not getting involved in why)the important things are to use ",type" in the FOR XML PATH... and to extract the ".value" of the xml, not just convert.
In the script below is the before and the after solution.
In the first script the result is written as an XML into a table and then the result is used further on.
In mister magoo's script, the result is in a variable.
Mister magoo's script is more elegant. Thank you.
The code is generated (now edited as wel), the column_names are the names of the table_columns and a header and some meta data must be prepared to go in front of the string.
Maybe with this I am on the wrong track and there are better/easier/faster solutions, if there are better solutions please point them out to me.
FOR XML works realy fast, but I would prefere to do more simple concatenation.
Thanks for your help.
At the moment I am preparing a sample table with data that can be used. (takes a bit of time).
Ben
The data is private. It is data from a Libre bloodglucose meter.
I have not provided the actual data. I have a number of different 'datasets', for example also from my insulinepump.
---------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------
-- Version : before.
exec sp_drop CSV_teststring_TEMP -- drops the table if it exists.
select top 1
DeString
into CSV_teststring_TEMP
from libre_data p -- Table used a a dummy because a direct write into a table from a FOR XML is not allowed.
OUTER APPLY (
select (
select
'"'+replace(convert(varchar(100),[li72_Master_id] ),'"','""')+'";'+
'"'+replace(convert(varchar(100),[li72_ID] ),'"','""')+'";'+
'"'+replace(convert(varchar(100),[li72_Tijd] ),'"','""')+'";'+
'"'+replace(convert(varchar(100),[li72_Type vastlegging] ),'"','""')+'";'+
'"'+replace(convert(varchar(100),[li72_Historie glucose (mmol/L)] ),'"','""')+'";'+
'"'+replace(convert(varchar(100),[li72_Scan glucose (mmol/L)] ),'"','""')+'";'+
'"'+replace(convert(varchar(100),[li72_Niet-numeriek snelwerkende insuline] ),'"','""')+'";'+
'"'+replace(convert(varchar(100),[li72_Snelwerkende insuline (eenheden)] ),'"','""')+'";'+
'"'+replace(convert(varchar(100),[li72_Niet-numeriek voedsel] ),'"','""')+'";'+
'"'+replace(convert(varchar(100),[li72_Koolhydraten (gram)] ),'"','""')+'";'+
'"'+replace(convert(varchar(100),[li72_Niet-numeriek langwerkende insuline] ),'"','""')+'";'+
'"'+replace(convert(varchar(100),[li72_Langwerkende insuline (eenheden)] ),'"','""')+'";'+
'"'+replace(convert(varchar(100),[li72_Notities] ),'"','""')+'";'+
'"'+replace(convert(varchar(100),[li72_Strip glucose (mmol/L)] ),'"','""')+'";'+
'"'+replace(convert(varchar(100),[li72_Keton (mmol/L)] ),'"','""')+'";'+
'"'+replace(convert(varchar(100),[li72_Maaltijdinsuline (eenheden)] ),'"','""')+'";'+
'"'+replace(convert(varchar(100),[li72_Correctieinsuline (eenheden)] ),'"','""')+'";'+
'"'+replace(convert(varchar(100),[li72_Gebruikerswijziging insuline (eenheden)] ),'"','""')+'";'+
'"'+replace(convert(varchar(100),[li72_Vorige tijd] ),'"','""')+'";'+
'"'+replace(convert(varchar(100),[li72_Bijgewerkte tijd] ),'"','""')+'"'+char(13)+char(10)
from libre_data
for xml path(''), type) as destring
) xxxxx
select * from CSV_teststring_TEMP
declare @ps varchar(max)
select @ps = convert(varchar(max),destring) from CSV_teststring_TEMP
set @ps = master.dbo.REPLACE3(@ps,' ',char(13))
--
-- The procedure sp_print shows the complete result
--
exec sp_print @ps
---------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------
-- Version adapted to mister magoo
declare @x xml;
select @x=(
select
'"'+replace(convert(varchar(500),[li72_Master_id] ),'"','""')+'";'+
'"'+replace(convert(varchar(500),[li72_ID] ),'"','""')+'";'+
'"'+replace(convert(varchar(500),[li72_Tijd] ),'"','""')+'";'+
'"'+replace(convert(varchar(500),[li72_Type vastlegging] ),'"','""')+'";'+
'"'+replace(convert(varchar(500),[li72_Historie glucose (mmol/L)] ),'"','""')+'";'+
'"'+replace(convert(varchar(500),[li72_Scan glucose (mmol/L)] ),'"','""')+'";'+
'"'+replace(convert(varchar(500),[li72_Niet-numeriek snelwerkende insuline] ),'"','""')+'";'+
'"'+replace(convert(varchar(500),[li72_Snelwerkende insuline (eenheden)] ),'"','""')+'";'+
'"'+replace(convert(varchar(500),[li72_Niet-numeriek voedsel] ),'"','""')+'";'+
'"'+replace(convert(varchar(500),[li72_Koolhydraten (gram)] ),'"','""')+'";'+
'"'+replace(convert(varchar(500),[li72_Niet-numeriek langwerkende insuline] ),'"','""')+'";'+
'"'+replace(convert(varchar(500),[li72_Langwerkende insuline (eenheden)] ),'"','""')+'";'+
'"'+replace(convert(varchar(500),[li72_Notities] ),'"','""')+'";'+
'"'+replace(convert(varchar(500),[li72_Strip glucose (mmol/L)] ),'"','""')+'";'+
'"'+replace(convert(varchar(500),[li72_Keton (mmol/L)] ),'"','""')+'";'+
'"'+replace(convert(varchar(500),[li72_Maaltijdinsuline (eenheden)] ),'"','""')+'";'+
'"'+replace(convert(varchar(500),[li72_Correctieinsuline (eenheden)] ),'"','""')+'";'+
'"'+replace(convert(varchar(500),[li72_Gebruikerswijziging insuline (eenheden)] ),'"','""')+'";'+
'"'+replace(convert(varchar(500),[li72_Vorige tijd] ),'"','""')+'";'+
'"'+replace(convert(varchar(500),[li72_Bijgewerkte tijd] ),'"','""')+'"'+char(13)+char(10) from libre_data
for xml path(''), type);
Declare @ls varchar(MAX)
Select @ls = @x.value('(./text())[1]','nvarchar(max)')
--
-- The procedure sp_print shows the complete result
--
exec sp_print @ls
---------------------------------------------------------------------------------------------------------------------
August 23, 2016 at 3:06 pm
Here is some sample data.
This data is adapted from:
http://www.sqlservercentral.com/Forums/Topic1744128-391-1.aspx#bm1744270
2015-10-12 7:09:13 AM (This is shown at my location).
First part of the data generates 40 000 words.
The words are quoted with a double qoute.
Then every tenth word, is split into two words with a <CR><LF> in between.
Then there are two concatenation scripts.
The first with a FOR XML.This one has the speed I want to have.
The second with just concatenation. This one gives the result I want to have.
I would like a solution without the FOR XML, is there one ?
Thanks, (I hope I didn't make typo's)
Ben
-- Generate a Dictionary with a large number of words.
;
WITH
L0 AS(SELECT 0 AS c UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0), -- 4
L1 AS(select 0 as x from L0 A, L0 B, L0 C, L0 D), -- 4 ^4 = 256
L2 AS(select 0 as x from L1 A, L1 B, L1 C, L1 D), -- (4 ^ 4) ^4 = 4 Giga
L9 AS(Select row_number() OVER(PARTITION BY 1 order by x ) as P from L2)
select top 40000 'word_'+convert(varchar(30),P) Word, P into Dictionary from L9
Update Dictionary set Word = '"'+word+'"'
Update Dictionary set Word = word+CHAR(13)+CHAR(10)+replace(word,'word','again') where P%10 = 0
select * from Dictionary
-- Selection with a FOR xml construction
declare @starttime datetime = getdate()
select Word+',' +' ' from Dictionary order by P asc FOR XML PATH('')
select convert(float,GETDATE()-@starttime)*24*60*60 --0.24333 seconds for 40 000
-- Selection with a concatenation construction
Set @starttime = getdate()
declare @Dict varchar(max)=''
select @Dict=@Dict +Word+',' +' ' from Dictionary order by P asc
select convert(float,GETDATE()-@starttime)*24*60*60 -- 81.1 seconds for 40 000
select @Dict
exec sp_print @dict -- sp_print shows the complete string.
drop table Dictionary
In this example the words are qouted in the table before the concatenation. This was more convenient, especially with creating a <CR><LF> between the words. In an actual script the fields are adjusted in line while concatenating.
August 24, 2016 at 8:49 am
ben.brugman (8/23/2016)
XMLis not the purpose and not a target it is only the means to get to something.
So why FOR XML? The reason for this is that FOR XML concatenates strings faster than any other method I have tried. So it is only used for the speed of concatenation.
...
If I am on the wrong track, please point me in the right direction.
I had asked why you were creating/storing this data as XML because it's expensive. What you're saying makes perfect sense - the FOR XML PATH method for concatenating strings is absolutely the fastest way of doing it.
If you're doing it the way I think you're doing it, however, I think there's room for improvement (and please correct me if I'm mistaken). When using the FOR XML PATH technique, you don't have to store the text as XML, you can store them as text. Consider the queries below:
DECLARE @table1 TABLE(txtID int, txt varchar(100));
DECLARE @XMLcol TABLE(txtID int, txt xml);
DECLARE @StringCol TABLE(txtID int, txt varchar(100));
INSERT @table1 VALUES (1,'xxx'),(1,'yyy'),(2,'zzz'),(2,'abc'),(2,'999');
-- How I think you're storing your text
INSERT @XMLcol
SELECT
txtID, csv =
(
SELECT txt+CHAR(13)+CHAR(10)
FROM @table1 tb
WHERE ta.txtID = tb.txtID
FOR XML PATH(''), TYPE
)
FROM @table1 ta
GROUP BY txtID;
-- How you should store your text
INSERT @StringCol
SELECT
txtID, csv =
(
SELECT txt+CHAR(13)+CHAR(10)
FROM @table1 tb
WHERE ta.txtID = tb.txtID
FOR XML PATH(''), TYPE
).value('./text()[1]','varchar(8000)')
FROM @table1 ta
GROUP BY txtID;
SELECT * FROM @XMLcol; -- stored as XML
SELECT * FROM @StringCol; -- stored as text
storing the text as text (varchar or nvarchar) is the way to go if you're not doing that.
Note this article by Wayne Sheffield about this XML concatenation technique, he does a good job explaining what's going on under the hood. Creating a comma-separated list (SQL Spackle)[/url]
-- Itzik Ben-Gan 2001
August 24, 2016 at 4:03 pm
mister.magoo (8/23/2016)
To answer your question about how (not getting involved in why)
declare @x xml;
select @x=(
select ''+b
from
(
select 'hello;friend'+CHAR(13)+CHAR(10)
union all
select 'my;world'+CHAR(13)+CHAR(10)
)a(b)
for xml path(''),type);
select @x.value('(./text())[1]','nvarchar(max)')
the important things are to use ",type" in the FOR XML PATH... and to extract the ".value" of the xml, not just convert.
Hi,
May I ask you - what is the importance of ",type"?
I removed from the script and it returned exactly the same result.
Probably it's required for more complicated cases?
_____________
Code for TallyGenerator
August 24, 2016 at 4:16 pm
Alan.B (8/24/2016)
..........storing the text as text (varchar or nvarchar) is the way to go if you're not doing that.
Note this article by Wayne Sheffield about this XML concatenation technique, he does a good job explaining what's going on under the hood. Creating a comma-separated list (SQL Spackle)[/url]
Great, yes super, should have asked the question before I started. Would have saved me some time.
But then I would have missed the fun in building the solution.I got to the solution, but these improvements makes it a lot better.
And thanks for confirming that FOR XML is the way to go for concatenation, even if it has nothing to do with XML. This keeps bothering me a bit, because it looks a bit like a work around solution.
The article I still have to 'digest' it, is exactly what I was looking for.
Great help, great solution,
Thanks,
Ben
August 24, 2016 at 5:36 pm
Sergiy (8/24/2016)
mister.magoo (8/23/2016)
To answer your question about how (not getting involved in why)
declare @x xml;
select @x=(
select ''+b
from
(
select 'hello;friend'+CHAR(13)+CHAR(10)
union all
select 'my;world'+CHAR(13)+CHAR(10)
)a(b)
for xml path(''),type);
select @x.value('(./text())[1]','nvarchar(max)')
the important things are to use ",type" in the FOR XML PATH... and to extract the ".value" of the xml, not just convert.
Hi,
May I ask you - what is the importance of ",type"?
I removed from the script and it returned exactly the same result.
Probably it's required for more complicated cases?
Hi Sergiy,
You know, I thought it was necessary for the correct handling of special characters, but you are right - it works without, at least it does on 2016.
I'm pretty sure there are cases where it is needed, but maybe not this one.
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
August 24, 2016 at 7:53 pm
mister.magoo (8/24/2016)
Sergiy (8/24/2016)
mister.magoo (8/23/2016)
To answer your question about how (not getting involved in why)
declare @x xml;
select @x=(
select ''+b
from
(
select 'hello;friend'+CHAR(13)+CHAR(10)
union all
select 'my;world'+CHAR(13)+CHAR(10)
)a(b)
for xml path(''),type);
select @x.value('(./text())[1]','nvarchar(max)')
the important things are to use ",type" in the FOR XML PATH... and to extract the ".value" of the xml, not just convert.
Hi,
May I ask you - what is the importance of ",type"?
I removed from the script and it returned exactly the same result.
Probably it's required for more complicated cases?
Hi Sergiy,
You know, I thought it was necessary for the correct handling of special characters, but you are right - it works without, at least it does on 2016.
I'm pretty sure there are cases where it is needed, but maybe not this one.
The TYPE directive is required if you're embedding multiple FOR XML statements within each other. The inner ones need that to resolve correctly.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
August 24, 2016 at 9:09 pm
Matt Miller (#4) (8/24/2016)
mister.magoo (8/24/2016)
Sergiy (8/24/2016)
mister.magoo (8/23/2016)
To answer your question about how (not getting involved in why)
declare @x xml;
select @x=(
select ''+b
from
(
select 'hello;friend'+CHAR(13)+CHAR(10)
union all
select 'my;world'+CHAR(13)+CHAR(10)
)a(b)
for xml path(''),type);
select @x.value('(./text())[1]','nvarchar(max)')
the important things are to use ",type" in the FOR XML PATH... and to extract the ".value" of the xml, not just convert.
Hi,
May I ask you - what is the importance of ",type"?
I removed from the script and it returned exactly the same result.
Probably it's required for more complicated cases?
Hi Sergiy,
You know, I thought it was necessary for the correct handling of special characters, but you are right - it works without, at least it does on 2016.
I'm pretty sure there are cases where it is needed, but maybe not this one.
The TYPE directive is required if you're embedding multiple FOR XML statements within each other. The inner ones need that to resolve correctly.
Just to expand on this (I began my explanation before Matt's reply). From BOL - TYPE Directive in FOR XML Queries:
SQL Server support for the xml (Transact-SQL) enables you to optionally request that the result of a FOR XML query be returned as xml data type by specifying the TYPE directive.
The VALUE method is what preserves the special XML characters. That's why this works:
declare @x xml;
select @x=
(
select ','+b
from
(
select 'hello;friend<&>'+CHAR(13)+CHAR(10)
union all
select 'my;world'+CHAR(13)+CHAR(10)
)a(b)
for xml path('')
);
select @x.value('.','nvarchar(max)');
Magoo's query preserves the special XML characters because of the VALUE method. There's no need for the TYPE directive because @x was declared XML;
In this scenario the optimizer just ignores it because there's no reason to convert XML to XML.
You cannot use the VALUE method, however, on a non-XML data type. Note what happens if you run Magoo's query with @x declared as varchar(<whatever>)
Msg 258, Level 15, State 1, Line 27
Cannot call methods on varchar.
With all that in mind, let's look at this query that concatinates the @table1.txt column which is varchar(100):
DECLARE @table1 TABLE(txtID int, txt varchar(100));
DECLARE @StringCol TABLE(txtID int, txt varchar(100));
INSERT @table1 VALUES (1,'xxx'),(1,'<<yyy>>'),(2,'zzz'),(2,'abc'),(2,'999');
-- Note: NO VALUE method
SELECT txtID, csv =
(
SELECT ','+txt
FROM @table1 tb
WHERE ta.txtID = tb.txtID
FOR XML PATH('')
)
FROM @table1 ta
GROUP BY txtID;
If you run the above query, the results are all jacked up. Ok, fine, let's add the VALUE method to remove the special XML characters.
DECLARE @table1 TABLE(txtID int, txt varchar(100));
DECLARE @StringCol TABLE(txtID int, txt varchar(100));
INSERT @table1 VALUES (1,'xxx'),(1,'<<yyy>>'),(2,'zzz'),(2,'abc'),(2,'999');
SELECT txtID, csv =
(
SELECT ','+txt
FROM @table1 tb
WHERE ta.txtID = tb.txtID
FOR XML PATH('')
).value('.','varchar(8000)')
FROM @table1 ta
GROUP BY txtID;
Msg 258, Level 15, State 1, Line 77
Cannot call methods on nvarchar(max).
Fine! We can convert the subquery to XML or use the more convenient TYPE Directive. Both of these produce the same execution plan but TYPE is simpler.
DECLARE @table1 TABLE(txtID int, txt varchar(100));
DECLARE @StringCol TABLE(txtID int, txt varchar(100));
INSERT @table1 VALUES (1,'xxx'),(1,'<<yyy>>'),(2,'zzz'),(2,'abc'),(2,'999');
SELECT txtID, csv =
CONVERT
(
XML,
(
SELECT ','+txt
FROM @table1 tb
WHERE ta.txtID = tb.txtID
FOR XML PATH('')
)
).value('.','varchar(8000)')
FROM @table1 ta
GROUP BY txtID;
SELECT txtID, csv =
(
SELECT ','+txt
FROM @table1 tb
WHERE ta.txtID = tb.txtID
FOR XML PATH(''), TYPE
).value('.','varchar(8000)')
FROM @table1 ta
GROUP BY txtID;
Hopefully that clears things up. 😛
-- Itzik Ben-Gan 2001
Viewing 15 posts - 1 through 15 (of 18 total)
You must be logged in to reply to this topic. Login to reply