January 27, 2012 at 1:13 pm
Is there a really easy way to search for a string in an XML column? All the examples I come across seem to require shredding the XML to some extent, such as specifying whether looking within an attribute vs an element, needing to specify nodes , etc..
For example, if I need to determine if a given string exists anywhere within an XML column, I'd like the query to return all the relevant rows -- and it's ok if the XML field is still returned as XML.
Thanks in advance,
--Pete
January 27, 2012 at 1:24 pm
By search for a string, do you mean search for a word within the XML, or search for a substring within the XML?
If words, then Full Text Indexing can work on XML columns in tables. http://msdn.microsoft.com/en-us/library/bb522491.aspx
If substrings, then that's more complex.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
January 29, 2012 at 12:32 pm
HI, i have been working on a task that sounds similar to what you suggest, i will add my code in here on what i did when i get to work tomorrow however, in summery i used a substring function and run it through a while loop to start at the beginning of a string look through and find the first occurance of what i was lookingng for and set a length based of another parameter. after finding the first one it then continues to parse the XML string and find all the other values. My result set ends with a table containing rows of data 1 for each occorancece.
***The first step is always the hardest *******
January 29, 2012 at 1:38 pm
I'm not sure if I understood correctly, but it seems like you're looking for the "exist" method to query the xml data.
Here's a short sample:
DECLARE @tbl TABLE
(
id INT,
col XML
)
INSERT INTO @tbl
VALUES
(1,'<name><name1>5</name1></name>'),(2,'<name>6</name>')
DECLARE @id int
SELECT @id = 6
SELECT *
FROM @tbl t1
WHERE t1.col.exist('//*[text()=sql:variable("@id")]')=1
January 29, 2012 at 2:57 pm
And here's the query to search for a substring:
SELECT *
FROM @tbl t1
WHERE col.exist('//*/text()[contains(.,sql:variable("@id"))]') = 1
Depending on the size of the xml data this may take a while since -at least to my knowledge- "contains" won't benefit from an xml index that might exists.
January 30, 2012 at 2:32 am
as promised here is what i did
declare @start int,
@end int,
@st int
set @start =1
set @st=0
while @start !=0
begin
set @start=(select cast (CHARINDEX('<baseshape ',xml_data,@st) as varchar)
from #xml_table)
set @end=(select cast (CHARINDEX('</BaseShape>',xml_data,@start)-@start+12 as varchar)
from #xml_table)
insert into #xml select SUBSTRING (xml_data,@start,@end)
from #xml_table
set @st=@start+10
end
***The first step is always the hardest *******
January 30, 2012 at 6:59 am
To:
GSquared
AccidentalDBA:)
LutzM
Thanks for all of your replies. I plan on testing them out for better understanding. Ultimately, I'll need to grow my knowledge of XML (which currently stands at near-zero). At the very least, I've come across plenty of articles on the fundamentals of XML. But for the immediate moment, your suggestions should get me to the next point for now.
Thanks again,
Pete
January 30, 2012 at 7:14 am
You're welcome.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
October 22, 2014 at 1:04 am
I think Text Finding may meet your needs. Using this wonderful tool is a good and easy way to search text in all types of files. It is definitely worth a try and it doesnt hurt when you can try their free version too. They provide outstanding support!!
October 31, 2016 at 2:09 pm
I don't understand this.
Using the above example, I changed it to the following:
[Code]
DECLARE @tbl TABLE
(
id INT,
col XML
)
INSERT INTO @tbl
VALUES
(1,'<name><name1>aht</name1></name>'),(2,'<name>6</name>')
DECLARE @id VARCHAR(10)
SELECT @id = 'aht'
SELECT *
FROM @tbl t1
WHERE t1.col.exist('//*[text()=sql:variable("@id")]')=1
[/code]
Which works, but when I do the following, it doesnt.
[Code]
use MSDB
GO
--create a temp table to hold xml for SSIS packages
Create Table #temp (name sysname, ssispackagexml XML)
insert into #temp (name, ssispackagexml)
SELECT [name] AS SSISPackageName
, CONVERT(XML, CONVERT(VARBINARY(MAX), packagedata)) AS SSISPackageXML
FROM msdb.dbo.sysssispackages
WHERE name = 'AHT'
;WITH cte AS (
SELECT
ROW_NUMBER() OVER (PARTITION BY name ORDER BY name) Num ,name
,ssispackagexml
FROM
#temp
)
SELECT * INTO #t2 FROM cte WHERE num = 1
DECLARE @id VARCHAR(10)
SELECT @id = 'aht'
SELECT *
FROM #t2
WHERE #t2.ssispackagexml.exist('//*[text()=sql:variable("@id")]')=1
[/code]
Now I realize, you can't test this, because I didnt include example data, however it works when I use then <name>value</name> syntax, but anything I pull out of MSDB.dbo.ssispackagexml, doesnt work.
Any ideas?
November 1, 2016 at 4:32 am
If your objective is to check whether the XML column contains text without caring where in the XML the text is, why not convert the XML column to a varchar and use LIKE? The below example populates a table @t with some random XML, then uses a plain SELECT to find the XML rows containing 'a':
declare @t table (xmlCol xml);
insert into @t values ('<thing>Apple</thing>');
insert into @t values ('<thing>Banana<thing2 type="fruit"/></thing>');
select * from @t where Convert(varchar(max),xmlCol) like '%a%';
This should be OK unless you have a massive amount of data.
November 1, 2016 at 7:07 pm
Does this help ?
DECLARE @tbl TABLE
(
id INT,
col XML
)
INSERT INTO @tbl
VALUES
(1,'<name><name1>aht</name1></name>'),(2,'<name>6</name>')
select *
from @tbl
where col.value('(/name)[1]','varchar(10)')='aht'
----------------------------------------------------
November 2, 2016 at 1:34 am
Hi
thanks for both your replies.
@mmartin1 I figured out why that doesnt work, its because my tablename is not a tag/component, but part of a string that is located somewhere in the XML.
Im not concerned where it is located, just if it exist in the xml string.
@william Rayer
Problem with casting as varchar, is as far as I can see, it truncates the XML because varchar(max) is not large enough to contain the data.
This is an example directly from the XML.
[Code]
<SQLTask:SqlTaskData xmlns:SQLTask="www.microsoft.com/sqlserver/dts/tasks/sqltask" SQLTask:Connection="{00-00-00-00}" SQLTask:TimeOut="0" SQLTask:IsStoredProc="False" SQLTask:BypassPrepare="True" SQLTask:SqlStmtSourceType="DirectInput" SQLTask:SqlStatementSource="SELECT CONVERT(varchar(25),max([Dato]),120) AS MAXDATO FROM [DB1].[dbo].[AHT] WHERE Org <> ''NSC'' and source = ''OS1''" SQLTask:CodePage="1252" SQLTask:ResultType="ResultSetType_SingleRow"></SQLTask>
[/Code]
Using the above, i cant seem to find the AHT table. because its part of the SQLstatement source.
This is just one example, there are many others, where tablenames are located in SSIS variables, or OLE DB connections or ADO.NET connectionstring etc.
November 2, 2016 at 2:37 am
Locating a string pattern within a string is straight forward, regardless of whether the string is an XML or not, here is a quick example of a search in an XML string.
😎
USE TEEST;
GO
SET NOCOUNT ON;
---
DECLARE @PLAN_XML XML = N'<BatchSequence>
<Batch>
<Statements>
<SearchString SSID="1">Search text</SearchString>
<SearchString SSID="2">Search text</SearchString>
<SearchString SSID="3">Search text</SearchString>
</Statements>
</Batch>
</BatchSequence>';
DECLARE @SEARCH_STRING NVARCHAR(MAX) = N'Search text';
SELECT
(
DATALENGTH(CONVERT(NVARCHAR(MAX),@PLAN_XML,0))
- DATALENGTH(REPLACE(CONVERT(NVARCHAR(MAX),@PLAN_XML,0),@SEARCH_STRING,N''))
) / DATALENGTH(@SEARCH_STRING) AS INSTANCE_COUNT
,CHARINDEX(@SEARCH_STRING,CONVERT(NVARCHAR(MAX),@PLAN_XML,0),1) AS FIRST_POS_CHARINDEX
,PATINDEX(NCHAR(37) + @SEARCH_STRING + NCHAR(37),CONVERT(NVARCHAR(MAX),@PLAN_XML,0)) AS FIRST_POS_PATINDEX
;
Output
INSTANCE_COUNT FIRST_POS_CHARINDEX FIRST_POS_PATINDEX
-------------------- -------------------- --------------------
3 58 58
November 2, 2016 at 7:03 am
Thank you!
it works, right out of the box.
I don't know why I didn't think of this.
I guess I got caught up in the fact that it was XML.
Viewing 15 posts - 1 through 15 (of 25 total)
You must be logged in to reply to this topic. Login to reply