December 17, 2010 at 1:24 pm
Here's the scenario.
I have a 10000 record table where each row contains a large amount of text. I want to do a PATINDEX search to determine where a specific word occurs (let's say "dog" as an example). Since the patindex command only finds the first instance, I want to iterate using substrings to keep finding additional hits.
Right now this is the method I know how to do it (there may be a few little details missing, its mainly to show my thought process):
1) Create a field that contains my current patindex position.
2) Run the patindex command, update the table with the current patindex position for each row.
3) Store the patindex position in another table (or wherever, not really important for this exercise).
4) Create a dynamic sql query to do a substring on the text, starting at the current patindex position +1.
5) Run another patindex command on this new substring. Update the current patindex position with the new position.
6) Rinse and Repeat.
This method will work, but I was wondering if anyone had a cleaner and better way to do this (perhaps that didn't involve dynamic sql).
December 19, 2010 at 1:24 am
First things that pop in to my head:
1. CTE to identify occurrences of the desired string part
2. Table valued function using the CTE and then using CROSS APPLY
Perhaps others have better ideas?
December 21, 2010 at 11:28 pm
I'm dreaming up a solution using tally tables... What is your desired output? Number of occurrences, or indexes of occurrences?
December 22, 2010 at 12:02 am
Ok here we go...
You'll need to build a tally table. Jeff Moden has an excellent article about tally tables here[/url].
After you have a tally table, create a test table with test data: (or just plug in your own table I suppose)
select top 10000 identity(int,1,1) as id, Dog
into DogTable
from
(
select 'dogcatdogdogbirdape' as Dog
union all
select 'fishbirdnotadogoopsthatwasadog'
union all
select 'caninecanislupisdomesticus'
)x
cross join
(select sc1.name from master.dbo.syscolumns sc1
cross join
master.dbo.syscolumns sc2
) y
Now for the solution:
select id, count(id)
from
(
select d.id, d.dog, substring(d.dog, case when t.N > len(d.dog) then len(d.dog) else t.n end, 3) as test, n as foundindex
from dogtable d
cross join
tally t
where t.n < (select max(len(dog)) from dogtable)
and substring(d.dog, case when t.N > len(d.dog) then len(d.dog) else t.n end, 3) = 'dog'
)x
group by id
Run the inner view if you just want a list of indexes of the tested string.
This could be easily turned into a table valued function, just read the length of your input string into the last parameter of substring
August 5, 2013 at 12:38 pm
Sorry for necro'ing an old thread, but I'm trying to do something similar. The difference is that I'm looking for multiple strings (matching a pattern) of varying lengths. I've tried to modify the "dog" query, but am falling short of accomplishing what I want to accomplish. Can anyone help shed light?
FYI, I'm trying to search through stored procs and other objects for instances of linked servers, which follow a standard of "LS-%".
select object_name(s.id) Object, substring(s.text, n, charindex(' ', text, n) - n) as Target
from syscomments s
cross join
tally t
where t.n < (select max(len(text)) from syscomments)
and lower(substring(s.text, case when t.N > len(s.text) then len(s.text) else t.n end, case when t.N > len(s.text) then 5 else len(s.text) - n end)) like 'ls-__'
order by object, target
August 13, 2013 at 3:13 pm
That code's written for SQL 2000; I suggest completely ignoring it.
Instead, the code below should give you a list of object names and their full text reasonably quickly.
If you want to add in the part about showing the specific text that matches, let me know. Just be aware that'll be slower -- perhaps much slower?!
IF OBJECT_ID('tempdb..#search_strings') IS NOT NULL
DROP TABLE #search_strings
CREATE TABLE #search_strings (
search_id int IDENTITY(1, 1) NOT NULL,
search_string nvarchar(40) NOT NULL
)
INSERT INTO #search_strings
SELECT 'linked_server_name1'
--UNION ALL SELECT 'linked_server_name2'
--UNION ALL ...
SELECT
OBJECT_NAME(object_id) AS Object_Name,
search_string,
(SELECT definition FROM sys.sql_modules WHERE object_id = derived.object_id) AS Object_Text
FROM (
SELECT DISTINCT
sm.object_id,
ss.search_string
FROM sys.sql_modules sm
CROSS JOIN #search_strings AS ss
WHERE
CAST(OBJECTPROPERTYEX(sm.object_id, 'IsProcedure') AS int) = 1 AND
sm.definition LIKE '%' + ss.search_string + '%'
) AS derived
ORDER BY
Object_Name
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
August 14, 2013 at 11:56 am
Thanks for the code! It works great if I know what I'm searching for, but unfortunately to be useful in my current situation it would have to return the actual match. So I could search for "ls-" and it would return "ls-app1" and "ls-app2", etc.
August 14, 2013 at 12:56 pm
The code uses LIKE to compare the string, so just construct your string accordingly.
For example, you could use 'ls-' as the search string. Or, if you want a closer match, then:
ls-app[0-9]
or something similar.
I'll add code to see the actual part(s) of the text that match if you want.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
August 14, 2013 at 9:38 pm
ScottPletcher (8/13/2013)
That code's written for SQL 2000; I suggest completely ignoring it.
I wouldn't ignore it because it was written for SQL Server 2000... I'd ignore it because it solves the wrong problem for the latest request.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 21, 2013 at 11:20 am
ScottPletcher (8/14/2013)
The code uses LIKE to compare the string, so just construct your string accordingly.<snip>
I'll add code to see the actual part(s) of the text that match if you want.</snip>
Yes, please!
I don't even know where to begin for that...
August 21, 2013 at 12:20 pm
CleverSQLUserID (8/21/2013)
ScottPletcher (8/14/2013)
The code uses LIKE to compare the string, so just construct your string accordingly.<snip>
I'll add code to see the actual part(s) of the text that match if you want.</snip>
Yes, please!
I don't even know where to begin for that...
Here's an example; naturally change the "backup" bytes and the total bytes listed to whatever you prefer:
IF OBJECT_ID('tempdb..#search_strings') IS NOT NULL
DROP TABLE #search_strings
CREATE TABLE #search_strings (
search_id int IDENTITY(1, 1) NOT NULL,
search_string nvarchar(40) NOT NULL
)
INSERT INTO #search_strings
SELECT 'linked_server_name1'
--UNION ALL SELECT 'linked_server_name2'
--UNION ALL ...
SELECT
OBJECT_NAME(object_id) AS Object_Name,
search_string,
--list 100 bytes of the definition, starting 40 bytes before the string itself was found
(SELECT SUBSTRING(definition, derived.string_location - 40, LEN(search_string) + 100) FROM sys.sql_modules WHERE object_id = derived.object_id) AS Object_Text
FROM (
SELECT
sm.object_id,
ss.search_string,
PATINDEX('%' + ss.search_string + '%', sm.definition) AS string_location
FROM sys.sql_modules sm
CROSS JOIN #search_strings AS ss
WHERE
CAST(OBJECTPROPERTYEX(sm.object_id, 'IsProcedure') AS int) = 1 AND
sm.definition LIKE '%' + ss.search_string + '%'
) AS derived
ORDER BY
Object_Name
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
August 21, 2013 at 10:11 pm
CleverSQLUserID (8/5/2013)
Sorry for necro'ing an old thread, but I'm trying to do something similar. The difference is that I'm looking for multiple strings (matching a pattern) of varying lengths. I've tried to modify the "dog" query, but am falling short of accomplishing what I want to accomplish. Can anyone help shed light?FYI, I'm trying to search through stored procs and other objects for instances of linked servers, which follow a standard of "LS-%".
select object_name(s.id) Object, substring(s.text, n, charindex(' ', text, n) - n) as Target
from syscomments s
cross join
tally t
where t.n < (select max(len(text)) from syscomments)
and lower(substring(s.text, case when t.N > len(s.text) then len(s.text) else t.n end, case when t.N > len(s.text) then 5 else len(s.text) - n end)) like 'ls-__'
order by object, target
Heh... I guess I'm necro'ing a necro'd thread but let me ask... are you all set on this?
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply