March 6, 2009 at 5:14 pm
I am trying to code in TSQL (without using the full-text indexing engine) a way to search fields for all words no matter the order.
I did the left to right all word search using like key word. So e.g. if a user supplies 'civil war' i simply replace the beginning, end and space with the % sign and the correct results are returned. It also works with 'civ wa' again because it is from left to right.
Ok, so now i am looking for a way to search for all words, no matter the order. So if the user supplies me with 'war civil' I still want to pull out the same rows as 'civil war'. My thoughts would be to parse the words into a temp table then use a for next loop to search the search fileds. If a match is found, mark the word and move to the next mark the word when found and move to the next etc.. until all words are found.
For a huge database, and a lot of users, this would be very time consuming. Yes i plan on placing indexes on the search fields, but has anyone already done this? I am googling but cannot find something using just basic TSQL...full-text yes, but i don't want to have to use the full text capabilities. KISS comes to mind.
Many thanks for any help.
March 6, 2009 at 7:43 pm
I have never before attempted this feat without a net, but my wife is asleep, so I'll take a shot at it. You DO know this is going to do a table scan, don't you? How many words are we talking about in your search argument?
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
March 6, 2009 at 8:10 pm
i just had a very similar question in a different thread...it was to scan all columns in a table where for all three search terms occurs
basically, i just did a REPLACE on the space....to make the WHERE statement , so you could then execute a string;
this works for any number of search terms, whether 1 word or more than one...but BobHovious's advice still stands: this is a table scan that might cost some CPU cycles.
see if this helps:
DECLARE @SEARCHSTRING VARCHAR(8000),
@COLUMNNAME VARCHAR(128)
SET @SEARCHSTRING='civil war memorabilia'
SET @COLUMNNAME = 'YOURTABLE.COLUMNNAME'
SET @SEARCHSTRING = 'WHERE CHARINDEX(''' + REPLACE(@SEARCHSTRING,' ',''',' + @COLUMNNAME + ') > 0 AND CHARINDEX(''') + ''',' + @COLUMNNAME + ') > 0'
print @SEARCHSTRING
-- results:
WHERE CHARINDEX('civil',YOURTABLE.COLUMNNAME) > 0
AND CHARINDEX('war',YOURTABLE.COLUMNNAME) > 0
AND CHARINDEX('memorabilia',YOURTABLE.COLUMNNAME) > 0
Lowell
March 6, 2009 at 9:00 pm
EDITED TO REPORT TEST RESULTS
Use Lowell's dynamic SQL method above. It's simpler, faster, and will only increase its lead as the number of words in the
search string increases. I just added it to my library in case I run across this requirement.
Just a note, both of us neglected to take into account words that might be part of a longer string. If we each search
for 'baker' to appear, we will also find a hit on "bakery". So be sure to search for a character string that begins
and ends with a space. This means you should pad the values of the column being searched with leading and trailing
blanks in an expression in the where clause.
----------------- Cross Join Madness
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 15 ms, elapsed time = 383 ms.
(15 row(s) affected)
(5040 row(s) affected)
SQL Server Execution Times:
CPU time = 125 ms, elapsed time = 1211 ms.
----------------- Dynamic SQL
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
Select words
from #searchTarget
where CHARINDEX('baker',words) > 0 AND CHARINDEX('apple',words) > 0 AND CHARINDEX('cappucino',words) > 0
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 69 ms.
(5040 row(s) affected)
SQL Server Execution Times:
CPU time = 16 ms, elapsed time = 790 ms.
SQL Server Execution Times:
CPU time = 16 ms, elapsed time = 868 ms.
Okay.... I really don't know if I'd do this in a production environment or not, but it was a hoot getting it to work without resorting to dynamic SQL.
-- this will house the searchString to be searched for
declare @input varchar(max)
set @input = 'baker apple cappucino'
----------------------------------------------------------------------------------
-- first we build a word list (just to set up the test)
----------------------------------------------------------------------------------
declare @searchString table (word varchar(20))
insert into @searchString
select 'Apple' union all
select 'Baker' union all
select 'Cappucino' union all
select 'Delta' union all
select 'Echolalia' union all
select 'Forger' union all
select 'Wumpus';
----------------------------------------------------------------------------------
-- next a home for various combinations (again, just to set up the test)
-- if the search works correctly it should find all 5040 rows
----------------------------------------------------------------------------------
declare @searchTarget table (ID int identity(1,1) primary key, words varchar(max))
;with
cte1 as (select w1.word + ' ' + w2.word as searchString from @searchString w1 cross join @searchString w2 where w1.word <> w2.word)
,cte2 as (select w1.word + ' ' + searchString as searchString from @searchString w1 cross join cte1 where searchString not like '%'+w1.word+'%')
,cte3 as (select w1.word + ' ' + searchString as searchString from @searchString w1 cross join cte2 where searchString not like '%'+w1.word+'%')
,cte4 as (select w1.word + ' ' + searchString as searchString from @searchString w1 cross join cte3 where searchString not like '%'+w1.word+'%')
,cte5 as (select w1.word + ' ' + searchString as searchString from @searchString w1 cross join cte4 where searchString not like '%'+w1.word+'%')
,cte6 as (select w1.word + ' ' + searchString as searchString from @searchString w1 cross join cte5 where searchString not like '%'+w1.word+'%')
insert into @searchTarget
select searchString from cte6
----------------------------------------------------------------------------------
-- and finally a home for various search strings to come to rest, this could be
-- be done with a SELECT/INTO but I'm too lazy to drop tables tonight
----------------------------------------------------------------------------------
declare @searchSource table (searchstring varchar(max))
----------------------------------------------------------------------------------
-- LET THE GAMES BEGIN
----------------------------------------------------------------------------------
set statistics time on;
-- parse searchsearchString
SET @input = ' '+@input+' '
;with cte0 (search) as
(select '%'+substring(@input,N+1,charindex(' ',@input,N+1)-(N+1))+'%' as word
from tally
where substring(@input,N,1) = ' '
and N < len(@input)
)
-- now everybody COMBO some more
,cte1 as (select c1.search + ' ' + w2.search as searchString from cte0 c1 cross join cte0 w2 where c1.search <> w2.search)
,cte2 as (select c1.search + ' ' + searchString as searchString from cte0 c1 cross join cte1 where searchString not like '%'+c1.search+'%')
,cte3 as (select c1.search + ' ' + searchString as searchString from cte0 c1 cross join cte2 where searchString not like '%'+c1.search+'%')
,cte4 as (select c1.search + ' ' + searchString as searchString from cte0 c1 cross join cte3 where searchString not like '%'+c1.search+'%')
,cte5 as (select c1.search + ' ' + searchString as searchString from cte0 c1 cross join cte4 where searchString not like '%'+c1.search+'%')
,cte6 as (select c1.search + ' ' + searchString as searchString from cte0 c1 cross join cte5 where searchString not like '%'+c1.search+'%')
-- need to extend this to hold more search words, if necessary
,cteX as
(select * from cte6
union all
select * from cte5
union all
select * from cte4
union all
select * from cte3
union all
select * from cte2
union all
select * from cte1
union all
select * from cte0
)
insert into @searchSource
select * from cteX
select words
from @searchTarget
cross join @searchSource
where words like searchString
and len(searchstring)= (select max(len(searchString)) from @searchSource)
set statistics time off;
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
March 7, 2009 at 10:23 am
Didn't realize you could 'execute' a string...now i know. Thanks guys!!!
March 7, 2009 at 10:38 am
Ok, now i need to execute the same query and give the user the option of bringing back rows with all of the words(in any order) or any of the words(in any order). Maybe i can figure this out all by myself. :w00t:
March 7, 2009 at 11:03 am
I'll give you a hint. You already have the solution to find all the words by generating a WHERE clause with a bunch of ANDs. To find ANY word, you just use ORs instead ANDs in your dynamic SQL string. So have the application pass a variable with the word "AND" or "OR" in it, and use that instead of a constant in the REPLACE line. Let me know if you run into any problems with that. 🙂
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
March 7, 2009 at 12:10 pm
ummm, ya....it just occured to me...out of embarrassment, i came here to delete this question, and ouch, you had already answered it.
i have not had my coffee this morning :hehe:
March 7, 2009 at 1:57 pm
Is being no problem. Happens to me all the time 😛
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply