needing to deveop a sql search for all words, no matter the order

  • I am trying to code in TSQL (without using the full-text indexing engine) a way to search fields for all words no matter the order.

    I did the left to right all word search using like key word. So e.g. if a user supplies 'civil war' i simply replace the beginning, end and space with the % sign and the correct results are returned. It also works with 'civ wa' again because it is from left to right.

    Ok, so now i am looking for a way to search for all words, no matter the order. So if the user supplies me with 'war civil' I still want to pull out the same rows as 'civil war'. My thoughts would be to parse the words into a temp table then use a for next loop to search the search fileds. If a match is found, mark the word and move to the next mark the word when found and move to the next etc.. until all words are found.

    For a huge database, and a lot of users, this would be very time consuming. Yes i plan on placing indexes on the search fields, but has anyone already done this? I am googling but cannot find something using just basic TSQL...full-text yes, but i don't want to have to use the full text capabilities. KISS comes to mind.

    Many thanks for any help.

  • I have never before attempted this feat without a net, but my wife is asleep, so I'll take a shot at it. You DO know this is going to do a table scan, don't you? How many words are we talking about in your search argument?

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • i just had a very similar question in a different thread...it was to scan all columns in a table where for all three search terms occurs

    basically, i just did a REPLACE on the space....to make the WHERE statement , so you could then execute a string;

    this works for any number of search terms, whether 1 word or more than one...but BobHovious's advice still stands: this is a table scan that might cost some CPU cycles.

    see if this helps:

    DECLARE @SEARCHSTRING VARCHAR(8000),

    @COLUMNNAME VARCHAR(128)

    SET @SEARCHSTRING='civil war memorabilia'

    SET @COLUMNNAME = 'YOURTABLE.COLUMNNAME'

    SET @SEARCHSTRING = 'WHERE CHARINDEX(''' + REPLACE(@SEARCHSTRING,' ',''',' + @COLUMNNAME + ') > 0 AND CHARINDEX(''') + ''',' + @COLUMNNAME + ') > 0'

    print @SEARCHSTRING

    -- results:

    WHERE CHARINDEX('civil',YOURTABLE.COLUMNNAME) > 0

    AND CHARINDEX('war',YOURTABLE.COLUMNNAME) > 0

    AND CHARINDEX('memorabilia',YOURTABLE.COLUMNNAME) > 0

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • EDITED TO REPORT TEST RESULTS

    Use Lowell's dynamic SQL method above. It's simpler, faster, and will only increase its lead as the number of words in the

    search string increases. I just added it to my library in case I run across this requirement.

    Just a note, both of us neglected to take into account words that might be part of a longer string. If we each search

    for 'baker' to appear, we will also find a hit on "bakery". So be sure to search for a character string that begins

    and ends with a space. This means you should pad the values of the column being searched with leading and trailing

    blanks in an expression in the where clause.

    ----------------- Cross Join Madness

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 15 ms, elapsed time = 383 ms.

    (15 row(s) affected)

    (5040 row(s) affected)

    SQL Server Execution Times:

    CPU time = 125 ms, elapsed time = 1211 ms.

    ----------------- Dynamic SQL

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    Select words

    from #searchTarget

    where CHARINDEX('baker',words) > 0 AND CHARINDEX('apple',words) > 0 AND CHARINDEX('cappucino',words) > 0

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 69 ms.

    (5040 row(s) affected)

    SQL Server Execution Times:

    CPU time = 16 ms, elapsed time = 790 ms.

    SQL Server Execution Times:

    CPU time = 16 ms, elapsed time = 868 ms.

    Okay.... I really don't know if I'd do this in a production environment or not, but it was a hoot getting it to work without resorting to dynamic SQL.

    -- this will house the searchString to be searched for

    declare @input varchar(max)

    set @input = 'baker apple cappucino'

    ----------------------------------------------------------------------------------

    -- first we build a word list (just to set up the test)

    ----------------------------------------------------------------------------------

    declare @searchString table (word varchar(20))

    insert into @searchString

    select 'Apple' union all

    select 'Baker' union all

    select 'Cappucino' union all

    select 'Delta' union all

    select 'Echolalia' union all

    select 'Forger' union all

    select 'Wumpus';

    ----------------------------------------------------------------------------------

    -- next a home for various combinations (again, just to set up the test)

    -- if the search works correctly it should find all 5040 rows

    ----------------------------------------------------------------------------------

    declare @searchTarget table (ID int identity(1,1) primary key, words varchar(max))

    ;with

    cte1 as (select w1.word + ' ' + w2.word as searchString from @searchString w1 cross join @searchString w2 where w1.word <> w2.word)

    ,cte2 as (select w1.word + ' ' + searchString as searchString from @searchString w1 cross join cte1 where searchString not like '%'+w1.word+'%')

    ,cte3 as (select w1.word + ' ' + searchString as searchString from @searchString w1 cross join cte2 where searchString not like '%'+w1.word+'%')

    ,cte4 as (select w1.word + ' ' + searchString as searchString from @searchString w1 cross join cte3 where searchString not like '%'+w1.word+'%')

    ,cte5 as (select w1.word + ' ' + searchString as searchString from @searchString w1 cross join cte4 where searchString not like '%'+w1.word+'%')

    ,cte6 as (select w1.word + ' ' + searchString as searchString from @searchString w1 cross join cte5 where searchString not like '%'+w1.word+'%')

    insert into @searchTarget

    select searchString from cte6

    ----------------------------------------------------------------------------------

    -- and finally a home for various search strings to come to rest, this could be

    -- be done with a SELECT/INTO but I'm too lazy to drop tables tonight

    ----------------------------------------------------------------------------------

    declare @searchSource table (searchstring varchar(max))

    ----------------------------------------------------------------------------------

    -- LET THE GAMES BEGIN

    ----------------------------------------------------------------------------------

    set statistics time on;

    -- parse searchsearchString

    SET @input = ' '+@input+' '

    ;with cte0 (search) as

    (select '%'+substring(@input,N+1,charindex(' ',@input,N+1)-(N+1))+'%' as word

    from tally

    where substring(@input,N,1) = ' '

    and N < len(@input)

    )

    -- now everybody COMBO some more

    ,cte1 as (select c1.search + ' ' + w2.search as searchString from cte0 c1 cross join cte0 w2 where c1.search <> w2.search)

    ,cte2 as (select c1.search + ' ' + searchString as searchString from cte0 c1 cross join cte1 where searchString not like '%'+c1.search+'%')

    ,cte3 as (select c1.search + ' ' + searchString as searchString from cte0 c1 cross join cte2 where searchString not like '%'+c1.search+'%')

    ,cte4 as (select c1.search + ' ' + searchString as searchString from cte0 c1 cross join cte3 where searchString not like '%'+c1.search+'%')

    ,cte5 as (select c1.search + ' ' + searchString as searchString from cte0 c1 cross join cte4 where searchString not like '%'+c1.search+'%')

    ,cte6 as (select c1.search + ' ' + searchString as searchString from cte0 c1 cross join cte5 where searchString not like '%'+c1.search+'%')

    -- need to extend this to hold more search words, if necessary

    ,cteX as

    (select * from cte6

    union all

    select * from cte5

    union all

    select * from cte4

    union all

    select * from cte3

    union all

    select * from cte2

    union all

    select * from cte1

    union all

    select * from cte0

    )

    insert into @searchSource

    select * from cteX

    select words

    from @searchTarget

    cross join @searchSource

    where words like searchString

    and len(searchstring)= (select max(len(searchString)) from @searchSource)

    set statistics time off;

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Didn't realize you could 'execute' a string...now i know. Thanks guys!!!

  • Ok, now i need to execute the same query and give the user the option of bringing back rows with all of the words(in any order) or any of the words(in any order). Maybe i can figure this out all by myself. :w00t:

  • I'll give you a hint. You already have the solution to find all the words by generating a WHERE clause with a bunch of ANDs. To find ANY word, you just use ORs instead ANDs in your dynamic SQL string. So have the application pass a variable with the word "AND" or "OR" in it, and use that instead of a constant in the REPLACE line. Let me know if you run into any problems with that. 🙂

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • ummm, ya....it just occured to me...out of embarrassment, i came here to delete this question, and ouch, you had already answered it.

    i have not had my coffee this morning :hehe:

  • Is being no problem. Happens to me all the time 😛

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply