August 23, 2010 at 3:17 pm
I need help / any ideas to perform advanced search on a string...
For ex: I have a string like below.
"SQL Server 2005 Express Edition is the next version of MSDE and is a free, easy-to-use, lightweight, and embeddable version of SQL Server 2005."
User's search would be Express near4("next")
Sentence should have "Express" word and "next" word should be there in either left or right side 4 words of "Express".
In the above sentence "next" is the 4th word (right side) from "Express" word so this should come as result...
If user enters Express near3("next") -- above sentence should not come because "next" word is not there 3 words....either side
August 23, 2010 at 5:59 pm
I've never used full-text search, but I think that it has the capabilities that you are looking for. It's certainly a good place to start.
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
August 23, 2010 at 6:20 pm
Hi Allen
Thanks for your reply...
Definitely FTS can do nearness search....but we cannot control that it should be within 3 words or 4 words ....beside to the main word....
Also I will have many many databases in a server.... so maintenance of FTS is difficult... and FTS index will have some latency in updating it...
Thanks,
Surya Prasad.
August 27, 2010 at 11:28 am
declare @searchTable table( TextValue varchar(max))
insert into @searchTable values ('SQL Server 2005 Express Edition is the next version of MSDE and is a free, easy-to-use, lightweight, and embeddable version of SQL Server 2005.')
select * from @searchTable T
cross apply dbo.fnsplit(' ', T.TextValue ) D
cross apply dbo.fnsplit(' ', T.TextValue) N
where
D.Data = 'Express' -- target
and N.Data = 'next'-- 2nd target
and N.ItemNum >=D.ItemNum - 4 -- N.Data Items that are within 4 below D.Data
and N.ItemNum <=D.ItemNum + 4 -- N.Data Items that are within 4 above D.Data
--- supporting function:
CREATE FUNCTION [dbo].[fnsplit](
@Delimiter nvarchar(max)
,@String nvarchar(max)
)
RETURNS TABLE AS
RETURN
SELECT
SUBSTRING(@String, Numbers.N+1, CHARINDEX(@Delimiter, @String + @Delimiter, Numbers.N+1) - Numbers.N-1) AS Data
, Numbers.N as Pos
,( Numbers.N) - LEN(REPLACE(LEFT(@String, Numbers.N), @Delimiter, N'')) + 1 AS ItemNum
FROM dbo.Tally Numbers with (nolock) -- http://www.sqlservercentral.com/articles/T-SQL/63003/ (mines Zero based)
WHERE Numbers.N <= LEN(@String)
AND SUBSTRING(@Delimiter + @String, Numbers.N+1, 1) = @Delimiter;
You will still need to deal with the hard work of punctuation and multiple spaces.
Strip them out before hand or alter the split function so that instead of a delimiter it uses a pattern matching (PatIndex('[^\W]%')) to locate next Punctuation or Space(s).
You also have to decide what the result of 2+ matches in one string? This currently will give multiple results.
August 27, 2010 at 1:51 pm
Thanks David....
Actually I started working on this and I am going in the similar directions that you mentioned.... In fact I am extracting words and their positions into other table when actual data is loaded...
Once I got that data then it is all number game to do search......
As Your idea is also in the similar directions it is giving me a comfort feeling that I am going in right direction..... Thanks a lot....:-)
I may need one more suggestion from you....
Number of words that I need to search is not always one, user can enter multiple words.... with dynamic SQL this can be done .... but any other idea will be helpful....
Once again thanks for your reply and idea.
Thanks for SQLServerCentral.com too....:-)
August 27, 2010 at 3:58 pm
[p]you can change the where clause to a like if you input is like Express near3("next,SQL"):[/p]
and ','+@searchString +',' like '%,'+N.Data+',%' -- commas are so substring words don't match - also input is stripped of any whitespace
it is a little harder if you want to find matching string phrase "next version"
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply