August 13, 2014 at 12:45 pm
SQL Version: SQL 2008 R2 10.50.2550
There's a few parts to this so not sure if I should break this out to separate topics... I'm a little new to Full Text Searches so this first part I'm just generally curious about and then I'll elaborate on motive...
The current structure of the FTS is as follows
SELECT TOP 200 c.EmployerId, c.EmployeeId
FROM dbo.EmployeeFTS c
INNER JOIN CONTAINSTABLE
(
dbo.EmployeeFTS
,SearchText
,@query
) results
ON c.EmployeeId = results.[Key]
WHERE
c.EmployerId= @EmployerId
Order by results.Rank DESC
The idea is that it's a multi-tenant DB so all searches only need to occur over a given EmployerId. But the way this is written seems a little inefficient because it's searching over all employers and then after that filtering out other employer matches... It seems like it'd be better to be able to utilize top_n_rank of ContainsTable but in order to do that ContainsTable needs to only operate over a specific Employer's set of data... Is there a recommended way to go about this?
Search Text is a computed column which is concatenation of other columns in the EmployeeFTS table...
So what I've done so far is to add "EmployerId:<EmployerId>" to the concatenation of that computed column and then append that onto the query... so the search becomes
set @query = 'EmployerId:' + cast(@EmployerId as varchar(10)) + ' AND ' + @query
SELECT c.EmployerId, c.EmployeeId
FROM dbo.EmployeeFTS c
INNER JOIN CONTAINSTABLE
(
dbo.EmployeeFTS
,SearchText
,@query
,200
) results
ON c.EmployeeId= results.[Key]
WHERE
c.EmployerId= @EmployerId --I left this on just to be extra sure
From my initial tests the results returned are always the same, the logical reads from statistics is way lower, and the cost of the execution plan is a lot lower as well...
It FEELS like sort of a hack to do it this way though so I was curious if there's a better recommendation...
-I read this
http://technet.microsoft.com/en-us/library/cc721269%28SQL.100%29.aspx#_Toc202506246
Where it talks about being able to limit by using contains, but then I don't get ranking. And from the tests I did the containstable version I wrote above with the concatenated EmployerId performs better...
Additional thing I'm doing and some notes:
-I've enabled Trace Flag 7646 on staging and I'm going to move that to production here soon to see if that alleviates the issue. I'm not really that worried about the delay in accuracy this is going to cause.
-We're applying the latest SP to 2008 R2 later this month not really sure if there's something in there that will help
-There are about a million records in the table and it gains about 500 new records a day
Motivation:
On a daily basis the Employee searches are getting blocked by this
<command_type>FT GATHERER</command_type>
<block_info>
<lock_type>objectlock</lock_type>
<object_name>fulltext_docidfilter_2104498676</object_name>
</block_info>
Which is in turn blocked by some longer running Employee Search... So the blocking chain goes
-Long search with large number of reads blocks FT GATHERER
-FT GATHERER which is waiting for (1x: 56572ms)LCK_M_IX
-So now a bunch of other Employee searches start to get backed up behind the FT GATHERER...
Brent Ozar talking about stuff that seemed related
http://www.brentozar.com/archive/2008/11/stackoverflows-sql-2008-fts-issue-solved/
Which is why I'm wanting to optimize the search... There's a good chance I'll just leave Trace Flag 7646 enabled because I don't really see it causing any issues... There's also plans to move to Lucene so this might be a good catalyst for doing that as well 🙂
Thanks in advance for your time and thoughts
August 14, 2014 at 12:22 pm
Here's a script I whipped up that illustrates basically what I'm saying
use master
CREATE DATABASE FT_TEST
GO
use FT_TEST
GO
CREATE TABLE [dbo].[EmployeeFTS](
[EmployeeID] [int] NOT NULL,
[EmployerID] [int] NOT NULL,
[Names] [nvarchar](150) NOT NULL,
[PhoneNumbers] [nvarchar](max) NOT NULL,
[Address] [nvarchar](250) NOT NULL,
[SearchText] AS ((((([Names]+'; '))+([PhoneNumbers]+'; '))+([Address]))+((('|EmployerId:'+CONVERT([varchar](100),[EmployerId],0))))),
CONSTRAINT [PK_EmployeeFTS] PRIMARY KEY CLUSTERED
(
[EmployeeID] ASC
)
)
GO
CREATE NONCLUSTERED INDEX [EmployeeFTS_EmployerId]
ON [dbo].[EmployeeFTS] ([EmployerID])
INCLUDE ([EmployeeID])
GO
CREATE FULLTEXT CATALOG [FT_TEST_EmployeeFT] WITH ACCENT_SENSITIVITY = ON
GO
CREATE FULLTEXT STOPLIST [StopList_EmployeeFTS]
;
CREATE FULLTEXT INDEX ON [dbo].[EmployeeFTS] KEY INDEX [PK_EmployeeFTS] ON ([FT_TEST_EmployeeFT], FILEGROUP [PRIMARY]) WITH (CHANGE_TRACKING AUTO, STOPLIST [StopList_EmployeeFTS])
GO
ALTER FULLTEXT INDEX ON [dbo].[EmployeeFTS] ADD ([SearchText] LANGUAGE [English])
GO
ALTER FULLTEXT INDEX ON [dbo].[EmployeeFTS] ENABLE
GO
truncate table [EmployeeFTS]
GO
declare @i as int = 0
declare @iStr as varchar(10)
WHILE @i < 100000
BEGIN
set @iStr = cast(@i as varchar(10))
insert into [EmployeeFTS] (EmployeeId, EmployerId, Names, PhoneNumbers, [Address])
VALUES(@i, @i %100, 'NAme ' + @iStr, 'PhoneNumber ' + @iStr, 'Address ' + @iStr)
set @i = @i + 1
END
GO
SET STATISTICS IO ON
GO
declare @EmployerId as int = 2
declare @query as varchar(4000)
set @query = '(FORMSOF(INFLECTIONAL, NAme)) AND (FORMSOF(INFLECTIONAL, address))'
print @query
SELECT TOP 200 c.EmployerId, c.EmployeeId
FROM dbo.EmployeeFTS c
INNER JOIN CONTAINSTABLE
(
dbo.EmployeeFTS
,SearchText
,@query
) results
ON c.EmployeeId = results.[Key]
WHERE
c.EmployerId= @EmployerId
Order by results.Rank DESC
GO
declare @EmployerId as int = 2
declare @query as varchar(4000)
set @query = '(FORMSOF(INFLECTIONAL, NAme)) AND (FORMSOF(INFLECTIONAL, address))'
set @query = '\|EmployerId:' + cast(@EmployerId as varchar(10)) + ' AND ' + @query
print @query
SELECT c.EmployerId, c.EmployeeId
FROM dbo.EmployeeFTS c
INNER JOIN CONTAINSTABLE
(
dbo.EmployeeFTS
,SearchText
,@query
,200
) results
ON c.EmployeeId= results.[Key]
WHERE
c.EmployerId= @EmployerId
December 20, 2014 at 6:17 pm
I realize now I probably did a bad job of creating a narrow question 🙂
I ended up just going with
"what I've done so far is to add "EmployerId:<EmployerId>" to the concatenation of that computed column and then append that onto the query"
As a solution
So I added the EmployerID to the computed column that's used for the full text index...
And then all queries coming in I prefix that on the search string so that it filters down by employer first and then returns relevant results..
Worked great
Haven't had a single block on gatherer ever since and search is running faster then ever...
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply