Weird SP issue with CONTAINS taking 10 times longer than LIKE statement.

  • I am trying to make the following SP a bit quicker by using the CONTAINS statement vs the LIKE statement.

    In my testing of 200k records, i get the following results:

    When i use:

    "=" it takes 2 seconds

    "LIKE" it takes 4 seconds

    "CONTAINS" it takes 10 seconds

    I cant easily provide all the test data for this, so i am not quite following protocol. If that is an issue, i will seek assistance elsewhere. Thanks.

    USE [JobPortal]

    GO

    /****** Object: StoredProcedure [dbo].[GetJobListFullIndexSearched] Script Date: 01/25/2011 12:37:32 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    -- =============================================

    -- =============================================

    ALTER PROCEDURE [dbo].[GetJobListFullIndexSearched]

    -- Add the parameters for the stored procedure here

    @Title varchar(250),

    @Industry int,

    @JobTitle int,

    @JobType int,

    @Experience int,

    @education int,

    @State int,

    @City int,

    @Salary int,

    @MaxSalary int,

    @fromRec int,

    @toRec int

    AS

    BEGIN

    -- SET NOCOUNT ON added to prevent extra result sets from

    -- interfering with SELECT statements.

    SET NOCOUNT ON;

    -- Insert statements for procedure here

    if (@Salary = 0)

    Begin

    WITH Jobs AS

    (

    SELECT e.Id, e.Title,e.ContactPerson,

    case when len(e.description) < 150 then

    e.description

    else

    substring(e.description,1,150) + '...' end

    as Description,

    isnull((select we.[Name] from workexperiencetypes we where we.id=e.workexperiencetypeid),'') as workexperience,

    isnull((select jot.[Name] from jobtypes jot where jot.id=e.jobtypeid),'') as jobtype,

    isnull((select ind.[Name] from industries ind where ind.id=e.industryid),'') as industryname,

    isnull((select jt.[Name] from jobTitles jt where jt.id=e.jobtypeid),'') as jobTitle,

    isnull((select ci.[Name] from cities ci where ci.id=c.cityid),'') + ', ' +

    isnull((select r.[AbbreviatedName] from regions r where r.id=c.regionid),'') + ', ' +

    isnull((select co.[AbbreviatedName] from countries co where co.id=c.countryid),'') as Location,

    case when e.status = 1 then 'Active'

    else 'InActive' end as Status,Positions, substring(cast(e.deadline as varchar(20)),1,12) as deadline,

    ROW_NUMBER() OVER (ORDER BY e.Id DESC) AS RowNumber

    from dbo.EmploymentOpportunities e left outer join companys c on e.officeid=c.id

    where e.status=1 and

    CASE

    WHEN @Title <> '' THEN

    --CASE WHEN e.title = @Title THEN 1

    --CASE WHEN e.title like '%' + @Title + '%' THEN 1

    CASE WHEN CONTAINS(e.title, @Title) THEN 1

    ELSE 0 END

    ELSE 1 END = 1

    and

    CASE

    WHEN @Industry <> 0 THEN

    CASE WHEN e.industryid = @Industry THEN 1

    ELSE 0 END

    ELSE 1 END = 1

    and

    CASE

    WHEN @JobTitle <> 0 THEN

    CASE WHEN e.jobtitleid = @JobTitle THEN 1

    ELSE 0 END

    ELSE 1 END = 1

    and

    CASE

    WHEN @JobType <> 0 THEN

    CASE WHEN e.jobtypeid = @JobType THEN 1

    ELSE 0 END

    ELSE 1 END = 1

    and

    CASE

    WHEN @Experience <> 0 THEN

    CASE WHEN e.workexperiencetypeid = @Experience THEN 1

    ELSE 0 END

    ELSE 1 END = 1

    and

    CASE

    WHEN @education <> 0 THEN

    CASE WHEN e.AcademicExperienceTypeId = @education THEN 1

    ELSE 0 END

    ELSE 1 END = 1

    and

    CASE

    WHEN @State <> 0 THEN

    CASE WHEN c.regionid = @State THEN 1

    ELSE 0 END

    ELSE 1 END = 1

    and

    CASE

    WHEN @City <> 0 THEN

    CASE WHEN c.CityId = @City THEN 1

    ELSE 0 END

    ELSE 1 END = 1

    )

    SELECT Id, RowNumber, Title,ContactPerson,Description, workexperience, jobtype, industryname, jobTitle, Location,Status,Positions, deadline

    ,(select max(RowNumber) from Jobs) as TotalCount

    FROM Jobs

    WHERE RowNumber BETWEEN @fromRec AND @toRec

    ORDER BY RowNumber ASC

    end

    else

    Begin

    WITH Jobs AS

    (

    SELECT e.Id, e.Title,e.ContactPerson,

    case when len(e.description) < 150 then

    e.description

    else

    substring(e.description,1,150) + '...' end

    as Description,

    isnull((select we.[Name] from workexperiencetypes we where we.id=e.workexperiencetypeid),'') as workexperience,

    isnull((select jot.[Name] from jobtypes jot where jot.id=e.jobtypeid),'') as jobtype,

    isnull((select ind.[Name] from industries ind where ind.id=e.industryid),'') as industryname,

    isnull((select jt.[Name] from jobTitles jt where jt.id=e.jobtypeid),'') as jobTitle,

    isnull((select ci.[Name] from cities ci where ci.id=c.cityid),'') + ', ' +

    isnull((select r.[Name] from regions r where r.id=c.regionid),'') + '$, ' +

    isnull((select co.[Name] from countries co where co.id=c.countryid),'') as Location,

    case when e.status = 1 then 'Active'

    else 'InActive' end as Status,Positions, substring(cast(e.deadline as varchar(20)),1,12) as deadline,

    ROW_NUMBER() OVER (ORDER BY e.Id DESC) AS RowNumber

    from dbo.EmploymentOpportunities e left outer join companys c on e.officeid=c.id

    where e.status=1 and

    CASE

    WHEN @Title <> '' THEN

    --CASE WHEN e.title = @Title THEN 1

    --CASE WHEN e.title like '%' + @Title + '%' THEN 1

    CASE WHEN CONTAINS(e.title, @Title) THEN 1

    ELSE 0 END

    ELSE 1 END = 1

    and

    CASE

    WHEN @Industry <> 0 THEN

    CASE WHEN e.industryid = @Industry THEN 1

    ELSE 0 END

    ELSE 1 END = 1

    and

    CASE

    WHEN @JobTitle <> 0 THEN

    CASE WHEN e.jobtitleid = @JobTitle THEN 1

    ELSE 0 END

    ELSE 1 END = 1

    and

    CASE

    WHEN @JobType <> 0 THEN

    CASE WHEN e.jobtypeid = @JobType THEN 1

    ELSE 0 END

    ELSE 1 END = 1

    and

    CASE

    WHEN @Experience <> 0 THEN

    CASE WHEN e.workexperiencetypeid = @Experience THEN 1

    ELSE 0 END

    ELSE 1 END = 1

    and

    CASE

    WHEN @education <> 0 THEN

    CASE WHEN e.AcademicExperienceTypeId = @education THEN 1

    ELSE 0 END

    ELSE 1 END = 1

    and

    CASE

    WHEN @State <> 0 THEN

    CASE WHEN c.regionid = @State THEN 1

    ELSE 0 END

    ELSE 1 END = 1

    and

    CASE

    WHEN @City <> 0 THEN

    CASE WHEN c.CityId = @City THEN 1

    ELSE 0 END

    ELSE 1 END = 1

    --and (e.salaryminid = @Salary)

    --and (e.salaryminid < @Salary)

    and (e.salaryminid >= @Salary)

    and (e.salarymaxid <= @MaxSalary)

    )

    SELECT distinct Id, RowNumber, Title,ContactPerson,Description, workexperience, jobtype, industryname, jobTitle, Location,Status,Positions, deadline

    ,(select max(RowNumber) from Jobs) as TotalCount

    FROM Jobs

    WHERE RowNumber BETWEEN @fromRec AND @toRec

    ORDER BY RowNumber ASC

    End

    END

  • I'd check the execution plans. That's going to tell you what each operation is leading to. I would suspect that both LIKE and CONTAINS leads to a scan, but it depends on how you use LIKE. For example, x LIKE 'SomeValue' or x LIKE 'SomeValue%', can use an index, but x LIKE '%SomeValue' can't. I haven't investigated CONTAINS at length, but based on what it does and how it works, it must do a scan. Unless you've got a FULLTEXT index, I can't imagine it's going to work well. on smaller data sets because of the scan.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • I have defined FTI on the field i am searching.

    Thanks for your suggestions!

  • Then I suspect either the text is very small or FTI is not being used. Again, the exec plan should tell the tale.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • I ran an execution plan, but dont really have any experience in this area, so i opened a ticket with Microsoft.

    Thanks again!

  • isuckatsql (1/25/2011)


    I ran an execution plan, but dont really have any experience in this area, so i opened a ticket with Microsoft.

    Thanks again!

    If you post the execution plans here, we can probably point you in the right direction..and most likely a lot sooner than Microsoft will get back to you.

  • Derrick Smith (1/25/2011)


    isuckatsql (1/25/2011)


    I ran an execution plan, but dont really have any experience in this area, so i opened a ticket with Microsoft.

    Thanks again!

    If you post the execution plans here, we can probably point you in the right direction..and most likely a lot sooner than Microsoft will get back to you.

    Especially since the first tier of MS support probably couldn't find an exec plan if you paid them... wait a minute...

    But seriously, they're going to tell you to run the DTA and wash their hands until you do. That's unlikely to help. Post the exec plans here, we can look them over. It's what we do, and we're free.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Please find attached the execution plans.

    One using the 'like' statement and one using the 'contains' statement.

    Thanks!

  • Contains() is a much more powerful command and for your purposes I think it's overkill for a string lookup.

    Note the missing index warning at the top of the query plan for Like - almost all queries can be improved by a better index, but if the estimated query plan is noticing it, then it can be hugely improved.

    I don't have your data but try creating this index and running it again and see how long it takes:

    USE [JobPortalPBTechno]

    GO

    CREATE NONCLUSTERED INDEX idx_EmploymentOpportunities_Status_Z1

    ON [dbo].[EmploymentOpportunities] ([Status])

    INCLUDE ([Id],[OfficeId],[Title],[IndustryId],[JobTypeId],[WorkExperienceTypeId],[AcademicExperienceTypeId],[JobTitleId])

    GO

    Normally this step would involve looking at the tables, seeing how much data is in which columns, how specific the data is so you can filter the most on the first index field, etc..but again without having your tables/data handy, it's hard to go more in depth.

  • Derrick Smith (1/25/2011)


    Contains() is a much more powerful command and for your purposes I think it's overkill for a string lookup.

    Note the missing index warning at the top of the query plan for Like - almost all queries can be improved by a better index, but if the estimated query plan is noticing it, then it can be hugely improved.

    I don't have your data but try creating this index and running it again and see how long it takes:

    USE [JobPortalPBTechno]

    GO

    CREATE NONCLUSTERED INDEX idx_EmploymentOpportunities_Status_Z1

    ON [dbo].[EmploymentOpportunities] ([Status])

    INCLUDE ([Id],[OfficeId],[Title],[IndustryId],[JobTypeId],[WorkExperienceTypeId],[AcademicExperienceTypeId],[JobTitleId])

    GO

    Normally this step would involve looking at the tables, seeing how much data is in which columns, how specific the data is so you can filter the most on the first index field, etc..but again without having your tables/data handy, it's hard to go more in depth.

    I did notice the warning message.

    Are you suggesting NOT using 'Contains' and instead using 'like', even though 'like' is MUCH slower?

    I am down to 6 secs with the Contains statement.....better than 10 secs!

    Thanks!

  • You say 'Like' is slower, but that's not what your original post says..did you reverse those?

  • I tested with the LIKE statement and got this one down to 2 secs 🙂

    Are you saying that i should stick with CONTAINS for searching much larger strings, eg, a document and use LIKE for smaller strings?

    Thanks

  • It depends..

    Contains() is great for fuzzy lookups, searching for strings near other specific strings, thesaurus, etc...it does a lot of things. Since it is designed to do all that, there's a bit of overhead associated with using it, and it tends to use different processes to get the same end result. Notice the execution plan way over on the right hand side..Contains() is using internal functions and searches the table differently from Like %%.

    In your case, Like performs better. On a different set of data or new schema, Contains() might be better.

    That basically sums up SQL server...there is never one blanket answer for anything.

  • Thanks!

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply