Questions about index choice in Execution Plan

  • I have attached DDL and test data, and also the actual execution plan of this query:

    DECLARE @url varchar(400)

    set @url = 'Environment_and_planning/Recycling_rubbish_and_waste/Recycling__information_and_advice'

    declare @ParentUrl varchar(400)

    -- query1

    SET @ParentUrl = (select top 1 parentUrl from dbo.SiteMap where Url = @url)

    -- query2

    SET @ParentUrl = (select top 1 parentUrl from dbo.SiteMap2 where Url = @url)

    -- query3

    SET @ParentUrl = (select top 1 parentUrl from dbo.SiteMap2 with(index(IX_url)) where Url = @url)

    Sitemap and Sitemap2 are almost identical, the difference being that Sitemap has an NC index on url, and an NC index on parentUrl, whereas Sitemap2 has an NC index on parentUrl, and an NC index on url, with parentUrl as an included column.

    I have included query1/the structure of dbo.Sitemap for completeness more than anything, as I think that is the set-up I will go for.

    My question lies more with query2 and query3.

    I was initially confused as to why the optimiser would use the IX_parentUrl index instead of the IX_url one, seeing as the SARG was the url field.

    Having seen the exec plan for query3 with the index hint, I'm guessing the optimiser chose to ignore this plan due to the fact that it would be more expensive to use IX_url? The exec plan shows it to have double the cost of the query2, relative to the batch

    In the case of query3, the extra cost arises from having to go all the way to the clustered index to pick up the location of this particular row, and from that find the value of parentUrl?

    query1 I suppose takes this cost out of the picture by having parentUrl in the included columns - no extra trip to the clustered index, and no RID Lookup if there were no clustered index?

    I started writing this post completely baffled but I *think* I understand it now, after more playing around - I would appreciate if someone could confirm that I am on the right track, or put me straight otherwise!

    Many thanks

    ------------------------------------------------------------------------
    Bite-sized fiction (with added teeth) [/url]

  • I would definitely say you are on the right track. Query1 will not need a bookmark lookup to the get the parentUrl because parentUrl is an included column. In every other instance you would need the bookmark/Key lookup.

    I'm not sure that you have the best clustered index either as I am assuming that the column, preferred, is an 2 value column so it would not be very selective as the first column in an index, which will get the statistics. If your clustered index was, url, preferred. you would not need the second index on url either and your current query becomes a clustered index seek.

    Here are a couple of good blog posts by Gail Shaw (GilaMonster) on indexing, http://feedproxy.google.com/~r/SqlInTheWild/~3/Oi1xzC6jiZ0/ and http://feedproxy.google.com/~r/SqlInTheWild/~3/qNiI731n59Q/

  • It's using the index on SiteMap2 that has both columns because that makes it a covering index.

    Don't use the cost estimates as a measure of query performance. They really are estimates and can be wildly different than the actual performance.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Jack Corbett (6/29/2009)


    I would definitely say you are on the right track. Query1 will not need a bookmark lookup to the get the parentUrl because parentUrl is an included column. In every other instance you would need the bookmark/Key lookup.

    I'm not sure that you have the best clustered index either as I am assuming that the column, preferred, is an 2 value column so it would not be very selective as the first column in an index, which will get the statistics. If your clustered index was, url, preferred. you would not need the second index on url either and your current query becomes a clustered index seek.

    Thanks Jack.

    The Preferred, url clustered index was due to experimentation - a very frequently run query has an ORDER BY Preferred, Url and the ensuing sort seemed to be the heavy part of the query, when the CI was Url, Preferred.

    Here are a couple of good blog posts by Gail Shaw (GilaMonster) on indexing, http://feedproxy.google.com/~r/SqlInTheWild/~3/Oi1xzC6jiZ0/ and http://feedproxy.google.com/~r/SqlInTheWild/~3/qNiI731n59Q/

    ...and thanks again!

    ------------------------------------------------------------------------
    Bite-sized fiction (with added teeth) [/url]

  • Grant Fritchey (6/29/2009)


    It's using the index on SiteMap2 that has both columns because that makes it a covering index.

    Don't use the cost estimates as a measure of query performance. They really are estimates and can be wildly different than the actual performance.

    Thanks Grant.

    I remember reading about the cost figures being estimates, even in the *actual* execution plan - what would one look at instead, to figure out which bit of the query is the heavy bit?

    ------------------------------------------------------------------------
    Bite-sized fiction (with added teeth) [/url]

  • mazzz (6/30/2009)


    I remember reading about the cost figures being estimates, even in the *actual* execution plan

    Yup, because the cost comes from the optimiser, before the query is actually run.

    what would one look at instead, to figure out which bit of the query is the heavy bit?

    Even given that they are estimates, I would still look at the costs of the various operators, once I was sure that the estimates were good ones (estimate and actual row counts are similar), just not take them as absolute values.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • mazzz (6/30/2009)


    Thanks Grant.

    I remember reading about the cost figures being estimates, even in the *actual* execution plan - what would one look at instead, to figure out which bit of the query is the heavy bit?

    Like Gail says, they're still useful indicators, mainly for comparing one plan to another in terms of function. They just don't reflect directly to performance. Instead you have to look at execution time & I/O.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Many thanks, folks.

    ------------------------------------------------------------------------
    Bite-sized fiction (with added teeth) [/url]

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply