How does DTA decide on an index for large join queries?

  • When DTA creates index on large joins, does it analyse each table based on the entire query, or just based on the selectivity of the data in the table by itself?

    eg

    SELECT c.statement_no

    , s.statement_dt

    , c.charge_amt

    , p.provider_name

    , m.lastname

    FROM dbo.Charge AS c WITH (INDEX (test))

    INNER JOIN dbo.provider AS p

    ON p.provider_no = c.provider_no

    INNER JOIN dbo.member AS m

    ON c.member_no = m.member_no

    INNER JOIN dbo.statement AS s

    ON c.statement_no = s.statement_no

    INNER JOIN dbo.region AS r

    ON r.region_no = m.region_no

    WHERE r.region_name = 'Japan'

    AND c.charge_amt > 2500

    execution plan shows the biggest bottleneck of this query is a clustered index scan of the charge table. because of this i created two indexes:

    CREATE INDEX testa_1

    ON charge(charge_amt, member_no)

    go

    CREATE INDEX testa_2

    ON charge(member_no, charge_amt)

    go

    neither made any difference so i tried covering the query:

    CREATE INDEX testb_1

    ON charge(charge_amt, member_no, provider_no, statement_no)

    CREATE INDEX testb_2

    ON charge(member_no, charge_amt, provider_no, statement_no)

    These indexes didnt make much difference either. I tried DTA and it recommended this index:

    CREATE NONCLUSTERED INDEX [_dta_index_charge_11_229575856__K2_K6_K3_K7] ON [dbo].[charge]

    (

    [member_no] ASC,

    [charge_amt] ASC,

    [provider_no] ASC,

    [statement_no] ASC

    )WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]

    I dont think id come up with this index based just on the selectivity of the data in the charge table, but analysing this data based on the data in the other tables charge is joining to seems like quite a challenge to do manually. Imagine doing this for a query with 20 or 40 joins? it would be impossible. is this what DTA is doing, and do you professionals use DTA or would you come up with the above index yourself?

  • No, I don't know how exactly DTA determines which indexes it recommends and I only use it occassionally.

    But you say you wouldn't come up with the recommended index and yet your index testb_2 is exactly what DTA has recommended.

    [font="Verdana"]Markus Bohse[/font]

  • MarkusB (1/13/2009)


    No, I don't know how exactly DTA determines which indexes it recommends and I only use it occassionally.

    But you say you wouldn't come up with the recommended index and yet your index testb_2 is exactly what DTA has recommended.

    sorry, i should have said that although i came up with that index, it was more of an educated guess, i tried a few covering indexes. but i done no analysis to prove index testb_2 would work.

    i mean I know how it got the columns, but how did it decide on the order of the columns in the index?

  • I don't know the precise mechanisms (you'll need to ask Microsoft) that the DTA uses, but I know that it relies on the statistics on the indexes and tables and the parameters you pass in. So the output is only as good as the input. You need to make sure you're using a fairly common value, not overly selective or overly broad. You need to make sure the statistics on the table and indexes are up to date, possibly updating them with a full scan. All this will help the DTA make as good a set of choices as it can.

    Personally, I've never found the DTA to be accurate enough on the really complex queries that I was working on. If that's the case here, you might want to post the execution plan (and, if possible, the structure, sample data and the query) to see if others can help identify the issues.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (1/13/2009)


    Personally, I've never found the DTA to be accurate enough on the really complex queries that I was working on. If that's the case here, you might want to post the execution plan (and, if possible, the structure, sample data and the query) to see if others can help identify the issues.

    the query in question is only a test, to help me learn about indexing strategies. because of this, i dont think it necessary to post the execution plan. I was just investigating, so that when faced with such a scenario, possibly involving 10 or more joins i would be able to debug and improve performance.

  • Understood. Yeah, unfortunately there really is no cut and dry answer when it comes to performance tuning for queries hence why DTA can sometimes struggle. Ultimately what you are doing, testing with complex queries and figuring out the best way to tune both query and indexes, is the best thing to do as that will prepare for the complex queries.

    Grant actually wrote a great E-Book that was free for a long time, don't know if it is anymore (Grant?) on understanding the Execution plans. Very helpful when working through tuning efforts.

    David

    @SQLTentmaker

    “He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

  • DTA's got two undocumented features that it uses to do evaluations of cost improvement. One is the hypothetical index (an index that has all the metadata and stats of a regular index, but doesn't have the b-tree) and the second is a DBCC command and associated set option that tells the optimiser to optimise as if the hypothetical indexes were real.

    Hence what it can (and does) do is create various possible indexes, as hypothetical indexes, and then ask the optimiser to tell it the cost of a query using that hypothetical index.

    Run profiler while you're doing a DTA session to see exactly what it's up to. It's ... educational.

    Example:

    CREATE INDEX [_dta_stat_149575571_2_4] on [WebForums].[dbo].[Posts] ([ThreadID], [Poster]) WITH STATISTICS_ONLY = -1

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (1/13/2009)


    DTA's got two undocumented features that it uses to do evaluations of cost improvement. One is the hypothetical index (an index that has all the metadata and stats of a regular index, but doesn't have the b-tree) and the second is a DBCC command and associated set option that tells the optimiser to optimise as if the hypothetical indexes were real.

    Hence what it can (and does) do is create various possible indexes, as hypothetical indexes, and then ask the optimiser to tell it the cost of a query using that hypothetical index.

    Run profiler while you're doing a DTA session to see exactly what it's up to. It's ... educational.

    Example:

    CREATE INDEX [_dta_stat_149575571_2_4] on [WebForums].[dbo].[Posts] ([ThreadID], [Poster]) WITH STATISTICS_ONLY = -1

    Let me restate "Ask Microsoft or Gail"

    There, that's better.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • DavidB (1/13/2009)


    Understood. Yeah, unfortunately there really is no cut and dry answer when it comes to performance tuning for queries hence why DTA can sometimes struggle. Ultimately what you are doing, testing with complex queries and figuring out the best way to tune both query and indexes, is the best thing to do as that will prepare for the complex queries.

    Grant actually wrote a great E-Book that was free for a long time, don't know if it is anymore (Grant?) on understanding the Execution plans. Very helpful when working through tuning efforts.

    I think the ebook is still available from Red Gate. There will be a new print run of the book coming out shortly.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (1/13/2009)


    Let me restate "Ask Microsoft or Gail"

    There, that's better.

    Nah. I know what it runs and vaguely what that does, but not the details. I mean, DBCC Autopilot????

    Besides, that's not how DTA figures out what indexes to add, that's how it figures out what cost improvements you're gonna see by adding them.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • DavidB (1/13/2009)


    Understood. Yeah, unfortunately there really is no cut and dry answer when it comes to performance tuning for queries hence why DTA can sometimes struggle. Ultimately what you are doing, testing with complex queries and figuring out the best way to tune both query and indexes, is the best thing to do as that will prepare for the complex queries.

    Grant actually wrote a great E-Book that was free for a long time, don't know if it is anymore (Grant?) on understanding the Execution plans. Very helpful when working through tuning efforts.

    I just got this link from Twitter.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply