Why Index scan?

  • I have 2 identical queries running on 2 identical tables with almost the same number of rows in all 4 tables. The first one uses Index seek and runs for 3 sec. The second one - take 5 min because of the Index scan it does. Of course, I have indexes on both tables on attr/num_type_id and I did rebuild all indexes, updated statistics.

    Please, help me to find out why is doing that.

    First BAD case:

    declare @P1 int

    declare @P2 int

    set @P2=4

    declare @P3 int

    set @P3=1

    declare @P4 int

    set @P4=-1

    exec sp_cursoropen @P1 output,

    N'SELECT attribute.attr_id AS attr_id,

    attribute.dsrc_acct_id AS dsrc_acct_id,

    attribute.attr_type_id AS attr_type_id,

    attribute.attr_value AS attr_value,

    attribute.sys_delete_dt AS sys_delete_dt,

    attr_type.attr_type AS attr_type

    FROM attribute, attr_type

    WHERE attribute.attr_type_id = attr_type.attr_type_id

    AND dsrc_acct_id = 7056 AND attr_id <= 11997
    AND (sys_delete_dt IS NULL OR sys_delete_dt > 0)

    ORDER BY attr_type_id, attr_value'

    , @P2 output, @P3 output, @P4 output

    select @P1, @P2, @P3, @P4

    -- Fetch the next 3 lines

    EXEC sp_cursorfetch @P1, 2, 1, 256

    -- Close the cursor

    EXEC sp_cursorclose @P1

    GOOD case:

    declare @P1 int

    declare @P2 int

    set @P2=4

    declare @P3 int

    set @P3=1

    declare @P4 int

    set @P4=-1

    exec sp_cursoropen @P1 output,

    N'SELECT nums.num_id AS num_id,

    nums.dsrc_acct_id AS dsrc_acct_id,

    nums.num_type_id AS num_type_id,

    num_type.num_type AS num_type

    FROM nums, num_type

    WHERE nums.num_type_id = num_type.num_type_id

    AND dsrc_acct_id = 10115 AND num_id <= 0
    AND (sys_delete_dt IS NULL OR sys_delete_dt >0)

    ORDER BY num_type_id, num_value'

    , @P2 output, @P3 output, @P4 output

    select @P1, @P2, @P3, @P4

    -- Fetch the next 3 lines

    EXEC sp_cursorfetch @P1, 2, 1, 256

    -- Close the cursor

    EXEC sp_cursorclose @P1

    Thanks a lot,mj

  • >>with almost the same number of rows in all 4 tables.

    The contents of the rows is more relevant than the number of rows. The tables are *not* the same contents (obvious by the different hard-coded constants in your WHERE), therefore the distribution of data in indexes is not the same, therefore an index that might be highly selective in 1 case and useful for a seek is not selective enough in the other cases and not used. It's data dependant, and we haven't even seen your DDL yet, let alone data samples.

    And then there's the question "What are you trying to do" ? Why are you using a cursor to select into simple scalar variables ?

  • I agree the queries are not the same .. I'd suggest you use showplan etc. to see what indexes etc. are being used and the estmated result set.

    The question is too vague to be precise with an answer - learning how to troubleshoot queries is part of DBA life anyway. Secondary indexes are very selective, typically >90% which may be part of the problem. You may find a clustered idnex or additional secondary indexes(s) may help.

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • Number of records is not the most significant measure of index performance - it is selectivity of data.  If your attribute table contains only a few different attribute types, while your number table contains many different numbers, the index on attribute will not be very helpful while the one on number will take you very quickly to a small set of data meeting your WHERE clause criteria.  This is most likely the cause of a SEEK in one case but a SCAN in the other.

    Index design is important, too.  When building a concatenated index (index on several fields) performance is much better when the most selective field is first, then less selective, and so on.  For example, assume your business makes 50 products sold by 400 sales people in 5 different regions.  If you have a table of products sold by employee with region code, your index should use employee id first as it cuts your data to a smaller subset than product, and certainly by region.  Your second field would be product, then region.

    Also, distribution of data matters - if your number table holds relatively few rows for most numbers, but say the number 42 has a multitude of records, you could see SEEK when your WHERE includes num = <something other than 42> but a SCAN when WHERE includes num = 42.

    Similarly, from Books on Line:

    Examine data distribution in indexed columns. Often, a long-running query is caused by indexing a column with few unique values, or by performing a join on such a column. This is a fundamental problem with the data and query, and usually cannot be resolved without identifying this situation. For example, a physical telephone directory sorted alphabetically on last name will not expedite locating a person if all people in the city are named Smith or Jones.

    Recommended actions:

    1 - Check your indexes and verify the most selective field is first, then second most selective, etc.

    2 - Check for a concatenated index (one on several fields) that matches the fields in your WHERE clause and any joins.  Concatenated indexes will perform better than several indexes on individual fields.

    3 - Check for skewed data, such as in the examples of 42 or Jones (Sorry, Steve, if you're reading this!) above.

    Hope this helps!

    Carter



    But boss, why must the urgent always take precedence over the important?

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply