Index is not used while selecting

  • Hi,

    I have one table which has some 100+ columns. I have created a composite non-clustered index on 3 of the columns that are used frequently in the WHERE clause against that table. But when I run

    select * from table_name

    where column_name1 = 'value1'

    this query is not using the index; instead a full table scan is happening, and the plan shows a missing index on column_name1. But this is the very column that sits in the first position of the composite non-clustered index.

    I am just guessing: is it because I am writing select * instead of selecting only a few columns?

    Please shed some light on this issue.

    Regards,

    Snigdha

  • Not knowing your exact structure, I'm speculating some here...

    There are two kinds of indexes, clustered and non-clustered. You get one clustered index per table. If you're getting a table scan, you don't have a clustered index; you're dealing with a heap. You have a non-clustered index, but it's only on three columns. If you need to access data other than what is in those three columns, then the best you can expect is a seek against the index followed by a lookup (a RID lookup in this case) to get the rest of the data from the table. But, from the sounds of things, this column is not the first column of your composite index, but is 2nd or 3rd. If so, the index can't be used to seek on that column alone. You either need to rearrange the index so that this column is first, or you need to create another index with this column on the leading edge.
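
    As a rough sketch only (table_name and column_name1 are the placeholders from the original post, and column_name2 / column_name3 are hypothetical stand-ins for the other two indexed columns), an index with that column on the leading edge might look like:

    -- Hypothetical names throughout; substitute the real table and column names
    CREATE NONCLUSTERED INDEX IX_table_name_column_name1
        ON table_name (column_name1, column_name2, column_name3);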

    This is a huge topic by the way, with lots of details. If you're just getting started, I strongly recommend getting a copy of my book (in the signature below).

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • snigdhandream (2/2/2012)


    Hi,

    I have one table which has some 100+ columns. I have created a composite non-clustered index on 3 of the columns that are used frequently in the WHERE clause against that table. But when I run

    select * from table_name

    where column_name1 = 'value1'

    this query is not using the index; instead a full table scan is happening, and the plan shows a missing index on column_name1. But this is the very column that sits in the first position of the composite non-clustered index.

    I am just guessing: is it because I am writing select * instead of selecting only a few columns?

    Please shed some light on this issue.

    Assuming column_name1 is the leading column on the index, there are a couple of reasons why SQL Server would decide not to use it.

    Have you gathered (updated) statistics on the index? If not, SQL Server may assume a standard distribution of values.

    How many distinct values are there for column_name1?

    How many rows are there in the target table?

    How many rows in the table match column_name1 = 'value1'?
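
    A minimal sketch for checking those numbers, assuming the placeholder names table_name, column_name1 and 'value1' from the original post:

    -- Refresh the statistics the optimizer uses for its estimates
    UPDATE STATISTICS table_name;

    -- Total rows, distinct values of column_name1, and rows matching the predicate
    SELECT COUNT(*) AS total_rows,
           COUNT(DISTINCT column_name1) AS distinct_values,
           SUM(CASE WHEN column_name1 = 'value1' THEN 1 ELSE 0 END) AS matching_rows
    FROM table_name;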

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions in a non-production test environment, the advisor should not be held liable or responsible for any actions taken based on the given advice.
  • snigdhandream (2/2/2012)


    I have created a composite non-clustered index on 3 of the columns that are used frequently in the WHERE clause against that table.

    You might want to take a read through this: http://sqlinthewild.co.za/index.php/2009/01/19/index-columns-selectivity-and-equality-predicates/

    Unless that index starts with the column column_name1, the index is useless for that query.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
    It would help if you could show us the execution plan. But your guess is half right, and Grant has explained the other potential problem. If column1 is not the first column in your composite index, it will lead to index scans.

    Assuming both assumptions are correct, here's what is happening.

    1) You are selecting columns that are not part of your index. So it will have to do a lookup.

    2) column1 is not the first column (leading column) in your index. This will lead to an index scan on that index.

    The optimizer is calculating that doing a table scan is better and faster than an index scan plus a lookup afterwards.

    To fix this, first figure out which columns you actually need to select. Then create an appropriate index on column1 which includes those selected columns, as sketched below.
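
    A hedged sketch of that, assuming (purely for illustration) the query really only needs column_name1 plus two other placeholder columns:

    -- Narrow the select list to what is actually needed
    SELECT column_name1, column_name2, column_name3
    FROM table_name
    WHERE column_name1 = 'value1';

    -- Covering index: seek on column_name1, carry the other columns in the leaf level
    CREATE NONCLUSTERED INDEX IX_table_name_covering
        ON table_name (column_name1)
        INCLUDE (column_name2, column_name3);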

    Blog
    http://saveadba.blogspot.com/

  • Hi everybody,

    Thanks for your replies. Let me make the scenario a little clearer now. column_name1 is the leading column in the non-clustered composite index. The table has around 7000 rows.

    I also checked that when the result set returns some 800 or 900 rows, a full table scan happens, and when it returns some 10-12 rows, the index is used. I understand this choice is made by the database engine. My problem is that I need to improve the query performance, but this way it is not improving. For some values the table scan is faster (which is what was happening earlier) and for some the index seek is faster; it depends on the value provided in the WHERE clause. When my client checks, the average results are not much improved. Any suggestions?

    Regards,

    Snigdha

  • When you're returning 1/7 of the table, just scanning it makes more sense for the optimizer. I suspect that's the primary issue, but without details, we're still stabbing in the dark here.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

    If the stats are updated, the index contains all the columns that are selected, and the column in the WHERE clause is the leading column in the index, then you have taken care of everything required to encourage index seeks.

    But if, in spite of that, the optimizer thinks it is better to do a table scan under certain conditions, then I think it is making the right choice.

    How about you check the query timing by providing a query hint to use the index? Check whether that gives you good run times on average.

    I do not recommend that you keep the index hint, because the optimizer knows better, but there is nothing wrong with testing it; see the sketch below.
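
    A rough test sketch, assuming the index is named IX_table_name_column_name1 (a hypothetical name; substitute your real index name):

    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;

    -- Let the optimizer choose the plan
    SELECT * FROM table_name
    WHERE column_name1 = 'value1';

    -- Force the non-clustered index, for comparison only
    SELECT * FROM table_name WITH (INDEX (IX_table_name_column_name1))
    WHERE column_name1 = 'value1';

    Compare the logical reads and elapsed times of the two runs before deciding anything.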

    Blog
    http://saveadba.blogspot.com/

  • snigdhandream (2/2/2012)


    Thanks for your replies. Let me make the scenario a little clearer now. column_name1 is the leading column in the non-clustered composite index. The table has around 7000 rows.

    I also checked that when the result set returns some 800 or 900 rows, a full table scan happens, and when it returns some 10-12 rows, the index is used. I understand this choice is made by the database engine. My problem is that I need to improve the query performance, but this way it is not improving. For some values the table scan is faster (which is what was happening earlier) and for some the index seek is faster; it depends on the value provided in the WHERE clause. When my client checks, the average results are not much improved. Any suggestions?

    Assuming the query cannot be improved by adding a second (and third) column to the predicate and creating a new index to serve it... you can always keep the data in the buffer cache by forcing a full table scan every now and then; the frequency will depend on your particular needs.
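
    One hedged way to do that is a scheduled "touch the table" query; this is a sketch only, and WITH (INDEX(0)) is used so the scan hits the heap or clustered index rather than a smaller non-clustered index:

    -- Reads every page of the base table so it stays in the buffer cache
    SELECT COUNT_BIG(*)
    FROM table_name WITH (INDEX(0));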

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions in a non-production test environment, the advisor should not be held liable or responsible for any actions taken based on the given advice.
  • snigdhandream (2/2/2012)


    Hi everybody,

    Thanks for your replies. Let me make the scenario a little clearer now. column_name1 is the leading column in the non-clustered composite index. The table has around 7000 rows.

    I also checked that when the result set returns some 800 or 900 rows, a full table scan happens, and when it returns some 10-12 rows, the index is used.

    Stop using select *?

    This makes perfect sense to me. When it needs a very limited number of rows, it performs a seek on the index and then a lookup against the base table. When you're pulling back 1/7 of the table, it just goes to the source in the first place.

    If you want that index to (almost) always be used, it needs to be covering, which means it needs to include all the columns being returned (see the INCLUDE component of CREATE INDEX) as well as what's being seeked on. That index will still scan at 1/7th of the table, but you'll restrict the data it needs to review.

    However, if you're returning a SELECT *, you're always going to be returning the clustered index's volume of data. You're pretty much hosed here without some redesign around the business requirements. A simple index can't save you in this case; it's just too much data on average.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA

    This column has some 350 distinct values among 7000 rows in total. Is it a good candidate for creating an index on it? It also has around 3000 NULL values.

    Regards,

    Snigdha

    Nope. That's not very selective at all. The general recommendation is that the data be about 1% selective, meaning 1 distinct value out of 100 rows. In practice it's usually even more stringent. You're looking at about 5% if we include the NULL values. Excluding those, the results get worse. This just doesn't sound like good data for an index.
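
    A small sketch for checking that density yourself, assuming the placeholder names from the thread (COUNT(column_name1) counts only the non-NULL rows):

    -- Distinct values as a fraction of rows, with and without the NULLs
    SELECT COUNT(DISTINCT column_name1) * 1.0 / COUNT(*) AS density_including_nulls,
           COUNT(DISTINCT column_name1) * 1.0 / NULLIF(COUNT(column_name1), 0) AS density_excluding_nulls
    FROM table_name;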

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thank you Grant for the prompt response.

    Regards,

    Snigdha
