optimal indexes for a table that is read frequently

  • Hi, we have a history table in our system that we insert records into it once every 2 weeks but we query the table almost every day. So, we perform more reads on that table than writes. Recently, when we are inserting records into the table, it is timing out. we are thinking it is because of having too many indexes. So, I was trying to consolidate those indexes into few and have some questions that I am hoping you might help.

    Some of the columns in that table are company, empno, batch, batch date, check date, charge date, post type, post code, check number etc.

    Sometimes the users would query that table using filter company, empno. Sometimes they query using filter company, batch. Some times company, batch date.

    Is it better to create one index that has all these columns and sql server would be able to use the index efficiently based on what we specify in the where clause? For example, If I create index on Company, Empno, Batch, BatchDate, CheckDate and if I have queries as

    select * from table where company = @company and Empno = @Empno

    select * from table where company = @company and Batch = @Batch

    Will these two queries use the same index efficiently or Is it better to create separate indexes (one by empno, one by batch)?

    Thanks in advance.

  • Sridhar-137443 (1/6/2014)


    Hi, we have a history table in our system that we insert records into it once every 2 weeks but we query the table almost every day. So, we perform more reads on that table than writes. Recently, when we are inserting records into the table, it is timing out. we are thinking it is because of having too many indexes. So, I was trying to consolidate those indexes into few and have some questions that I am hoping you might help.

    Some of the columns in that table are company, empno, batch, batch date, check date, charge date, post type, post code, check number etc.

    Sometimes the users would query that table using filter company, empno. Sometimes they query using filter company, batch. Some times company, batch date.

    Is it better to create one index that has all these columns and sql server would be able to use the index efficiently based on what we specify in the where clause? For example, If I create index on Company, Empno, Batch, BatchDate, CheckDate and if I have queries as

    select * from table where company = @company and Empno = @Empno

    select * from table where company = @company and Batch = @Batch

    Will these two queries use the same index efficiently or Is it better to create separate indexes (one by empno, one by batch)?

    Thanks in advance.

    This is a pretty involved question. The answer (as you probably expected) is "it depends".

    The first thing I noticed is that you're using SELECT * instead of selecting only the columns you need. For the non-filter fields you need returned, include them in the index by using INCLUDE when creating the index. Your goal is to make the index cover the query.

    Next, the company column appears to be the common column used. If this is indeed the case, that should be the left-most column of the index. In theory, the similar queries will be able to benefit from the one index. However, you're going to have to test your different query variations to make sure the performance is there. If you have so many variations that it becomes impractical to cover them all, Gail has an excellent article on catch-all queries at http://sqlinthewild.co.za/index.php/2009/03/19/catch-all-queries/.

    Lastly, when you load your data, make sure your indexes don't become fragmented. Also make sure your statistics are updated. This will help the query optimizer make the best decisions when querying the table.

    Above all, I cannot overemphasize the importance of testing. Test, test, test whatever you come up with.

  • Thank you for the reply. I only specified select * here because I didn't want to type all column names. we usually specify column names.

  • Sridhar-137443 (1/6/2014)


    Hi, we have a history table in our system that we insert records into it once every 2 weeks but we query the table almost every day. So, we perform more reads on that table than writes. Recently, when we are inserting records into the table, it is timing out. we are thinking it is because of having too many indexes. So, I was trying to consolidate those indexes into few and have some questions that I am hoping you might help.

    Some of the columns in that table are company, empno, batch, batch date, check date, charge date, post type, post code, check number etc.

    Sometimes the users would query that table using filter company, empno. Sometimes they query using filter company, batch. Some times company, batch date.

    Is it better to create one index that has all these columns and sql server would be able to use the index efficiently based on what we specify in the where clause? For example, If I create index on Company, Empno, Batch, BatchDate, CheckDate and if I have queries as

    select * from table where company = @company and Empno = @Empno

    select * from table where company = @company and Batch = @Batch

    Will these two queries use the same index efficiently or Is it better to create separate indexes (one by empno, one by batch)?

    Thanks in advance.

    Indexes may or may not be the problem. Unless you've absolutely confirmed that indexes are the cause of the timeouts, I wouldn't do anything with indexes, yet.

    1. What column(s) is the clustered index on and what are their datatypes?

    2. For the inserts over time, which columns would describe the order of inserts?

    3. How many distinct companies are there in this table?

    4. How many rows are in the table?

    5. How many rows are in a typical insert?

    6. Are there any triggers, indexed views, of FKs associated with this table?

    7. How many indexes do you actually have on the table?

    8. Other than the ones that starte with "Company", do you have any other low cardinality indexes? If so, what is their definition?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi, Thank you for the response. Please see my answers

    1. What column(s) is the clustered index on and what are their datatypes?

    we have a column named "Uniqueid". It is datatype int. we have primary key on that column

    2. For the inserts over time, which columns would describe the order of inserts?

    We have clustered index on uniqueid and it is identity field.

    3. How many distinct companies are there in this table?

    It depends on the client database. Most of the clients have one company. some clients have 2 or 3 companies. One client has 20 companies.

    4. How many rows are in the table?

    2019332

    5. How many rows are in a typical insert?

    In this database, it is usually around 6000 rows.

    6. Are there any triggers, indexed views, of FKs associated with this table?

    No

    7. How many indexes do you actually have on the table?

    we have 17 indexes on the table

    8. Other than the ones that starte with "Company", do you have any other low cardinality indexes? If so, what is their definition?

    every time we insert, we have a batch number. So, all those 6000 records have same batch. The next low cardinality is post type.

  • Based on what you've said so far, I assume you've identified the queries that are running the longest and/or are most frequently called. Have you then looked at the execution plans for the these queries to understand how the optimizer is resolving them? You say you have 17 indexes in place (possibly a very high number of indexes), are those indexes getting used appropriately in support of your queries? Are you queries written in a way that they can take advantage of the indexes you already have in place?

    Considering the number of indexes you already have, before you start adding more or rearranging the ones you have, understand how they are used, if they are used. That knowledge will guide you better to make decisions on changes.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Sridhar-137443 (1/6/2014)


    Hi, Thank you for the response. Please see my answers

    1. What column(s) is the clustered index on and what are their datatypes?

    we have a column named "Uniqueid". It is datatype int. we have primary key on that column

    2. For the inserts over time, which columns would describe the order of inserts?

    We have clustered index on uniqueid and it is identity field.

    3. How many distinct companies are there in this table?

    It depends on the client database. Most of the clients have one company. some clients have 2 or 3 companies. One client has 20 companies.

    4. How many rows are in the table?

    2019332

    5. How many rows are in a typical insert?

    In this database, it is usually around 6000 rows.

    6. Are there any triggers, indexed views, of FKs associated with this table?

    No

    7. How many indexes do you actually have on the table?

    we have 17 indexes on the table

    8. Other than the ones that starte with "Company", do you have any other low cardinality indexes? If so, what is their definition?

    every time we insert, we have a batch number. So, all those 6000 records have same batch. The next low cardinality is post type.

    Considering the extremely low selectivity (cardinality) of the Company column, Batch Number column, and Post Type, I can actually see a timeout occuring during the extent splits that are occuring on indexes that use those columns as the "left most" column in the index. It might be worthwhile to disable those indexes (all of which would be NCIs according to what you've described as the PK) during the insert and rebuild them after the insert. If you have the Enterprise Edition and there are no blobs in the indexes as INCLUDEs, you could rebuild the indexes using the ONLINE option. Be aware that index rebuilds are fully logged in the FULL recovery model. You might want to make a brief excursion to the BULK-LOGGED model with the additional understanding that any Point-in-Time log file backups that occur during such a minimally logged action will require an all or "stop" load if a restore is ever required.

    The good part would be that you wouldn't have to worry about REBUILDing the indexes after that.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I checked our sql server edition. It is standard edition. So, I don't have option of rebuilding indexes with ONLINE option. If I disable the indexes and enable the indexes after the insert, does it help? We are using this table in so many places. It would be hard for me to figure what queries are using that table and look at the execution plan.

    Please let me know your suggestions.

  • This will give you information about which indexes are used. Won't tell you what indexes you need but might tell you a few that you don't. Replace TABLENAME beow with your table name. it can also help you figure out what query is using what index without having to look at execution plans if you run the quer4y and check which index usage counts increased.

    SELECT

    i.name AS Index_Name

    ,i.type_desc AS Index_Type

    ,iu.user_seeks

    ,iu.user_scans

    ,iu.last_user_seek

    ,iu.last_user_scan

    from sys.dm_db_index_usage_stats iu

    inner join sys.indexes i ON i.object_id = iu.object_id AND i.index_id = iu.index_id

    where iu.object_id = object_id('TABLENAME')

  • Sridhar-137443 (1/7/2014)


    I checked our sql server edition. It is standard edition. So, I don't have option of rebuilding indexes with ONLINE option. If I disable the indexes and enable the indexes after the insert, does it help?

    Not so far as rebuilding ONLINE.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply