We all know indexes are good and I’m hoping everyone knows you can have too many indexes. That means we should be careful when adding new indexes right? So when should we add a new index?
Here are my general rules of thumb, although of course, you should always use your best judgment. Also, this is for OLTP systems (ex: data entry systems) not OLAP systems (ex: data warehouses).
- No Clustered Index: You really should add a clustered index. Clustered indexes are important for a number of reasons, so if you don’t have one spend some time, figure one out and create it. When possible I like dates as a clustered index but of course, it’s highly specific to the table you are working on. There are a few types of tables, some load tables, for example, were not having a clustered index is appropriate but they are few and far between.
- 1-5 indexes: Is this index going to help performance? If so add after running some tests in your dev/test systems.
- 5-10 indexes: At this point, we need to be more careful. Do we have any indexes that aren’t being used? Are there any that can be combined? Once all of this has been done then we might as well add the index. We are starting to have a few too many indexes but it’s not entirely unreasonable.
- 10+ indexes: We are now starting to hit the too many indexes category. Same reviews as above but more urgency about getting rid of unused indexes and add in getting rid of underused indexes (few reads but still getting writes). At this point, I’m also a bit more cautious about adding indexes. It had best be for an important query.
There are a few important questions you may have at this point:
- Finding unused/underused indexes: This is reads vs writes. Have there been any reads on the index? How do the reads compare to the writes? There are a couple of ways to check this. You can directly check sys.dm_db_index_usage_stats or use something like sp_blitzindex. There are also a fair number of canned queries floating around that check reads vs writes.
Unused indexes are ones with 0 reads. Remember, though, that the DMVs get reset for a variety of reasons. You will occasionally run into indexes that show 0 reads because they haven’t been used since that reset. For example, indexes that are only used for month/quarter end queries. These indexes may still be important. It’s best to do this kind of work over time so you can see if the index is truly not used.
Underused indexes are basically the same situation, but you do have some reads, just far more writes than reads.
- Combining indexes: Can any of the indexes be combined? In other words, if have two indexes ix1: (col1, col2) and ix2: (col1) then why do we need ix2 at all? Same with indexes where they are indexed on the same column but with different includes.
A couple of notes: (col1, col2) is not the same as (col2, col1) and shouldn’t be combined. Also if you have an index on (col1) you need to think carefully before combining it with (col1, col2, col3, col4, col5, col6). One of the factors taken into account with indexes is the size of the index. A 4 byte wide index is going to be smaller/faster than a 100 byte wide index. The 4 byte wide index is likely to be picked over the 100 byte one, and for good reason. Although be warned that this is somewhat simplistic. There are other factors involved and you definitely want to test the effect of removing an index that is actually being used.
- Important queries: What exactly is an important query? Well, does this query run 100 times a second at peak times? Then yes, it’s important. How about if it runs once a month and this index will shave an hour off of a 4 hour query? Well, if it’s being run by the CEO, then yes, it’s important. You get the idea.
A couple of other thoughts. If this index isn’t helping then don’t add it. Adding random indexes doesn’t help and it adds to your maintenance costs (reindexes for example).
Also if at all possible (I understand time constraints) do regular reviews of your indexes. Removing unused/underused indexes, combining indexes etc is also an important part of index maintenance.
And lastly remember that these are rules of thumb not rules. Each individual case is just that, individual.
Filed under: Index, Microsoft SQL Server, Performance, SQLServerPedia Syndication Tagged: indexes, microsoft sql server, Performance