October 23, 2006 at 8:46 am
Hi,
I've got a general query that I hope someone could advise on.
I've taken over looking after a server which has a few databases on it. One of the databases has a table in it that had 10 million rows. This table also has 31 non clustered indexes. A process runs each night that inserts and updates about 100,000 records in this table. It takes a long time to run the processing (around 8 hours) and so I've been checking the indexes (amongst other things). Running dbcc show_statistics on the indexes show that they all return very poor Density results (ranging from 0 to 1.27E-05.) I also checked the number of unique records in each column that has an index on it and they range from 3 to 755794 unique values (There is a unique key column that I’ve ignored from the show_statistics because it looks ok).
I’ve always understood that index should only be used on columns that have at the most 90% unique values. I’ve run some testing on a backup of the database and dropped these indexes. The processing ran in 30 mins instead of 8 hours so it would appear to be a huge improvement for the inserts and updates. I also understand that if the query would return more than 10% of a table then the index would be ignored and a table scan used instead. Seeing as all indexes aren’t very selective and return a high number of rows then they would never be used.
Now my question – Is there any reason in having an index on a column that only has at best 10% unique values (that’s a column with 900,000 unique values in a table with 9,000,000 rows)?
Any advice would be appreciated and thank you for your time in reading this.
October 23, 2006 at 9:08 am
Not knowing anything about the application I can't say whether all of the indexes are needed or not. I can say that I have put indexes on fields that are not very unique. In one instance it was a yes/no flag on a 12 million row table. The process was checking the flag and only processesing "no". The majority of the table was "yes", so in my case, instead of doing a table scan and having to scan all of the yes's also, it did an index scan and only processed the no flags. It saved a huge amount of time. My guess in your situation is the last person running sql server read an article that indexes were good and decided to put indexes everywhere. I would run the app against your test database and re-tune the indexes, by first dropping them all and then adding them as needed.
Tom
October 23, 2006 at 9:28 am
Thanks for the reply.
I'm going to try running the app against it shortly. Did you have to tell your query to use the index when you had a yes/no field? I thought that the query would use a table scan if the index returned more than 10% of a table?
October 23, 2006 at 9:31 am
I tend to agree with Tom that there are places where low selectivity can still benefit from indexes. Usually it's in a place where you can still eliminate a large number of values. I've often included low selectivity columns in a compound indeex because it could help limit result sets.
Part of your load issue is the indexing. The other is that 31 indexes is a lot. I'd really dig into being sure that these are needed.
October 23, 2006 at 9:50 am
I have had to put index hints in on occassion. The reason being what other data elements I was bringing back. Bookmark lookups are very expensive to the optimizer, so when you need all of the columns returned, the optimizer will use a table scan on the clustered index to eliminate the bookmark lookup. Depending on the situation and the data elements needed, I will either force an index or create a covering index. Covering indexes are indexes that contain all of the data elements that the query is asking for. A quick example would be an invoice table with the following columns, invoiceid, customer, processed, processeddate, amount, ...
A query selecting sum(amount) from invoice where processed = 'no'
I would build an index on processed and amount, as opposed to just processed. This eliminates the bookmark lookup.
Hope this helps.
Tom
October 24, 2006 at 1:14 am
Thanks for all the replies - I have a better understanding of why some of the indexes are used now. I only have to work out which ones are no longer required.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply