General Index Questions

  • Does a composite index serve a query well that is only using one of the fields in the index?  Should a separate index be created?

    When a table is truncated and re-populated, what happens to the indexes?

    When a table is modified with deletes, inserts and updates, what happens to the indexes?  If these operations occur daily in a DTS package, should one drop the indexes and re-create them?  At what point?  Why?  Is the logic different for my PK vs. unique/non-clustered vs. non-clustered?  I would like to hang on to PK's or uniques while inserting to fail on dupes.  Am I taking a performance hit on this? 

    Where can I find a primer on managing indexes?

    [font="Courier New"]ZenDada[/font]

  • I can't really help you with a primer. My knowledge is from a wide variety of sources over time.

    First, keep the PK composite or not. Performance tweaks should never compromise data integrety. There are other ways.

    When a table is truncated all pages for the table and its index(es) are released. When population begins new pages are allocated as needed. If the table is clustered the re-population could cause imediate page fragmentation depending upon the order you are inserting the data.

    When you delete rows it creates holes in the index(es) and in the data pages. As you insert new rows, those holes can be re-used depending upon the row data fitting with the sort of the index and the sort on the clustering index if present. Updates are in in-place operation and generally a non-impact on the index(es). Exceptions are that if the index columns themselves are updated it may force them to be moved to a different location in the index which is basically an atomic delete/insert. Also, if an index column is variable length and is updated such that the new value no longer fits on the page (not enough free space remaining to make room) it will force a page split.

    As of SQL Server 2000 I don't know of any reason to ever explicitely drop/create indexes for daily maintenance. However, if the repopulation process causes a lot of page fragmentation (DBCC SHOWCONTIG) you will either want to DBCC INDEXDEFRAG or DBCC REINDEX.

    BOL for one of the above DBCC commands has a script that will only defrag/reindex based on how fragmented the index/table is. If the tables in question are small you may want to just unconditionally reindex/defrag them rather than spend the time assessing status since it is rebuilt every day by a batch process.

    PK is basically just shorthand for a unique index. In human terms it is the primary unique index if more than one exists. Internally it is handled no different than any other unique key for your purposes.

    Clustering impacts that data pages more than the index pages. You can't defragment data pages unless there is a clustering index (or you cludge some manual process together). Following is a good artical I keep in my virtual briefcase from this site regarding clustering.

    http://www.sqlservercentral.com/columnists/chedgate/clusterthatindex.asp

    Now for the first question. There are two primary advantages to indexes. First, they provide a sorting mechanism that allows SQL Server to "seek" pertanant rows instead of "scan"ing for them. The later requires SQL Server to walk the entire index, the former allows SQL Server to know at some point that continuing through the index is pointless. Sort order is the primary driver for that knowledge. So, if your where clause does not reference the first column in an index then SQL Server will need to scan the index because the column being used is effectively not sorted in that instance. If you do provide a condition for the first column in the index in addition to the second column then SQL Server can seek out the relavent row(s). If you must perform a scan (the first column can't be known versus the coder was lazy) then a new index minus the first column will allow a seek. However, every index you add creates overhead for INSERT, UPDATE, and DELETE operations and induces additional data storage requirements. If the table is only modified at night in a batch process and is used read-only during the day then additional indexes are more warrented and easier to justify. If the batch process is extremely intensive and performance becomes a problem at night you could (non ideal) drop the non-unique indexes, modify the table, then re-create the non-unique indexes.

    Based on the above, take care of the ORDER in which you define index columns. It often matters significantly.

  • Thank you, Aaron!  I will share this with my team!

    [font="Courier New"]ZenDada[/font]

  • As for the question:

    Does a composite index serve a query well that is only using one of the fields in the index?  Should a separate index be created?

     

    The answer is maybe.

    If the item is your clustered index you may potentially want to rethink especially if the composite data width is great, the reason for this is every non-clustered index references the clustered index by value, so you end up causing your non-clustered indexes large growth with large composite clustered indexes.

    In a composite index (it is my understanding from MS and I don't think has changed) only the first columns statistics are stored, so if you include say a tinyint column with limited values as the first column then your index may be more often avoid than used. Always in a composite index place the most unique column first, then it doesn't really mater on the rest.

    Now if the one field be used is the first column in the index it will be bennificial as it will be able to perform a seek in the index, if not it will tend to opt for an index scan which is slower performance overall. If this is the case you may want to create seperate indexes, especially if you are not using but one of the columns at any given time. That said if this is a unique constraint or index you may want to keep the composite index and create another index for any column involved that is not the first column especiialy when that column is used more often for queries. The downside there thou is it will be increase in time on inserts, deletes and if column in index or clustered index is altered by an update it will be processed as well.

    So again the answer is maybe.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply