Index Question

  • That makes sense! Let me give it a go. Thanks to all for your input. 🙂

  • Based on the IO statistics, it appears that the table has very few rows which causes the optimal SQL query plan to be a table scan via the clustered index. This is reflected in the statistics io output which shows more logical reads when using a non-clustered index.

    I note that all of the columns except the identity are define as NULL, which indicates a problem with the data model design that needs to be resolved.

    I have two recommendations:

    1. Populate the table with more data.

    2. Change the data model.

    Changing the data model includes:

    1. Drop the column [ValidationExceptionLogId] (the IDENTITY)

    2. Identify the columns that are required and set the "not null" property.

    3. Identify the combination of required columns that are unique and define a primary key constraint on these columns with the clustered option.

    Using expected data, run this SQL top get the count of distinct values for the columns:

    select 'ValidationExceptionId' , count(distinct ValidationExceptionId) from ValidationExceptionLog

    union all

    select 'ValuationPoint' , count(distinct ValuationPoint) from ValidationExceptionLog

    union all

    select 'ValuationPointDate' , count(distinct ValuationPointDate) from ValidationExceptionLog

    If the primary key columns include at least one of where columns, then define the primary key columns in order with the where column with the most distinct values as the first column.

    For example, if the primary key columns include [ValuationPointDate] , [ValuationPoint] but not [ValidationExceptionId] and column [ValuationPointDate] has the most distinct values followed by [ValuationPoint], then define the primary key columns as ([ValuationPointDate] , [ValuationPoint], {other primary key columns} )

    If none of the columns within the where clause are columns within the primary key, only then should a secondary index should be defined.

    SQL = Scarcely Qualifies as a Language

  • Carl, I did exactly as you said, and I can see the improvement. I now have a Clustered Index on ValuationPoint, ValuationPointDate and ValidationExceptionId in that ordered and I have removed all other indeces. The execution plan shows a Clustered Index seek now and the logical reads are almost half the count now. Thanks

    Would this type of Index affect the performance of inserts?

  • It will affect inserts, but how much will depend on a couple of things, including how often you defragment the clustered index, how you set the fill factor on the index, how big individual rows are (compared to page size), and how frequent inserts are.

    Most likely, if you maintain the index even reasonably well, your inserts should be just fine.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • "Would this type of Index affect the performance of inserts?"

    As Gail Shaw indicated, there are many factors to answer this question and two that should be answered first are:

    1) Will the value of the primary key for new rows always be greater than any value in the table?

    2) Will there be updates that increase the row size?

    When Yes and No, (append inserts and no row size increases), then the clustered index fill factor should be set to 100% (no free space), which will pack the rows into the least amount of space. The low space utilization for the same amount of data results in less physical IO and less memory utilization.

    When other than Yes and No: (random inserts OR row size increases), then the clustered index fill factor needs to be determined and may need to be greater than the default of 90%.

    If there are any other tables within the same filegroup, significant extent fragmentation can occur, which will affect performance. If performance is critical for this table, consider creating a dedicated filegroup for just this table. Having the filegroup consist of multiple files can also increase thruput and increase the thruput of parallelism.

    For non-clustered indexes, the same two questions need to be ansered, the fill factor set appropriately, and a dedicated filegroup may be needed.

    SQL = Scarcely Qualifies as a Language

Viewing 5 posts - 16 through 19 (of 19 total)

You must be logged in to reply to this topic. Login to reply