SHOWCONTIG and Index Fill Factor

  • I am working on optimizing a query and I believe I have done as much as I can on the query itself... at least, with my current level of knowledge; it is time to look at SQL Server and what I can do with indexes, stats, etc. I have the query running at 1 min 8 seconds (down from 4 min 44 seconds). Of course, the boss wants "all queries" to be running at 5 seconds or less (yeah, right!).

    I was looking at one of the clustered indexes and noticed the default fill factor was set to default to 100%:

    DBCC SHOWCONTIG scanning 'CustTable_Apr08' table...

    Table: 'CustTable_Apr08' (957558795); index ID: 1, database ID: 20

    TABLE level scan performed.

    - Pages Scanned................................: 87387

    - Extents Scanned..............................: 11060

    - Extent Switches..............................: 11246

    - Avg. Pages per Extent........................: 7.9

    - Scan Density [Best Count:Actual Count].......: 97.13% [10924:11247]

    - Logical Scan Fragmentation ..................: 1.87%

    - Extent Scan Fragmentation ...................: 15.05%

    - Avg. Bytes Free per Page.....................: 105.1

    - Avg. Page Density (full).....................: 98.70%

    I modified the fill factor on the index and set it to 90%, then re-ran showcontig:

    DBCC SHOWCONTIG scanning 'CustTable_Apr08' table...

    Table: 'CustTable_Apr08' (957558795); index ID: 1, database ID: 20

    TABLE level scan performed.

    - Pages Scanned................................: 95008

    - Extents Scanned..............................: 11876

    - Extent Switches..............................: 11875

    - Avg. Pages per Extent........................: 8.0

    - Scan Density [Best Count:Actual Count].......: 100.00% [11876:11876]

    - Logical Scan Fragmentation ..................: 0.00%

    - Extent Scan Fragmentation ...................: 0.24%

    - Avg. Bytes Free per Page.....................: 746.1

    - Avg. Page Density (full).....................: 90.78%

    While the fragmentation has gone down, the pages scanned has gone up, as has the extents scanned. I think I just exchanged one problem for another here?

    Before I write a query to modify all of the fill factors on the partitioned tables, I would appreciate any input or feedback on the above showcontig results. 🙂

    SQL Server 2005 SP2 Std Edition

    This particular database is roughtly 350 GBytes

    Normally, the DB would received inserts weekly, but this particular database receives updates for about 2 weeks every quarter, than stays relatively stagnant. When the database is updated, it received an entire quarters worth of data, which accounts for about 150 GBytes of new records. If any new data is put into the above DB, it might just have new data inserted in the "Apr08" table...

    A similar database (not as large that has weekly inserts) with the same index looks looks as follows... however, the Apr08 table is probably done with any new data inserts:

    DBCC SHOWCONTIG scanning 'CustTable_Apr08' table...

    Table: 'CustTable_Apr08' (638625318); index ID: 1, database ID: 43

    TABLE level scan performed.

    - Pages Scanned................................: 3277

    - Extents Scanned..............................: 416

    - Extent Switches..............................: 415

    - Avg. Pages per Extent........................: 7.9

    - Scan Density [Best Count:Actual Count].......: 98.56% [410:416]

    - Logical Scan Fragmentation ..................: 1.25%

    - Extent Scan Fragmentation ...................: 0.96%

    - Avg. Bytes Free per Page.....................: 60.3

    - Avg. Page Density (full).....................: 99.25%

    Argue for your limitations, and sure enough they're yours (Richard Bach, Illusions)

  • Actually when you changed the fill factor of the index, you effectively rebuilt it. The reason why you're seeing a higher pages scanned count is simply a factor that the page leaves are more tightly compacted together as a result of the rebuild of the index. The denser the pages are compacted together, the deeper the scan density, the more pages get scanned by the DBCC SHOWCONTIG command.

    Changing the fill factor should only be done if you're seeing a lot of page splitting and fragmentation levels rising over a short period of time, caused by insert, update, and delete operations. If you see page splits going up, that means the average page fullness is insufficient to hold a new or modified index value, which leads to decreased performance for write-based operations. So yes, in effect, as you say, you may be trading one issue for another. The only way to be sure is to watch your performance counters.

    Having said all that, you really need to use the new dynamic management function sys.dm_db_index_physical_stats to look at index fragmentation of a particular index. The older DBCC SHOWCONTIG command is not only depricated, but some documentation actually indicates that it may provide false readings for fragmentation in the SQL Server environment. Books Online sates that...

    Unlike DBCC SHOWCONTIG, the fragmentation calculation algorithms in both cases consider storage that spans multiple files and, therefore, are accurate.

  • Yes - use sys.dm_db_index_physical_stats

    something like

    select Object_Name(dmv.Object_ID) as TableName

    ,si.Name as Index_Name

    ,Avg_Fragmentation_In_Percent

    ,Fragment_Count

    ,Avg_Page_Space_Used_In_Percent

    ,Record_Count

    --*

    From sys.dm_db_index_physical_stats (

    db_id()

    ,Null

    ,Null

    ,Null

    ,'Sampled'

    ) dmv

    join sys.indexes si

    on si.object_id = dmv.object_id

    and si.Index_ID = dmv.Index_ID

    Where is_Disabled = 0

    and dmv.Record_Count >= 100000

    and Fragment_Count > 10

    And either filter for your tables index(es) on the where clause or in the function call with the second parameter.

    However, I would tend to doubt that fillfactor is the primary performance problem in your query. Have you looked at the actual execution plan?

  • Thanks for the info.... started looking into DMV(s) and I noted the comment in the BOL regarding the SHOWCONTIG command. Still learning (thrown into DBA position here last year).

    I read a few SQLServerCentral articles, found one on indexes and fill so I thought I might try it.

    As for execution plan, I used (and am using) the "Include Actual Execution Plan" option in MSSMS. Ran the DTA last night to see what suggestions it might come up with... one new index suggestion (I dont *THINK* so!!!) and a few statistics for 7% improvement.

    Need to look more into the query itself. Unfortunately, for this customer the query spans 4 months of data and returns (in one test case) 55k records with 2-3 joins. Looking into correlated datetime columns feature right now, since most of the tables and queries are very date specific. 🙂

    Rebuilding/re-organizing the queries over the weekend (using a 30% defrag point for rebuild/reorganize decision) definitely had a huge impact on the query, as well as modification on some of the joins and join conditions (adding date to the condition) helped in some cases. 🙂

    One table join is accounting for 50% of the overhead in the query. No worries... just need to work in this more!

    Thanks again for the inputs! 🙂

    Argue for your limitations, and sure enough they're yours (Richard Bach, Illusions)

  • Post the query here if you want advice.

    Be careful of the tuning advisor's suggestions. It frequently suggests far more than is necessary.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Ok... here is the query:

    SELECT

    LO.Territory,

    IV.AccountNumber,

    CH.InvoiceItem, CH.InvoiceNumber,

    CH.NetAmount,

    CH.IncentiveAmount,

    AD.[Name], AD.Company, AD.AddrLine1, AD.AddrLine2,

    AD.City + N', ' + AD.State + N' ' + AD.PostalCode AS ConsigneeCityLine

    FROM VIEW_CustTable CH --WITH(NOLOCK)

    INNER JOIN MainInvoiceTbl IV --WITH(NOLOCK)

    ON CH.InvoiceNumber = IV.InvoiceNumber AND CH.InvoiceDate = IV.InvoiceDate

    LEFT OUTER JOIN StateDtl LO --WITH(NOLOCK)

    ON LO.InvoiceItem = CH.InvoiceItem

    LEFT OUTER JOIN VIEW_AllAddresses AD --WITH(NOLOCK)

    ON CH.InvoiceDate = AD.InvoiceDate AND

    CH.InvoiceNumber = AD.InvoiceNumber AND

    AD.InvoiceItem = CH.InvoiceItem

    AND AD.AddrType = '2' ---- Customer Addr only

    WHERE IV.InvoiceDate Between '04/01/2008' And '08/15/2008'

    And (IV.CustNumber Like '%12345' OR IV.CustNumber Like '%23456')

    AND (LO.Territory IS NULL AND 'CALIFORNIA' = '**TERRITORY NOT DEFINED **') OR LO.Territory = 'CALIFORNIA'

    AND CH.ChargeCatDtlCode = 'ADJUST'

    AND CH.InvoiceItem NOT IN (SELECT InvoiceItem from VIEW_CustTable CH

    where (ChargeDetailCode = 'ACCT'

    AND ChargeCatDtlCode IN ('ERROR','VOID')) )

    --AND AD.AddrType = '2'

    ORDER BY AD.Company DESC

    The problem, as indicated before, appears to be with the VIEW_AllAddresses join/table(s).

    Some information:

    VIEWS are unions of partitioned tables... i.e., VIEW_AllAddresses would be Select * from AllAddresses_Jan08 UNION SELECT * from AllAddresses_Feb08...

    Index on each AllAddresses table is a unique clustered index (IX_Index_AllAddresses_Jan08, etc.) consisting of InvoiceDate, InvoiceNumber, InvoiceItem,AddrType (in that order).

    Each Invoice/InvoiceItem in AllAddresses can contain multiple AddrTypes (1,2,3...8) but there can be InvoiceItems in CustTable but not in AllAddresses.

    The biggest problem I am seeing right now is the use of IX_Index_AllAddresses_xxxxx index(es) to fetch the address records, but it is going against each AllAddresses table (Jan08, Feb08, Mar08, etc.). For example:

    Argue for your limitations, and sure enough they're yours (Richard Bach, Illusions)

  • Here is the rest of the post, part of the execution plan:

    [/RelOp]

    [RelOp AvgRowSize="75" EstimateCPU="0.0001581" EstimateIO="0.003125" EstimateRebinds="3512.44" EstimateRewinds="0" EstimateRows="1" LogicalOp="Clustered Index Seek" NodeId="85" Parallel="false" PhysicalOp="Clustered Index Seek" EstimatedTotalSubtreeCost="5.80676"]

    [OutputList]

    [ColumnReference Database="[MyDB]" Schema="[dbo]" Table="[AllAddresses_Mar08]" Column="Name" /]

    [ColumnReference Database="[MyDB]" Schema="[dbo]" Table="[AllAddresses_Mar08]" Column="Company" /]

    [ColumnReference Database="[MyDB]" Schema="[dbo]" Table="[AllAddresses_Mar08]" Column="AddrLine1" /]

    [ColumnReference Database="[MyDB]" Schema="[dbo]" Table="[AllAddresses_Mar08]" Column="AddrLine2" /]

    [ColumnReference Database="[MyDB]" Schema="[dbo]" Table="[AllAddresses_Mar08]" Column="City" /]

    [ColumnReference Database="[MyDB]" Schema="[dbo]" Table="[AllAddresses_Mar08]" Column="State" /]

    [ColumnReference Database="[MyDB]" Schema="[dbo]" Table="[AllAddresses_Mar08]" Column="PostalCode" /]

    [/OutputList]

    [RunTimeInformation]

    [RunTimeCountersPerThread Thread="0" ActualRows="0" ActualEndOfScans="55114" ActualExecutions="55114" /]

    [/RunTimeInformation]

    [IndexScan Ordered="true" ScanDirection="FORWARD" ForcedIndex="false" NoExpandHint="false"]

    [DefinedValues]

    [DefinedValue]

    [ColumnReference Database="[MyDB]" Schema="[dbo]" Table="[AllAddresses_Mar08]" Column="Name" /]

    [/DefinedValue]

    [DefinedValue]

    [ColumnReference Database="[MyDB]" Schema="[dbo]" Table="[AllAddresses_Mar08]" Column="Company" /]

    [/DefinedValue]

    [DefinedValue]

    [ColumnReference Database="[MyDB]" Schema="[dbo]" Table="[AllAddresses_Mar08]" Column="AddrLine1" /]

    [/DefinedValue]

    [DefinedValue]

    [ColumnReference Database="[MyDB]" Schema="[dbo]" Table="[AllAddresses_Mar08]" Column="AddrLine2" /]

    [/DefinedValue]

    [DefinedValue]

    [ColumnReference Database="[MyDB]" Schema="[dbo]" Table="[AllAddresses_Mar08]" Column="City" /]

    [/DefinedValue]

    [DefinedValue]

    [ColumnReference Database="[MyDB]" Schema="[dbo]" Table="[AllAddresses_Mar08]" Column="State" /]

    [/DefinedValue]

    [DefinedValue]

    [ColumnReference Database="[MyDB]" Schema="[dbo]" Table="[AllAddresses_Mar08]" Column="PostalCode" /]

    [/DefinedValue]

    [/DefinedValues]

    [Object Database="[MyDB]" Schema="[dbo]" Table="[AllAddresses_Mar08]" Index="[PK_AllAddresses_Mar08_Addr]" /]

    [SeekPredicates]

    [SeekPredicate]

    [Prefix ScanType="EQ"]

    [RangeColumns]

    [ColumnReference Database="[MyDB]" Schema="[dbo]" Table="[AllAddresses_Mar08]" Column="InvoiceDate" /]

    [ColumnReference Database="[MyDB]" Schema="[dbo]" Table="[AllAddresses_Mar08]" Column="InvoiceNumber" /]

    [ColumnReference Database="[MyDB]" Schema="[dbo]" Table="[AllAddresses_Mar08]" Column="InvoiceItem" /]

    [ColumnReference Database="[MyDB]" Schema="[dbo]" Table="[AllAddresses_Mar08]" Column="AddrType" /]

    [/RangeColumns]

    [RangeExpressions]

    [ScalarOperator ScalarString="[Union1032]"]

    [Identifier]

    [ColumnReference Column="Union1032" /]

    [/Identifier]

    [/ScalarOperator]

    [ScalarOperator ScalarString="[Union1033]"]

    [Identifier]

    [ColumnReference Column="Union1033" /]

    [/Identifier]

    [/ScalarOperator]

    [ScalarOperator ScalarString="[Union1035]"]

    [Identifier]

    [ColumnReference Column="Union1035" /]

    [/Identifier]

    [/ScalarOperator]

    [ScalarOperator ScalarString="'2'"]

    [Const ConstValue="'2'" /]

    [/ScalarOperator]

    [/RangeExpressions]

    [/Prefix]

    [/SeekPredicate]

    [/SeekPredicates]

    [/IndexScan]

    [/RelOp]

    Argue for your limitations, and sure enough they're yours (Richard Bach, Illusions)

  • DB_Newbie2007 (8/20/2008)


    Some information:

    VIEWS are unions of partitioned tables... i.e., VIEW_AllAddresses would be Select * from AllAddresses_Jan08 UNION SELECT * from AllAddresses_Feb08...

    First thing that comes to mind... If there's no chance of overlap between those partitioned tables (ie, rows in one won't also be in another) change your UNION to UNION ALL

    Unions forces a sort/distinct sort which can be very expensive on larger row sets.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • DB_Newbie2007 (8/20/2008)


    Here is the rest of the post, part of the execution plan:

    Any chance you could zip the .showplan file and attach it to your post? It's easier to look at the plan in graphical form than to read through the xml.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • My bad... they are UNION ALL in the Views.

    Argue for your limitations, and sure enough they're yours (Richard Bach, Illusions)

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply