nightly sp_updatestats versus weekly Update Statistics with fullscan

  • We've been using one of the stored procedures that finds indexes needing a statistics update and then updates them with fullscan (only where the number of modified rows was above 20%).

    Sometimes this didn't get everything done, so I added a nightly run of sp_updatestats. Every morning, the query below shows all stats up to date. My partner's concern is that the weekend fullscan job will skip work we really want done, because the nightly sp_updatestats modifies either the rowmodctr or the last update date. For now I've dropped the 20% row-modification criterion and disabled the nightly job. Indexes with the most modified rows get processed first by the weekend fullscan job, and it is limited to 3.5 hours of runtime (a sketch of that kind of time-boxed loop follows the query below).

    Indexes are rebuilt or reorganized weekly by a separate job based upon fragmentation. Even though that job will update stats on those it rebuilds, this question is about the jobs that specifically update stats.

    SELECT schemas.name AS table_schema,
           tbls.name AS table_name,
           STATS_DATE(i.id, i.indid) AS lastStatsUpdate,
           i.name AS index_name,
           (ABS(i.rowmodctr) / (CAST(i.rowcnt AS DECIMAL) + 0.01)) AS pcntRowsModified,
           i.id AS table_id,
           i.indid AS index_id,
           i.groupid AS groupid,
           i.rowmodctr AS modifiedRows,
           (SELECT MAX(i2.rowcnt)
            FROM sysindexes i2
            WHERE i.id = i2.id
              AND i2.indid < 2) AS rowcnt,
           'False' AS Processed
    --INTO ##updateStatsQueue
    FROM sysindexes i
    INNER JOIN sysobjects AS tbls ON i.id = tbls.id
    INNER JOIN sysusers AS schemas ON tbls.uid = schemas.uid
    INNER JOIN INFORMATION_SCHEMA.TABLES AS tl ON tbls.name = tl.table_name
        AND schemas.name = tl.table_schema
        AND tl.table_type = 'BASE TABLE'
    WHERE i.indid > 0
      AND tl.table_schema <> 'sys'
      AND i.rowmodctr > 0
      AND (SELECT MAX(i2.rowcnt)
           FROM sysindexes i2
           WHERE i.id = i2.id
             AND i2.indid < 2) > 0
    ORDER BY lastStatsUpdate DESC
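
    For context, here is a minimal sketch of the kind of time-boxed driver loop described above (not the actual SP, just an illustration, assuming the query is loaded into ##updateStatsQueue via the commented-out INTO line). The 3.5-hour cap and the "most modified rows first" ordering come from the description; the variable names and WHILE structure are mine.

    DECLARE @cutoff DATETIME,
            @schemaName SYSNAME,
            @tableName SYSNAME,
            @indexName SYSNAME,
            @sql NVARCHAR(1000);
    SET @cutoff = DATEADD(MINUTE, 210, GETDATE());      -- 3.5-hour window

    WHILE GETDATE() < @cutoff
      AND EXISTS (SELECT 1 FROM ##updateStatsQueue WHERE Processed = 'False')
    BEGIN
        -- take the unprocessed index with the most modified rows first
        SELECT TOP 1 @schemaName = table_schema,
                     @tableName  = table_name,
                     @indexName  = index_name
        FROM ##updateStatsQueue
        WHERE Processed = 'False'
        ORDER BY modifiedRows DESC;

        SET @sql = N'UPDATE STATISTICS ' + QUOTENAME(@schemaName) + N'.' + QUOTENAME(@tableName)
                 + N' ' + QUOTENAME(@indexName) + N' WITH FULLSCAN;';
        EXEC (@sql);

        UPDATE ##updateStatsQueue
        SET Processed = 'True'
        WHERE table_schema = @schemaName
          AND table_name   = @tableName
          AND index_name   = @indexName;
    END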

  • A little further research shows that sp_updatestats resets the row-modification counter (rowmodctr), which would cause the weekend fullscan job to select fewer indexes for a stats update. So now we have to decide whether to use the weekly fullscan, the nightly sp_updatestats, or some combination.
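
    A quick way to verify that behavior on a test copy, assuming sysindexes.rowmodctr is the counter the weekend job reads (as in the query above); this is only a sketch:

    -- snapshot the modification counters before the run
    SELECT id, indid, name, rowmodctr
    INTO #rowmodBefore
    FROM sysindexes
    WHERE indid > 0 AND rowmodctr > 0;

    EXEC sp_updatestats;

    -- any row returned here had its counter reduced (typically back to zero)
    SELECT b.name,
           b.rowmodctr AS beforeCtr,
           i.rowmodctr AS afterCtr
    FROM #rowmodBefore b
    INNER JOIN sysindexes i ON i.id = b.id AND i.indid = b.indid
    WHERE i.rowmodctr < b.rowmodctr;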

  • Go for the combination. A nightly sp_updatestats followed up by a weekly or monthly fullscan.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • The problem with that is that if the nightly job resets the row counter, the weekly fullscan won't touch everything we want done in the time we allow it. Running a fullscan on everything seems to take over 9 hours, but we're only giving it a window of 3.5 hours.

    sp_updatestats only takes 20 minutes on the same 300 GB database, so I doubt it's being as thorough. Statistics seem to be an area many DBAs skim over. I'll have to see if I can find out what sp_updatestats does under the hood.

    And I'll also consider one of the "sampled" options for Update Statistics instead of fullscan.
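
    For example, the sampled variants look like this (the table and index names here are placeholders, not objects from our database):

    -- explicit sample instead of a full scan
    UPDATE STATISTICS dbo.SomeLargeTable IX_SomeLargeTable_SomeColumn WITH SAMPLE 25 PERCENT;

    -- or reuse whatever sample rate was used for the previous update
    UPDATE STATISTICS dbo.SomeLargeTable IX_SomeLargeTable_SomeColumn WITH RESAMPLE;

    -- versus the full scan the weekend job currently does
    UPDATE STATISTICS dbo.SomeLargeTable IX_SomeLargeTable_SomeColumn WITH FULLSCAN;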

  • Indianrock (1/16/2010)


    I'll have to see if I can find out what sp_updatestats does under the hood.

    If you're not specifying any parameters to the proc, it does a sampled update of any statistics object that has had at least one row change since the last stats update.

    Is the nightly/weekly/monthly stats update necessary? Usually, if auto_update is enabled, it's only specific statistics that need manual updating, not the whole lot. It might be worth investigating which ones really need it and only manually updating those.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • The environment in question is our new product, and I'm still primarily supporting our legacy environment. So it's the database architect in the new environment who watches things daily, and he definitely has to manually update stats on certain tables during the week on a somewhat regular basis. Yes, the automatic options are on.

    What we agreed to yesterday, for a trial period, is to disable the nightly sp_updatestats and change the SP that intelligently runs a fullscan on the weekends so that it no longer applies criteria such as "rowmodctr changed on >20% of rows".

    So it will attempt a fullscan on all indexes, starting with those with the highest number of changed rows, for a maximum of 3.5 hours.

    This runs immediately after the index rebuild/reorganize SP -- which does a reorg if fragmentation is under 30%, etc.

  • Limiting your fullscan to tables where more than 20% of the rows have changed is senseless. That's the threshold that the auto update uses: if the rows changed reach that point, the auto update will kick in anyway. The reason to manually update statistics is that the 20% threshold is too high.

    I would suggest that you consider removing the blanket 'update all stats' and doing targeted updates of the ones that you know need manual updating, the ones where the auto update is not sufficient. Add to the list when you find more that need manual updates. That way, you're not wasting time updating stats that don't need updating and you can be sure of getting to the ones that really need it.
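
    A rough sketch of what that targeted approach could look like; the dbo.StatsToUpdate table and the names in it are hypothetical, purely to illustrate keeping a list and walking it:

    -- hypothetical list of statistics known to need manual updates
    CREATE TABLE dbo.StatsToUpdate
    (
        schema_name SYSNAME,
        table_name  SYSNAME,
        stats_name  SYSNAME
    );

    INSERT INTO dbo.StatsToUpdate VALUES ('dbo', 'Orders', 'IX_Orders_OrderDate');
    -- add rows as more problem statistics are found

    DECLARE @sql NVARCHAR(1000);
    DECLARE stats_cur CURSOR LOCAL FAST_FORWARD FOR
        SELECT N'UPDATE STATISTICS ' + QUOTENAME(schema_name) + N'.' + QUOTENAME(table_name)
             + N' ' + QUOTENAME(stats_name) + N' WITH FULLSCAN;'
        FROM dbo.StatsToUpdate;

    OPEN stats_cur;
    FETCH NEXT FROM stats_cur INTO @sql;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        EXEC (@sql);
        FETCH NEXT FROM stats_cur INTO @sql;
    END
    CLOSE stats_cur;
    DEALLOCATE stats_cur;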

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks Gail. So you're recommending targeted updating for problematic indexes weekly. And I suppose more often as needed. Auto updates on.

    It looks like sp_updatestats only has one possible option, @resample. If specified, it uses the RESAMPLE option of Update Statistics.
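
    In other words, the only two ways to call it appear to be:

    -- default: sampled update of every statistics object with at least one modified row
    EXEC sp_updatestats;

    -- reuse each statistics object's previous sample rate instead
    EXEC sp_updatestats @resample = 'resample';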

    Here's what I found on auto update stats:

    I see the 20% factor. Not clear on this statement "the selectivity of the predicate also affects AutoStats generation."

    The basic algorithm for auto update statistics is:

    * If the cardinality (myedit: rowcount) for a table is less than six and the table is in the tempdb database, auto update with every six modifications to the table.

    * If the cardinality for a table is greater than 6, but less than or equal to 500, update statistics every 500 modifications.

    * If the cardinality for a table is greater than 500, update statistics when (500 + 20 percent of the table) changes have occurred.

    * For table variables, a cardinality change does not trigger auto update statistics.

    NOTE: In the strictest sense, SQL Server counts cardinality as the number of rows in the table.

    NOTE: In addition to cardinality, the selectivity of the predicate also affects AutoStats generation. This means that statistics may not be updated after every 500 modifications if cardinality were < 500 or for every 20% of changes if cardinality were > 500. A scale up factor (value ranges from 1 to 4, 1 and 4 inclusive) is generated depending on the selectivity and a product of this factor and the number of changes as obtained from the algorithm would be the actual number of modifications required for AutoStats generation.
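
    To put numbers on that, here's a quick (and only approximate) way to estimate the per-table auto-update trigger point from the same sysindexes data used in the query earlier; the CASE just encodes the 500 / 500 + 20% rules and ignores the small-table and tempdb special cases:

    SELECT OBJECT_NAME(i.id) AS table_name,
           i.rowcnt,
           CASE WHEN i.rowcnt <= 500 THEN 500
                ELSE 500 + (0.20 * i.rowcnt)
           END AS approx_auto_update_threshold
    FROM sysindexes i
    WHERE i.indid IN (0, 1)                          -- heap or clustered index row count
      AND OBJECTPROPERTY(i.id, 'IsUserTable') = 1
    ORDER BY i.rowcnt DESC

    For a 10-million-row table that works out to roughly 2,000,500 modifications, which matches the "about 2 million row mods" figure later in this thread.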

  • Indianrock (1/16/2010)


    Thanks Gail. So you're recommending targeted updating for problematic indexes weekly. And I suppose more often as needed.

    Or daily. Or hourly. Whatever you need.

    I see the 20% factor. Not clear on this statement "the selectivity of the predicate also affects AutoStats generation."

    That's for filtered statistics in SQL 2008 only.

    The basic algorithm for auto update statistics is:

    If that was for me, I'm well aware what the rules for statistics updates are.

    It's the large tables that tend to have stale-stats problems, and in those cases the threshold can be simplified to 20%, since on a million-row table the additional 500 rows are so small compared to the 20% that they become insignificant.

    NOTE: In addition to cardinality, the selectivity of the predicate also affects AutoStats generation.

    As I mentioned before, that's solely for filtered statistics. Since you are using SQL 2005 where there are no filtered stats or indexes, you can ignore all discussion of selectivity of predicates.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • "If that was for me, I'm well aware what the rules for statistics updates are."

    No, just posting the documentation for my benefit and for whoever else may read this.

    thanks again.

  • Thanks Gail for finishing this one up and explaining in great detail.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • We have some tables with millions of records in them. What we seem to be finding is that, in general, sampled updates of statistics create problems. Yesterday the application was trying to run a select statement that was timing out. A look at the execution plan showed that the number of records actually returned and the number the statistics/execution plan estimated were very, very different.
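
    For anyone who wants to reproduce that comparison, one way to see actual versus estimated rows side by side is SET STATISTICS PROFILE, which returns both Rows (actual) and EstimateRows per operator; the SELECT below is only a placeholder for the timing-out statement:

    SET STATISTICS PROFILE ON;

    -- placeholder for the statement that was timing out
    SELECT *
    FROM dbo.SomeLargeTable
    WHERE SomeColumn = 'some value';

    SET STATISTICS PROFILE OFF;
    -- then compare the Rows and EstimateRows columns in the extra result set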

    I had run sp_updatestats on the whole database last Thursday night (19 minutes), and then our weekly job ran Saturday night -- the latter does a fullscan update of statistics, starting with the indexes with the most row modifications, and is limited to 3.5 hours. It updated about 150 statistics out of about 1100.

    We have auto update on, auto create on, and also auto update async on. The last one surprised me when I found it on, but our database architect apparently turned that on quite a while back (default is off).
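
    For reference, those database-level settings can be checked, and the async one changed, like this (the database name is a placeholder):

    -- check the current settings
    SELECT name,
           is_auto_create_stats_on,
           is_auto_update_stats_on,
           is_auto_update_stats_async_on
    FROM sys.databases
    WHERE name = 'YourDatabase';

    -- the async option our architect turned on (default is OFF)
    ALTER DATABASE YourDatabase SET AUTO_UPDATE_STATISTICS_ASYNC ON;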

    If you have 10 million records in a table, then auto update would fire after about 2 million row mods -- that's a long time to wait.

  • GilaMonster (1/16/2010)


    ...

    I would suggest that you consider removing the blanket 'update all stats' and doing targeted updates of the ones that you know need manual updating, the ones where the auto update is not sufficient. Add to the list when you find more that need manual updates. That way, you're not wasting time updating stats that don't need updating and you can be sure of getting to the ones that really need it.

    Going back to Gail's post, I would recommend you target your statistics update to very specific tables. Run the update as frequently as necessary for those targeted tables.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • Would you guys want to share your views on whether or not sp_updatestats is necessary in a regular DB maintenance job if you have Auto Update Stats set to ON?

  • Yes, it is necessary.

    Something I recently ran across: a table with 300 million rows (1.5 TB of archival data, on slow disks).

    The last auto-update of stats occurred 9 months ago.

    Auto-update of stats gets triggered; async stats updates are off.

    The query triggers the auto update.

    The client cancels the query, which cancels the stats update.

    The client re-runs the query, which restarts the stats update.

    The client cancels the query.

    The client re-runs the query... etc., etc.
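
    Just as a sketch (object and database names are placeholders), two ways that loop could have been broken: pre-empt the synchronous auto-update with a manual sampled update, or turn on the async option so the query compiles with the existing stats while the update runs in the background.

    -- option 1: pre-empt the auto-update with a manual (sampled) update of the table's stats
    UPDATE STATISTICS dbo.HugeArchiveTable WITH SAMPLE 5 PERCENT;

    -- option 2: let queries compile with the stale stats while the update runs in the background
    ALTER DATABASE ArchiveDB SET AUTO_UPDATE_STATISTICS_ASYNC ON;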
