Update statistics after Index Rebuild ?

  • lianvh 89542 (8/17/2016)


    Thank you for the feedback. It's much appreciated. One question though : Is it possible to replace Index rebuild with update statistics - fullscan?

    You can, though whether that's a good idea or not will depend on the system and the usage.

    Fragmentation itself only affects large range scans from disk, however one side effect of fragmentation is low page density, meaning your tables take up more space in memory and more space on disk than they need to. That may be a problem, especially if you don't have a huge amount of memory on the server. Or it may not be a problem. You will need to investigate and test.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • MyDoggieJessie (8/17/2016)


    Hey Jeff - care to share your job script for the stats?

    😀

    It's not my script. The folks before me downloaded it from the internet. To be absolutely honest, it's been on my list of things to replace because the damned thing automatically builds per column stats on every column that doesn't already have them. That's a real killer for me. For example, I have several monster tables that were improperly designed with more than 140 columns each. Only 4 of the columns are ever used as query criteria. I realize that stats don't take much space but they take a huge amount of time to rebuild. It takes 17 hours for the stats to rebuild on just one of my "money maker" databases and that's just plain silly. I've written and am in the process of testing a stored procedure that examines all cached execution plans to see which stats are actually used over time (undocumented trace flag used, as well) so that I can "denude the tree" a bit.

    So no... can't share anything yet because the existing one sucks and I'm not done with my replacement for it, yet.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Tables with 140 columns? Ha! :hehe: In one of our databases (NOT my design by the way) we have a few tables that have 1,024 columns (yeah, the maximum allowed by SQL Server)...and it's split into 3 different tables because more were needed:

    BigTable_Primary

    BigTable_Secondary

    BigTable_Tertiary

    Pretty crazy...

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • So just to clarify, since I've personally never ran an update of statistics on any of the databases I've administered, proper maintenance is, for each table in a database, to first update statistics, and then do a rebuild or reorganize based on index fragmentation?

  • I wouldn't do it that way round.

    Index rebuild first, if necessary, afterwards update any stats that need updating.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Now I'm even more confused. Earlier in this thread, Grant suggested updating the statistics after an index rebuild was not the right approach, since the statistics are already updated by the rebuilding of the index. And I'm also not clear on what you mean by rebuilding the statistics "that need updating" - how do I know if they need updating?

    Sorry if I'm hijacking the thread, but this whole topic is rather new to me!

  • kramaswamy (8/17/2016)


    Earlier in this thread, Grant suggested updating the statistics after an index rebuild was not the right approach, since the statistics are already updated by the rebuilding of the index.

    Which doesn't, in any way, contradict what I said. Any stats updated as part of the index rebuild wouldn't need updating again, because they would have just been updated. There are stats which are not associated with indexes (column stats) which don't get updated when indexes are rebuilt, and if you're rebuilding only fragmented indexes, then the stats on the ones that didn't get rebuild may still need updating.

    And what he said was specifically about the stored proc sp_updatestats

    In short, no, you can't both rebuild indexes and then run sp_updatestats with any assurance that you're not actively harming your performance.

    And Grant touched on how you tell whether stats are 'out of date'

    Also worth mentioning is that the rowmodctr value after 2005 is just an estimate. Microsoft says it will be, something along the lines of, "mostly the same values" as it was in 2005 and earlier. So, while it may be a somewhat inaccurate number, it's the one that most of us will use to understand what's changed. Jason Strate has a more accurate calculation on his blog if you really need to track down a more precise count of rows that have been modified.

    Should be enough to get you started with research & reading.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Ahh okay, I think I understand what you mean. If I'm reading you right, we're essentially talking about columns which are not part of an index, but are still used regularly for query purposes?

    If the statistics are not up to date, then the query optimiser might choose the wrong approach to handling the query, which could cause the query to take longer than it should to run. However, if the query were a long-running query, I would likely have already included the relevant columns in an index, and so the statistics for those columns would probably have already been maintained through the normal maintenance of the index.

    Am I getting this right?

  • Yup, but those column stats are usually very important (if they weren't necessary the optimiser wouldn't have had them created), and there's still the index stats on indexes that you're not rebuilding because they're not fragmented enough.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Yes to everything Gail says. Plus, we're not even addressing whether or not we need to worry about sample size or Full Scan on the statistics updates. Different statistics and different data distributions and different table sizes are all going to mean different choices are necessary. This gets complicated quick. Most of the time, most of us, won't need to get this granular. But some of the time, we will. It's good to know the choices and methods in order to be able to address this issue.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Excellent points; sincerely appreciate you both for taking the time to clarify things.

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • GilaMonster (8/17/2016)


    Yup, but those column stats are usually very important (if they weren't necessary the optimiser wouldn't have had them created), and there's still the index stats on indexes that you're not rebuilding because they're not fragmented enough.

    Heh... unfortunately for me, the folks before me downloaded a "stats maintenance system" from the internet that scans all columns for missing stats and creates column stats for every column that doesn't have them whether they're need or not. They don't take much room to store but, OMG, they take 17 hours to run. It also names the stats in a fashion almost identical to how SQL Server does it, making it very difficult to determine which stats were artificially created so I can drop them. Add the extra complexity that some of those stats might actually be needed now and you'll understand why I don't just delete stats based on a minor naming nuance.

    I have determined a way to figure out which stats are being used by cached execution plans over time but I'm wondering if that's just crazy overkill. I'm toying with the idea of just dropping all the stats that have the minor nuance in the name that I spoke of and letting the system sort it out.

    Opinions on this "little" problem are welcome. I've never seen anything quite this crazy.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (8/18/2016)


    It also names the stats in a fashion almost identical to how SQL Server does it

    BWA-HA-HA!

    I was just having a conversation about statistics in India last week when naming conventions came up. Several of us joked about how we should follow the Microsoft standard. Who the hell knew we were being serious? That's perfectly marvelous (and totally messed up).

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (8/18/2016)


    Jeff Moden (8/18/2016)


    It also names the stats in a fashion almost identical to how SQL Server does it

    BWA-HA-HA!

    I was just having a conversation about statistics in India last week when naming conventions came up. Several of us joked about how we should follow the Microsoft standard. Who the hell knew we were being serious? That's perfectly marvelous (and totally messed up).

    Does the WA in "WA_XXXX" stand for Washington?

  • BLOB_EATER (8/18/2016)


    Grant Fritchey (8/18/2016)


    Jeff Moden (8/18/2016)


    It also names the stats in a fashion almost identical to how SQL Server does it

    BWA-HA-HA!

    I was just having a conversation about statistics in India last week when naming conventions came up. Several of us joked about how we should follow the Microsoft standard. Who the hell knew we were being serious? That's perfectly marvelous (and totally messed up).

    Does the WA in "WA_XXXX" stand for Washington?

    Yes, although I can't tell you why.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 15 posts - 16 through 30 (of 35 total)

You must be logged in to reply to this topic. Login to reply