Require Recommendations | Index Rebuild

  • Hi Guys,

    I have created a script to rebuild indexes which is in the attachment-

    NOW - Please review the procedure and let me know if there is need for any modification or if there is anything additional to keep into consideration. Be my guest.

    P.S- Maintenance jobs are not working properly. Hence the script.

    • This topic was modified 4 years, 10 months ago by  JigarShah.
    Attachments:
    You must be logged in to view attached files.
  • JigarShah wrote:

    Hi Guys,

    I have created a script to rebuild indexes which is in the attachment-

    NOW - Please review the procedure and let me know if there is need for any modification or if there is anything additional to keep into consideration. Be my guest.

    P.S- Maintenance jobs are not working properly. Hence the script.

    I have not looked at your script. My suggestion would be to use ola hallengren maintenance script. His script will have everything with new features and support latest version including cloud and third party backup etc. He is dedicated to the script and fixing the issue when reported.

    I have also created my own script but we cannot update make it up to date, all the time.

    Muthukkumaran Kaliyamoorthy
    https://www.sqlserverblogforum.com/

  • I agree.  Download and run Ola Hallengren's solution.  It's far more complete, and is far more flexible, than what you have written.

    https://ola.hallengren.com/

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • Ola's code is great but it doesn't help you if you use the current "Best Practice" (it's not) of blindly rebuilding indexes after you've exceeded 30% logical fragmentation.  Even "Books Online" warns against it.

    Here's the link where the supposed "Best Practice" was born... almost no one Reads'n'Heeds the warning tip highlighted in light green just below that "recommendation".

    https://docs.microsoft.com/en-us/sql/relational-databases/indexes/reorganize-and-rebuild-indexes?view=sql-server-ver15

    Here's the warning tip from that link...

     

    Tip

    These values provide a rough guideline for determining the point at which you should switch between ALTER INDEX REORGANIZE and ALTER INDEX REBUILD. However, the actual values may vary from case to case. It is important that you experiment to determine the best threshold for your environment. For example, if a given index is used mainly for scan operations, removing fragmentation can improve performance of these operations. The performance benefit is less noticeable for indexes that are used primarily for seek operations. Similarly, removing fragmentation in a heap (a table with no clustered index) is especially useful for nonclustered index scan operations, but has little effect in lookup operations.

    What I've found the hard way is that a whole lot of people (including me at the time up until 18 Jan 2016) never realize  that the day after index maintenance, they suffer a whole lot of slowdowns in the form of some very significant blocking.  There are several reasons for this...

    1. REORGANIZE doesn't do what you think it does even though it's documented (but poorly explained when it comes to the impact) as to what it does and does not do to an index.  It's also not the tame little kitty that a whole lot of people make it out to be (especially when it comes to log file use) and, because of the way it operates, removes free space at the most critical time when it is needed and does almost nothing to add free space in the  critical area between the Fill Factor and 100% page fullness.  The effect delivered is that REORGANIZE "conditions" a whole lot of indexes in such a fashion as to perpetuate page splits, which is both logical and physical (page density) fragmentation at its worst.  This is why a lot of people mistakenly say that GUIDs cause a lot of fragmentation.  They can actually be setup to run for weeks and months without any page splits at all... not even good ones.  If, however, you use REORGANIZE on such indexes, you end up with perpetual pages splits, instead.  It also affects other types of indexes the same way.  I don't use it even when all I have is the "Standard Edition".
    2. The other thing is that most people don't actually analyze what's going on with their indexes.  The either leave the default Fill Factor of "0" (which is the same as 100) or they assign Fill Factors that not only may not help but may actually harm the system both for performance and space used.  They also don't realize that waiting until 30% fragmentation is actually waiting for the worst to happen insofar as page splits.  For example, for GUIDs and other evenly distributed index, it's great to lower the Fill Fact to 91, 81, of even 71 because that will provide weeks or even months of fragmentation free inserts and some updates but as soon as you have 1% logical fragmentation, you need to do a REBUILD (never REORGANIZE for this) immediately because the only way for such an index to start fragmenting is for ALL the pages are close to 100% full and on the verge of fragmenting.
    3. A lot of people also don't realize that "ExpAnsive Updates" are what is actually causing the fragmentation nor do they realize that inserts will always try to fill pages to 100%.  In only one very rare instance, INSERTs don't care about the Fill Factor and neither do UPDATES.  If you're getting fragmentation on an ever-increasing index, lowering the Fill Factor might only be wasting space and doing nothing at all to prevent fragmentation.  That normally occurs when people do INSERTs to such an index and then UPDATE the rows shortly after ("ExpAnsive" Modified_By columns are the normal but not the only reason).  Since the INSERTs went into at 100% (regardless of Fill Factor), there's no room left for "ExpAnsive Updates" and you immediately get massive fragmentation during the UPDATEs.

    There's a whole lot more to be concerned with but way too much for a forum post. The bottom line is that if you follow the current perception of the 5/30 "guidlines" actually being a "Best Practice", it's extremely likely that no matter what code you use to accomplish it, you're doing more harm than good.

    I'm writing a Stairway on the "Black Arts" of Index Maintenance with proofs for all that I claim but it's going to take a while to write it and get it out there.  I have given several presentations at various SQLSaturdays on the subject to try to begin to make people aware of the problems and that the current "Best Practices" actually aren't... they're guidelines to start with and, as it says in "Books Online", they're not actually very good and can cause more damage than doing no index maintenance other than occasional maintenance to "recover space" according to page density (page fullness).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Michael L John wrote:

    I agree.  Download and run Ola Hallengren's solution.  It's far more complete, and is far more flexible, than what you have written.

    https://ola.hallengren.com/

    Ola's code is, indeed, the "Gold Standard".  However, despite the claim of it being used to "optimize" your indexes, it does not.  You need  separate processes to figure out what the index is actually being subjected to and REBUILD it both accordingly and at the right time.  Use the 5/30 guidelines will bring you a heap of trouble, especially when larger indexes are involved.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi Jeff,

    Waiting for Stairway. Do you have any threshold, which should be rebuild and reorganize (OR) it depends on their own system and test case.

    I could have some of VLDB with standard rebuild and reorganize thresholds, which is increasing log file growth as well.

    Muthukkumaran Kaliyamoorthy
    https://www.sqlserverblogforum.com/

  • Read my larger post immediately before the short one you responded to.  "It Depends" although I recommend that you simply stop using REORGANIZE except for those things I cited.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    Michael L John wrote:

    I agree.  Download and run Ola Hallengren's solution.  It's far more complete, and is far more flexible, than what you have written.

    https://ola.hallengren.com/

    Ola's code is, indeed, the "Gold Standard".  However, despite the claim of it being used to "optimize" your indexes, it does not.  You need  separate processes to figure out what the index is actually being subjected to and REBUILD it both accordingly and at the right time.  Use the 5/30 guidelines will bring you a heap of trouble, especially when larger indexes are involved.

    When calling Ola's script we can easily remove the reorganize out of the options - including the online as in some cases it may not be the best option either.

    and on the cases where reorganize is required (lob compaction) it is easy to issue a call just for those tables that do require it.

     

  • frederico_fonseca wrote:

    Jeff Moden wrote:

    Michael L John wrote:

    I agree.  Download and run Ola Hallengren's solution.  It's far more complete, and is far more flexible, than what you have written.

    https://ola.hallengren.com/

    Ola's code is, indeed, the "Gold Standard".  However, despite the claim of it being used to "optimize" your indexes, it does not.  You need  separate processes to figure out what the index is actually being subjected to and REBUILD it both accordingly and at the right time.  Use the 5/30 guidelines will bring you a heap of trouble, especially when larger indexes are involved.

    When calling Ola's script we can easily remove the reorganize out of the options - including the online as in some cases it may not be the best option either.

    and on the cases where reorganize is required (lob compaction) it is easy to issue a call just for those tables that do require it.

    Understood but, like I said above, blindly rebuilding indexes and assigning or ignoring Fill Factors can (and usually does) cause many performance, log file, memory usage, backup/restore size/time, and blocking issues especially on the "Morning After".  If you don't know which type of insert/update pattern your suffering, you're frequently doing more harm that good with your index maintenance.  Folks really need to get a better grip on indexes even if they use a fantastic tool like Ola's.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  •  If you don't know which type of insert/update pattern your suffering,

    Hi Jeff,

    Can you recommend a starting point for a scalable way to find out these insert/update patterns across dozens or hundreds of SQL Servers? Or does this need to be checked one-by-one? I completely understand that the ideal approach is to understand these patterns while designing index maintenance for a server, but it seems that DBAs who manage tons of servers have come up with a set of scripts (based on DMVs or other ways of getting insert/update metrics) they can run to assess the insert/update patterns at scale, and then apply tools such as the Ola Hallengren jobs accordingly.

    Thanks for any help,

    -- webrunner

     

    • This reply was modified 4 years, 10 months ago by  webrunner.

    -------------------
    A SQL query walks into a bar and sees two tables. He walks up to them and asks, "Can I join you?"
    Ref.: http://tkyte.blogspot.com/2009/02/sql-joke.html

  • but it seems that DBAs who manage tons of servers have come up with a set of scripts (based on DMVs or other ways of getting insert/update metrics) they can run to assess the insert/update patterns at scale

    Yes, exactly.

    and then apply tools such as the Ola Hallengren jobs accordingly.

    Nah, not for me.  I still use my own customized REBUILD script, so that the script "knows" more about the environment.  Besides, statistics updates are much more common than rebuilds anyway, although both are needed.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • webrunner wrote:

    ... but it seems that DBAs who manage tons of servers have come up with a set of scripts (based on DMVs or other ways of getting insert/update metrics) they can run to assess the insert/update patterns at scale, and then apply tools such as the Ola Hallengren jobs accordingly.

    I've not, of course, polled every DBA that manage a ton of servers or even a few servers but, so far, most of the ones I've talked to at various SQLSaturdays and user groups do one of the following...

    1. Follow the 5/30 guideline of not do maintenance less than 5% logical fragmentation, use {gasp!} REORGANIZE between 5 and 30% logical fragmentation, and do REBUILDS (with {gasp!} ONLINE, if they have the Enterprise Edition) for anything over 30% logical fragmentation.  These same people also believe in the myth of lowing the FILL FACTOR if some index is fragmenting a lot without taking other things into consideration that frequently make lowering the FILL FACTOR not only an ineffective choice, but a real memory/disk space waster.  Most of the people I've talked with follow all of this and it's all one of the worst things you can actually do regardless of the size of the indexes.
    2. Some have taken to using 3rd party solutions with fancy screens that supposedly help you figure out what to do.  I won't mention any of the product names nor any of the free-ware that does this type of thing but they are also sorely wanting because they're not doing the correct analysis even if they collect data over time as part of that analysis.  They're simply analyzing the wrong things.
    3. Some have taken to following the really bad advice of setting everything to a lower FILL FACTOR of 90% (for example).  That's a huge waste of space/memory for a whole lot of indexes because they either don't need a lower FILL FACTOR or lowering the FILL FACTOR will be absolutely ineffective because of things like "ExpAnsive" UPDATES.
    4. Others take on the attitude of either not defragging their indexes unless they find a problem or waiting until something crazy like 70% logical fragmentation to defrag.
    5. Almost all of the people make the serious mistake of defragging indexes they don't have a clue about and then wonder why they have blocking and slowness on the "morning after".  Very few that I've run into actually do any kind of an analysis to figure out the "IndexDNA (tm)" of an index.  They have no clue as to one of the more serious aspects of indexes and that is how to prevent page splits.  If they use REORGANIZE as a general practice, I guarantee that they've not done such an analysis or, if they have, they've not actually done it correctly.

    To summarize, I'll flatly state that, with only a few exceptions, most people are not doing an analysis of any kind and are following some poorly conceived "standard" or "Best Practice" method (that usually isn't a true best practice) for doing index maintenance.  Having done all the above myself, which also means that I've seen all that goes wrong, and can include the poor practice of not doing any index maintenance for nearly 3 years as an experiment to see what happens (which is still better than doing index maintenance incorrectly using any of the techniques I've outlined above), I can tell you that my summarization of what other people do is far from being incorrect.

    To wit, Ola's code is great but very few people are using it correctly.  Ola's code  does no optimization of indexes... it only does what it is told to do.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • ScottPletcher wrote:

    but it seems that DBAs who manage tons of servers have come up with a set of scripts (based on DMVs or other ways of getting insert/update metrics) they can run to assess the insert/update patterns at scale

    Yes, exactly.

    and then apply tools such as the Ola Hallengren jobs accordingly.

    Nah, not for me.  I still use my own customized REBUILD script, so that the script "knows" more about the environment.  Besides, statistics updates are much more common than rebuilds anyway, although both are needed.

    You're one of the exceptions that I'm talking about, especially if you DON'T use REORGANIZE on a rote basis.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • webrunner wrote:

    Can you recommend a starting point for a scalable way to find out these insert/update patterns across dozens or hundreds of SQL Servers?

    The answer there is "Yes" but I'm still working on the method to make it easier for people.  The biggest part will be to convince them that I'm right because a lot of my findings are diametrically the opposite of what people currently embrace as what have unfortunately become an industry wide "Best Practice".  The only way to do such a thing is prove it to them with the scientific method and the only way to do that in SQL Server is with demonstrable and repeatable test code.

    webrunner wrote:

    Or does this need to be checked one-by-one? 

    The answer here is also Yes.  Every bloody index needs  to be checked one-by-one.  The good news is that there are methods I'm developing check a whole lot of indexes auto-magically and effectively.  There are a lot of exceptions, though...

    You see, there's also the issue of huge numbers of different permutations of insert patterns multiplied by huge numbers of update patterns multiplied by a decent number of delete patterns multiplied by the horror of in-row LOBs and multitudes over oversized variable width columns that both trap small rows (which I also have a fix for that is also the opposite of current "Best Practices), etc, etc, etc.  And I've not even touched on the complexities of partitioned tables/indexes, partitioned views, column store indexes, or memory "optimized" tables, XML indexes or a ton of other things.

    There's also the "exceptions" of absolutely HUGE indexes.  They require some specially handling even if I were to give people an absolutely fool proof method of identifying everything that was needed to know about any given one simply because MS didn't build "Peter Norton-like" defragmentation capabilities into SQL Server.  REORGANIZE comes close but at the horrible expense of being both fully logged (actually, almost double logged) and not being able to create new pages.  Even REBUILD requires some special handling of huge indexes or you end up with huge amounts of blown out logs and wasted allocated unused space in the MDF/NDF files.  Add in some requirement for log shipping or other log file preservation requirement and you've pretty much stripped yourself of any reasonable options at effective index maintenance.

    So, yes and no... I have some stuff that works great for me and could work great for others but, there is no panacea and I know of no current product or homegrown method that comes even close to being one.  I will tell you I've come pretty close for non-partitioned, non-XML, rowstore indexes that still need special considerations for huge tables and I'm working on solutions for those, as well.

    I'll also tell you that my long term toil on this subject is due to the fact that my databases were a victim of 3rd party code that followed a rendition of current "Best Practices" and it also made the mistake of building statistics on every column of every table in every database on my production system.  It all culminated on the Monday, January 18th, 2016 with massive blocking that was directly caused by such "Best Practice" index maintenance executing the night before, which also caused some pretty nasty log file size problems because of one of those huge indexes I was telling you about.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • webrunner:

    You'd get orders of magnitude more improvement by changing all tables to make sure they have the best clustered index for overall performance and based on how data is (almost) always searched for, rather than just rebuilding the indexes you have now.  Hint: the best clus index is most often not on an identity column.

    For example, for child tables, it's almost always best to have the parent's key first.  Thus, the order items table would be clustered first on the order_id column, and only then the item identity.  Since the order_id is an identity (and the clustering key, btw), on the orders table, the item table is generally inserted in order as well.  But now the join between the tables is vastly more efficient, typically a MERGE JOIN, reading joined contiguous rows from the order items table and SQL "knows" they're contiguous.  With an identity cluster on the items, SQL has to go thru a non-clus index to do the joins and SQL assumes the rows aren't in contiguous order (the only safe assumption, since they may not be).

    Similarly, intersection tables should typically be keyed the actual keys of the parent tables, not by a meaningless identity.

    Thus, since there are far more child / dependent tables than "master" / parent tables, it's only logical that the majority of tables are best clustered on something other than identity the vast majority of the time.

    Only after determining and setting the best clus index, should you then review all the non-clus indexes.  Often many of them can now be removed, since they are keyed first by what is now the clus key(s).

    The really complex part is determining which non-clus indexes should be combined.  There's some art with the science there.  Scanning an existing index can be much better than creating a separate index, but you naturally don't want to get too carried away with it.

    A specific example(s) would be much clearer.  Generalities provide some guidance, but nothing like walking thru a few real-world situations.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply