Rebuilding and Reorganizing Indexes

  • I am not familiar with any restrictions on features for SQL Server in Amazon RDS so I don't know if you have access to the SQL Server Agent in Amazon RDS or not. If you do, then perhaps you can still use Ola's index maintenance scripts. Any index maintenance involving rebuilding your indexes, which is what needs to happen in most cases for you based on earlier screenshots can cause some blocking, which is why you want to do this type of maintenance during a less busy time for your SQL Servers. Since there has been no maintenance at all, an immediate thing you can do that will help your situation is to update the statistics. You can do that by connecting to your instance of SQL Server and run the below code.

    Statistics are small objects in the database tables that keeps track of the distribution of the data in your tables. Having current information in those statistics objects helps SQL Server build better execution plans for queries. When statistics are not current, SQL Server will often make less than ideal choices for execution plans because it is unaware of how the data is distributed. Updating the statistics by the below script will cause SQL Server to have a better idea of the distribution of data and may then help with your query performance.

    After doing some googling about Amazon RDS, there is a page here that shows you don't have the ability to call sp_updatestats directly in RDS. Instead, you have to create a procedure that encapsulates sp_updatestats like the example shown in the link.

    See this link where it is shown that updating statistics does not cause blocking. Click here

    DECLARE @TSQL nvarchar(2000)

    -- Filtering system databases from execution
    SET @TSQL = '
    IF DB_ID(''?'') > 4
    BEGIN
    USE [?]; exec sp_updatestats
    END
    '
    -- Executing TSQL for each database
    EXEC sp_MSforeachdb @TSQL

    • This reply was modified 3 years, 12 months ago by  lmarkum.
    • This reply was modified 3 years, 12 months ago by  lmarkum.
  • Be very careful executing sp_updatestats - it uses the default sampling rate when updating statistics which may not be good enough.  It can also be a problem if you run that right after rebuilding indexes - as it could update statistics for an index where those stats were just updated with a full scan.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Jeff is correct that there is no point in updating statistics after an index was rebuilt because rebuilding the index includes updating the statistics. sp_updatestats is a wrapper procedure for update statistics. It isn't dangerous to use sp_updatestats but it does use the default sampling of the database engine, but so does Auto Update Statistics setting that is on by default for the SQL Server. There can be times when a targeted use of update statistics with the FULLSCAN option or some other sampling percentage could be more appropriate. In the case of the OP, where no maintenance is occurring, sp_updatestats should be fine.

  • Here is a piece of code that selects the individual statistics and generates the update statistics command.

    It will do a fullscan of the statistics.  Like Jeffery said, if you rebuild an index, you would not want to update stats for a freshly rebuilt index.   I certainly would not blindly run the commands this creates. I would start with the statistics that have the highest number of rows or percentage of rows modified and work down from there.

    Keep in mind that you may see the percentage or number of rows modified that are higher than the number of rows in the table. This counter is incremented each time a row is modified.  So, if the same row is modified 10 times, and there is only one row in the table, the counter is going to show 10.

    Updating each individual statistic is usually far less intrusive that using sp_updatestats on the entire table.

    One more useful set of tools are Brent Ozar's First Responders toolset.  sp_blitzindex may help you pinpoint the troublesome indexes in your system.  https://www.brentozar.com/first-aid/

    SELECT 
    DB_NAME(db_id()),
    S.name,
    O.name,
    O.object_id,
    STAT.name,
    CONVERT(numeric(18,2), ROUND((CONVERT(numeric(18,2),STATPROP.modification_counter) /CASE WHEN STATPROP.rows = 0 THEN 1.00 ELSE CONVERT(numeric(18,2),STATPROP.rows) END * 100), 2)) PctModified,
    STATPROP.rows,
    STATPROP.modification_counter,
    STATPROP.last_updated,
    'UPDATE STATISTICS [' + S.name + '].[' + O.name + ']([' + STAT.name + ']) WITH FULLSCAN;' UpdateStatisticCommand
    FROM sys.objects O
    INNER JOIN sys.schemas S ON O.schema_id = S.schema_id
    INNER JOIN sys.stats STAT ON STAT.object_id = O.object_id
    CROSS APPLY sys.dm_db_stats_properties(STAT.object_id, STAT.stats_id) STATPROP
    WHERE S.name <> 'sys'
    AND STATPROP.modification_counter > 0
    ORDER BY CONVERT(numeric(18,2), ROUND((CONVERT(numeric(18,2),STATPROP.modification_counter) /CASE WHEN STATPROP.rows = 0 THEN 1.00 ELSE CONVERT(numeric(18,2),STATPROP.rows) END * 100), 2)) DESC

     

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • There are a couple of things you can do to affect how sp_updatestats works.  There is a parameter available that forces the statistics to be updated with the previous sampling rate - and there is now a way of defining a persisted sampling rate.  Setting that value forces auto update statistics and this procedure to use the persisted rate instead of the default rate.

    This does require analysis of your statistics to determine which ones need a higher sampling rate though.  You should not just set that to 100 for all.

    I think the OP has a lot more problems than just index rebuild and update statistics.  If - as he stated - these tables all have a 'default id' as the clustered index (assuming he means identity column as the clustered index), then these fragmentation levels are extremely concerning.  Using an identity column does not preclude or prevent fragmentation...but you should not get to that level of fragmentation unless your data access patterns are such that every single row in the table is updated, and updated in such a manner as to cause the data to grow (e.g. increasing sizes of varchar columns - expansive updates).

    If the 'default id' is not an identity - it could be a GUID which definitely means the clustering key is incorrect and a large contributor to the fragmentation issues.

    In either case - to really determine what is going on we would need to see the page density in conjunction with the fragmentation.  I am betting that the page density is quite low in almost all of these tables...but that is just an educated guess and I could be wrong.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • If you have fragmented indexes and you make the mistake of doing index maintenance on indexes with a "0" or "100" Fill Factor (they're both the same), you will suffer major blocking and log file usage for several hours after the index maintenance.  It's better to do no maintenance than it is to do it wrong.

    If you have a very high page density and a high fragmentation %, then it's probably a "Sequential Silo" index.  These are permanently fragmented and lowing the fill factor won't help but they don't suffer from page splits, either.  Start out by rebuilding them at a 98 Fill Factor where the "8" (looks like the infinity symbol turned 90 degrees) and then you can use REORGANIZE on them after that.  Note that this is one of only three types of indexes where you can effectively use REORGANIZE.   LOB compaction is one other reason you should use REORGANIZE and the third is any index that suffers heavy deletes, although rebuild can use much less log file and do a better job in the latter case.

    The reason to generally avoid REORGANIZE is because it "compacts" data in pages.  It is NOT able to create new pages to take advantage of a reduced Fill Factor.  The generally means that REORGANIZE will frequently remove free space on pages when you need it the most and that will actually increase and perpetuate fragmentation.  This is the big reason why everyone poo-poos the use of Random GUIDs.  Reorganize is actually the problem there and if you stop using it on Random GUIDs with then correct fill factor, you can actually go weeks and even months with zero page splits, zero fragmentation, and zero index maintenance.  If your index IS keyed by Random GUID, the rebuild it at 81% where the "1" reminds you that you actually need to rebuild at 1% fragmentation to prevent "avalanche fragmentation" once all the pages are filled to capacity.  1% sounds counter intuitive especially for GUIDs but it's not.  You won't end up rebuilding as much as you might think (like I said, can go weeks or even months with no index maintenance).

    If you have indexes that have an "ever-increasing" key that also suffer from fragmentation, then you have "ExpAnsive Updates" occurring.  You eventually need to find and fix the expansive updates.  There are two types "ExpAnsive Updates"... mid index updates, which a lower fill factor can help with and the type where rows are inserted, then almost immediately updated.  If the former, you might want to rebuild at 82% where the "2" reminds you that you still have something "to do" to fix the index.  If the latter, lowering the fill factor will NOT help.  Rebuild the index at 97% (the "7" looks like a "2" with no foot and reminds you you still have something "to do" to it).  You can use REORGANIZE on the latter with the understanding that it won't be as good as a REBUILD.

    There's a whole lot more to the "Black Arts" of Index Maintenance but those are some safer starting points than just throwing some index maintenance routine that uses supposed "Best Practices" (and they're not... even Paul Randal, the guy that wrote them, says to take them with a "grain of salt") at your indexes.

    And, to be clear... no... rebuilding all of your indexes at 82% (or any other number) is NOT the ultimate thing to do.  It's meant just to get you "out of the woods" until you take the time (which you MUST do, especially with the big ones) to figure things out about each index.  You really should know how each index is used and will react or the morning afterward is going to be one big blocking session.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • christi1711 wrote:

    First of all, I saw that we have Amazon RDS for SQL Server.

    Regarding script for index maintenance (https://ola.hallengren.com/) it looks like the same as what I want to do: (rebuild or reorganize indexes). I have a clustered index on each table (for the most part, this is the default id). Also, I look at the sys.indexes, sys.dm_db_index_usage_stats, sys.dm_db_partition_stats, and found which indexes don't use. For which period we have such results?

    And what means update statistics? Can it block my DB?

    I've been using Ola's scripts for some time now and would not go back to maintenance plans. Read the documentation to ensure that you are happy with the configuration options for the commands you will be running

    Add the following into the command for the index maintenance jobs.

    @FragmentationMedium = 'INDEX_REORGANIZE',

    @FragmentationHigh = 'INDEX_REORGANIZE',

    This will limit the options to defrag only or will be skipped if page locks are disabled on a table.

    Statistics are used by SQL Server to determine the best query plan to use. If you stats are out of date, a less than optimal plan may be used. Updating stats will not cause any issues with the DB. Ola's scripts update the stats by default.

     

  • @SQLNumpty wrote:

    christi1711 wrote:

    First of all, I saw that we have Amazon RDS for SQL Server.

    Regarding script for index maintenance (https://ola.hallengren.com/) it looks like the same as what I want to do: (rebuild or reorganize indexes). I have a clustered index on each table (for the most part, this is the default id). Also, I look at the sys.indexes, sys.dm_db_index_usage_stats, sys.dm_db_partition_stats, and found which indexes don't use. For which period we have such results?

    And what means update statistics? Can it block my DB?

    I've been using Ola's scripts for some time now and would not go back to maintenance plans. Read the documentation to ensure that you are happy with the configuration options for the commands you will be running

    Add the following into the command for the index maintenance jobs.

    @FragmentationMedium = 'INDEX_REORGANIZE',

    @FragmentationHigh = 'INDEX_REORGANIZE',

    This will limit the options to defrag only or will be skipped if page locks are disabled on a table.

    Statistics are used by SQL Server to determine the best query plan to use. If you stats are out of date, a less than optimal plan may be used. Updating stats will not cause any issues with the DB. Ola's scripts update the stats by default.

    Just a bit of friendly advice (since it ate my face off way back in Jan 2016), be real careful about using REORGANIZE in any form of a blanket fashion.  Most people miss the fact that it cannot and will not create new pages to clear the area above the FILL FACTOR and will, in fact, remove free space below the Fill Factor at the worst possible time resulting in very full pages that will actually cause and perpetuate fragmentation in all but a few cases.  This is especially true for any kind of evenly or nearly evenly distributed index keys such as but not limited to GUIDs.  It also brow beats the dickens out of your log files.

    Also, if you have fragmented indexes that have a zero (or 100) Fill Factor and you're doing any kind of defragmentation on them, stop defragging them until you know the INSERT/"ExpAnsive Update" pattern of the index and have both repaired it and assigned a proper Fill Factor.

    Last but not least, remember that lowering a Fill Factor on an ever-increasing keyed index that suffers Inserts and then those newly inserted rows are updated with an "ExpAnsive Update" (typically and as an example, something like a "Modified_By" column going from NULL to some value) does absolutely nothing to prevent fragmentation and only wastes space (a LOT of space).  You need to fix the "ExpAnsive Updates", instead.

    On a slightly different but extremely related topic, if you are using any of the MAX datatypes or the XML data type (all are referred to as LOBs) and you haven't set the table option to 'large value types out of row', your table and range scans on the Clustered Index are likely suffering greatly and your page density may be suffering a whole lot from "trapped short row pages".

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Or could be something like this:

    EXECUTE dbo.IndexOptimize

    @databases = 'ALL_DATABASES',

    @StatisticsSample = 100,

    @FragmentationLow = NULL,

    @FragmentationMedium = 'INDEX_REORGANIZE,INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',

    @FragmentationHigh = 'INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',

    @FragmentationLevel1 = 50,

    @FragmentationLevel2 = 80,

    @UpdateStatistics = 'ALL',

    @OnlyModifiedStatistics = 'Y',

    --@SortInTempdb = 'Y',

    @LogToTable = 'Y',

    @Resumable = 'Y',

    @TimeLimit = 14400,

  • I strongly recommend against that.  If you're waiting for a 50-80 mix of fragmentation, you've already waited for most of the fragmentation to happen.  This is now in the "why bother?" area and, to be sure, if you do reorg/rebuild now on all those indexes that you've not actually examined that have a "0" Fill Factor, you're going to pay hell on the "morning after" with shedloads of page splits and blocking.

    If you're using REORGANIZE for anything except for LOB compaction or indexes that have pages fuller than the Fill Factor or the Fill Factor is 0/100 (for either reorg or rebuild), then you're inviting all the page splits in the world to camp out on your server the morning after.  Remember, the indexes fragmented for a reason and doing any kind of index maintenance on 0/100 Fill Factor indexes is going to remove all free space and THAT is going to cause even more massive fragmentation than you already have.  If you're not going to figure out what's going on with your indexes, it's actually better to not do any index maintenance on your 0/100 Fill Factor tables and just do stats updates.

    That's NOT a guess on my part.  I went from 18 Jan 2016 until early 2020 doing just that and there were no performance issues especially compared to what was happening before that fateful day in 2016.

    If you have some indexes that have woefully low (less than 80% average percent of page fullness [page density]), then rebuild (NOT reorganize) them at 82% to recover the disk space and leave room for out-of-order inserts and ExpAnsive Updates until you can actually figure out what's going on with the index.  Why "82"?  The "2" is there to remind you that you still have something "to do".

     

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    If you have some indexes that have woefully low (less than 80% average percent of page fullness [page density]), then rebuild (NOT reorganize) them at 82% to recover the disk space and leave room for out-of-order inserts and ExpAnsive Updates until you can actually figure out what's going on with the index.  Why "82"?  The "2" is there to remind you that you still have something "to do". 

    You keep using "N2" and other such "magic" fillfactors as a "to do" reminder?  Seriously?  If for some reason you're not available, do you expect anyone else to understand what your "magic" fillfactors mean?  I have to admit, it wouldn't occur to me that "82" was supposed to tell me I had "something to do" with that index.

    I don't mean to be harsh, or worse Celko-like, but that seems a very odd expert advice to give to people.

    Why not instead have a table that states why specific fill factors were used for certain indexes, if they were, and what the maintenance pattern and needs of this index are, if they need special consideration?  That's what I do.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher wrote:

    Jeff Moden wrote:

    If you have some indexes that have woefully low (less than 80% average percent of page fullness [page density]), then rebuild (NOT reorganize) them at 82% to recover the disk space and leave room for out-of-order inserts and ExpAnsive Updates until you can actually figure out what's going on with the index.  Why "82"?  The "2" is there to remind you that you still have something "to do". 

    You keep using "N2" and other such "magic" fillfactors as a "to do" reminder?  Seriously?  If for some reason you're not available, do you expect anyone else to understand what your "magic" fillfactors mean?  I have to admit, it wouldn't occur to me that "82" was supposed to tell me I had "something to do" with that index.

    Why not instead have a table that states why specific fill factors were used for certain indexes, if they were, and what the maintenance pattern and needs of this index are, if they need special consideration?  That's what I do.

    If someone sets the Fill Factor to some "N2" value (as you say) and then leaves the company, the people that replace them are probably going to make all the same mistakes anyway.  If you have a separate table to keep track of it, they're just as likely to ignore it and they're just as likely to say "well... we need to change all this stuff to use REORGANIZE".

    There's more to worry about than the nitpick you just cited.

    What are you doing for index maintenance, Scott?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    ScottPletcher wrote:

    Jeff Moden wrote:

    If you have some indexes that have woefully low (less than 80% average percent of page fullness [page density]), then rebuild (NOT reorganize) them at 82% to recover the disk space and leave room for out-of-order inserts and ExpAnsive Updates until you can actually figure out what's going on with the index.  Why "82"?  The "2" is there to remind you that you still have something "to do". 

    You keep using "N2" and other such "magic" fillfactors as a "to do" reminder?  Seriously?  If for some reason you're not available, do you expect anyone else to understand what your "magic" fillfactors mean?  I have to admit, it wouldn't occur to me that "82" was supposed to tell me I had "something to do" with that index.

    Why not instead have a table that states why specific fill factors were used for certain indexes, if they were, and what the maintenance pattern and needs of this index are, if they need special consideration?  That's what I do.

    If someone sets the Fill Factor to some "N2" value (as you say) and then leaves the company, the people that replace them are probably going to make all the same mistakes anyway.  If you have a separate table to keep track of it, they're just as likely to ignore it and they're just as likely to say "well... we need to change all this stuff to use REORGANIZE".

    There's more to worry about than the nitpick you just cited.  As you say, "Seriously"?

    With that, "enquiring minds want to know"... What are you doing for index maintenance, Scott?

    Given that you've stated this multiple times with an assortment of different "magic" fillfactors -- 51, 61, 82, etc. -- I don't think it's a "nitpick" at all.  So, yeah, do you seriously recommend that others use "magic" fillfactor values to indicate that later action(s) should be taken against those indexes?

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • It is worth mentioning that REORGANIZE does not update statistics. REBUILD does.

    It is also worth mentioning that omitting FILLFACTOR defaults index fillfactor server setting (which is set to 0 when installing SQL Server).

    An easy demonstration reveals that changing the server-wide setting to 50gives an interesting result.

    Run this

    EXEC sys.sp_configure 'fill factor (%)', 50;
    RECONFIGURE WITH OVERRIDE;

    and restart server. Now execute this

    CREATE TABLE [dbo].[tbl_user](user_id bigint IDENTITY(1,1) CONSTRAINT PK_tbl_user PRIMARY KEY CLUSTERED,fname varchar(128) NULL,lname varchar(128) NULL,email varchar(128) NULL)

    INSERT dbo.tbl_userSELECT number, number, number FROM master.dbo.spt_values;

    Now we have created a test table.

    Execute this to see the fillfactor in use. You would expect 50% right?

    DBCC SHOWCONTIG (tbl_user);

    You are wrong. SQL Server uses 100% no matter what.

    What happens if we rebuild the index directly?

    ALTER INDEX ALL ON tbl_user REBUILD;
    DBCC SHOWCONTIG (tbl_user);

    Now the fillfactor is 50% as expected.

    What if we fiddle with the fillfactor?

    ALTER INDEX ALL ON tbl_user REBUILD WITH (FILLFACTOR = 100);
    DBCC SHOWCONTIG (tbl_user);

    ALTER INDEX ALL ON tbl_user REBUILD WITH (FILLFACTOR = 75);
    DBCC SHOWCONTIG (tbl_user);

    ALTER INDEX ALL ON tbl_user REBUILD WITH (FILLFACTOR = 50);
    DBCC SHOWCONTIG (tbl_user);

    ALTER INDEX ALL ON tbl_user REBUILD WITH (FILLFACTOR = 25);
    DBCC SHOWCONTIG (tbl_user);

    As you can see, they all work as expected. But what about the first rebuild, with no fillfactor?

    ALTER INDEX ALL ON tbl_user REBUILD;
    DBCC SHOWCONTIG (tbl_user);

    And it is still 25% because that is the last value used for the index.

    The conclusion is that we have a mismatch. No matter what the server-wide fillfactor is set to, until a rebuild of the index SQL Server will use 100%.

    Then doing a REBUILD without fillfactor SQL Server will use the last known fillfactor which is 50% (but not until then).

    Don't forget to switch back to 0 as default for server wide fill factor.


    N 56°04'39.16"
    E 12°55'05.25"

  • ScottPletcher wrote:

    Given that you've stated this multiple times with an assortment of different "magic" fillfactors -- 51, 61, 82, etc. -- I don't think it's a "nitpick" at all.  So, yeah, do you seriously recommend that others use "magic" fillfactor values to indicate that later action(s) should be taken against those indexes?

    I've wouldn't recommended something as low as "51" or "61" unless there's an extreme case where such a low fill factor would provide a benefit (it usually won't) but, yes... I don't make such recommendations unless I use those same recommendations including what you're calling "magic" fill factor numbers.

    Your turn, Scott.  Please answer my question... what are you doing for index maintenance?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 16 through 30 (of 38 total)

You must be logged in to reply to this topic. Login to reply