Online Indexing

  • Without knowing more about your index rebuilding script, I can't know if this information will be helpful. I'll volunteer it anyway:

    If you're using the function call "sys.dm_db_index_physical_stats" in order to determine if and which indexes to rebuild, you'll find that its execution on a large database with hundreds of indexes takes much longer to execute than the actual index rebuilding operations. After considerable experimentation, I've been able to only make small improvements in its execution times.

    If you are having to rebuild your indexes OFFLINE, then you're going to negatively impact the performance of any query that depends on that index while it is OFFLINE. It will be unavailable. This will most likely cause table scans which will further deteriorate your server's performance and extend the time to rebuild the index(es).

    I suggest that you survey all of your indexes. Determine the ones that are severely fragmented, and write SQL scripts to specifically rebuild those indexes ONLINE, if you're version of SQL Server will support ONLINE index rebuilding. I recommend staggering your index rebuilding operation jobs during off hours (if your server has off hours) such that you rebuild an index. Wait a little bit. Rebuild the next index. Wait a little bit. Repeat until all of the indexes that need rebuilding are rebuilt.

    I would generally recommend that any index that is over 40% fragmented be rebuilt.

    However, there are other factors to consider. For example: The inbound for our main server has very high traffic. I've evaluated the most severely fragmented indexes (90+% and higher) affected by inbound traffic and found that some of them go from 0% fragmentation to over 90% fragmentation within 10 minutes after they are rebuilt. This is true even though I have set the fill factor as low as 40%.

    We no longer rebuild those indexes. It is a waste of time and server resources. Interestingly, I have proved that even though some of the indexes are 98%-99% fragmented, they are still used by the SQL Server Optimizer for creating query plans. This contradicts just about everything I've read about the Optimizer but I've proven it.

    I hope this helps and gives you some ideas about how to selectively rebuild your indexes and reduce the time to execute the index rebuilding operations on your server.

    LC

    P.S. I specifically recommend that you do not REORGANIZE your indexes. Reorganization (defragmentation) takes about 8 times longer than rebuilding, if pages in the table are exclusively locked, the reorg will skip the sections of the indexes affected by the locks, and reorg will not update your statistics like an index rebuild will.

  • Good day,the server is a virtual server running on a SAN . How accurate will the I/O Stall readings be , even if we were to use SCOM ?

  • Good Day,

    I was under the impression that online indexing does not lock other transactions,but aparently there are still locks blocking other transactions .

  • lianvh (4/21/2011)


    Good Day,

    I was under the impression that online indexing does not lock other transactions,but aparently there are still locks blocking other transactions .

    I have had to resolve similar issues... Took several steps and lots of testing.

    There is a very important setting when you create (or Alter) an index that helps determine this. i.e. My Skeleton non-clustered Index Creation Script:

    CREATE NONCLUSTERED INDEX [XXX_NCIX] ON [dbo].[TableName]

    ([Coulumn_Name] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = ON, IGNORE_DUP_KEY = OFF, DROP_EXISTING = ON, ONLINE = ON, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    We found that if we did not enable ALLOW_ROW_LOCKS and ALLOW_PAGE_LOCKS TSQL that used the index and needed this level of lock on data had issues.

    We also discovered that explicitly defining the type of lock to be used in all SQL that created, updated, or deleted data rows in DB tables using table hints recuced these locking issue occurances from hourly to yearly.

  • P.S. I specifically recommend that you do not REORGANIZE your indexes. Reorganization (defragmentation) takes about 8 times longer than rebuilding, if pages in the table are exclusively locked, the reorg will skip the sections of the indexes affected by the locks, and reorg will not update your statistics like an index rebuild will.

    That is absolutely a BAD recommendation. REORGANIZE is definitely an appropriate action for certain scenarios - quite a few of them actually.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (4/22/2011)


    P.S. I specifically recommend that you do not REORGANIZE your indexes. Reorganization (defragmentation) takes about 8 times longer than rebuilding, if pages in the table are exclusively locked, the reorg will skip the sections of the indexes affected by the locks, and reorg will not update your statistics like an index rebuild will.

    That is absolutely a BAD recommendation. REORGANIZE is definitely an appropriate action for certain scenarios - quite a few of them actually.

    I agree with Kevin here. You should be doing statistics maintenance in addition to index maintenance anyway.

  • Jack Corbett (4/22/2011)


    TheSQLGuru (4/22/2011)


    P.S. I specifically recommend that you do not REORGANIZE your indexes. Reorganization (defragmentation) takes about 8 times longer than rebuilding, if pages in the table are exclusively locked, the reorg will skip the sections of the indexes affected by the locks, and reorg will not update your statistics like an index rebuild will.

    That is absolutely a BAD recommendation. REORGANIZE is definitely an appropriate action for certain scenarios - quite a few of them actually.

    I agree with Kevin here. You should be doing statistics maintenance in addition to index maintenance anyway.

    That's a very broad claim with no qualifiers. Why should he be doing statistics maintenance in addition to index maintenance?

    My quoted information about REORGANIZE and its liabilities were taken from the Microsoft white paper: http://technet.microsoft.com/en-us/library/cc966523.aspx. In addition to that document providing extensive details on the performance and characterisitics of REBUILD and REORGANIZE, search on this phrase for a table that succinctly compares the two: "Table 5 Comparison of DBCC DBREINDEX to DBCC INDEXDEFRAG".

    If you know of virtues associated with REORGANIZE that Microsoft does not, I'd like to hear about them.

    LC

  • crainlee2 (4/22/2011)


    Jack Corbett (4/22/2011)


    TheSQLGuru (4/22/2011)


    P.S. I specifically recommend that you do not REORGANIZE your indexes. Reorganization (defragmentation) takes about 8 times longer than rebuilding, if pages in the table are exclusively locked, the reorg will skip the sections of the indexes affected by the locks, and reorg will not update your statistics like an index rebuild will.

    That is absolutely a BAD recommendation. REORGANIZE is definitely an appropriate action for certain scenarios - quite a few of them actually.

    I agree with Kevin here. You should be doing statistics maintenance in addition to index maintenance anyway.

    That's a very broad claim with no qualifiers. Why should he be doing statistics maintenance in addition to index maintenance?

    My quoted information about REORGANIZE and its liabilities were taken from the Microsoft white paper: http://technet.microsoft.com/en-us/library/cc966523.aspx. In addition to that document providing extensive details on the performance and characterisitics of REBUILD and REORGANIZE, search on this phrase for a table that succinctly compares the two: "Table 5 Comparison of DBCC DBREINDEX to DBCC INDEXDEFRAG".

    If you know of virtues associated with REORGANIZE that Microsoft does not, I'd like to hear about them.

    LC

    Yes, that white paper does say that the specific tests show INDExFRAG running as much as 8 times longer, but you conveniently ignore this:

    The tests demonstrated that both DBCC DBREINDEX and DBCC INDEXDEFRAG can defragment indexes effectively and return the page density levels to near the original fillfactor of the table. Based on these results, your availability needs should determine which statement you should run.

    If you have a period of time to rebuild the indexes offline, DBCC DBREINDEX generally rebuilds indexes faster than DBCC INDEXDEFRAG. DBCC DBREINDEX takes full advantage of multiprocessor systems by creating the index in parallel across the available processors. DBCC INDEXDEFRAG is designed to be less intrusive to a production environment and has little impact to workload performance. The tests demonstrated that even when multiple concurrent streams were performing DBCC INDEXDEFRAG in parallel, the DSS workload performance was never affected by more than 10 percent. However, this design causes DBCC INDEXDEFRAG to take significantly longer to complete on larger indexes. In addition, the length of time it takes DBCC INDEXDEFRAG to defragment varies depending on load on the server at the time the statement is run.

    Defrag has to be slower because it takes fewer locks and is an online operation so you can live with the longer run time because availability is not affected, whereas, unless you are using Enterprise Edition, REINDEX/REBUILD does affect availability.

    Here are a couple of reasons to do statistics maintenance in addition to Index Maintenance:

    1. You can do Defrag instead of rebuild, particularly with indexes with lower fragmentation, and keep the affected tables/indexes available. You can then Update Statistics and gain the same performance improvements.

    2. By default, SQL Server creates statistics on columns that need to be maintained to allow the optimizer to make better choices when creating query plans.

    Now a quote from Paul Randal's Top Tips for Effective Database Maintenance:

    Analyze indexes and determine which indexes to operate on and how to do the fragmentation removal.

    For all indexes that were not rebuilt, update the statistics.

    Update statistics for all of the non-indexed columns

    Rebuilding indexes isn't always the best choice, reorganizing can be a valid option, and statistics maintenance is important.

    Edit: Thought I had put more in the white paper quote, but noticed it was missing so added it.

  • Defrag has to be slower because it takes fewer locks and is an online operation so you can live with the longer run time because availability is not affected, whereas, unless you are using Enterprise Edition, REINDEX/REBUILD does affect availability.

    Here are a couple of reasons to do statistics maintenance in addition to Index Maintenance:

    1. You can do Defrag instead of rebuild, particularly with indexes with lower fragmentation, and keep the affected tables/indexes available. You can then Update Statistics and gain the same performance improvements.

    2. By default, SQL Server creates statistics on columns that need to be maintained to allow the optimizer to make better choices when creating query plans.

    Now a quote from Paul Randal's Top Tips for Effective Database Maintenance:

    Analyze indexes and determine which indexes to operate on and how to do the fragmentation removal.

    For all indexes that were not rebuilt, update the statistics.

    Update statistics for all of the non-indexed columns

    Rebuilding indexes isn't always the best choice, reorganizing can be a valid option, and statistics maintenance is important.

    _______________

    It's difficult to accurately generalize from specifics but my experience with a very high load server performing OLTP and reporting (not a best practice) forced me to confront and make decisions about maintaining indexes and statistics.

    What I found is that the server could not tolerate the extra time to perform index reorganization. It overlapped too many other jobs that had to execute at specific times. And, since it did not address the issue of statistics maintenance, updating statistics added even more time to the entire maintenance process. Ultimately, the extended times to maintain indexes and statistics in this manner was, from an overall server performance standpoint, unacceptable. What was needed was an efficient, fast solution.

    I created a job that determines the fragmentation levels of all of the indexes in all of our databases, then selectively rebuilds those that exceed a certain threshold of fragmentation. This "rifle shot" approach is efficient in several ways:

    1. It does not waste server resources rebuilding (or reorganizing) indexes that do not need attention.

    2. It updates the statistics for the columns that comprise the indexes.

    3. It executes quickly and "gets the pain over with".

    I also experimented with selective statistics updates using the stored procedure "sp_updatestats". This supposedly selective statistics updating mechanism became more and more interfering with server performance as our databases grew and our online traffic and the reporting demands increased. After a considerable amount of experimentation, I completely eliminated the once-every-6-hour-execution of this stored procedure. There was no perceived or measurable benefit to its execution, only performance liabilities (Page Life Expectancy dropped considerably during its execution).

    Solution: Perform selective index rebuilding and simultaneously get the affected statistics updated as well. All things considered, this has by far been the best and least intrusive solution on our server.

    Regarding Paul Randall's quote: "Update statistics for all of the non-indexed columns", it would seem that there is a flaw in that statement, since statistics are only kept on search columns, which by definition, should be indexed, if you want to avoid table scans. I'm not sure I understand his recommendation.

    LC

  • crainlee2 (4/23/2011)


    Regarding Paul Randall's quote: "Update statistics for all of the non-indexed columns", it would seem that there is a flaw in that statement, since statistics are only kept on search columns, which by definition, should be indexed, if you want to avoid table scans. I'm not sure I understand his recommendation.

    He says 'update statistics', not create statistics. So, assuming that auto_create stats is on (which in general it should be), SQL may create stats on columns that are either not in indexes (because it's not worth creating indexes on them) or are not the leading column of an index (distribution stats are only kept on the leading column of an index) or, quite frankly, columns that should be indexed but aren't for some reason. It's good practice to keep those updated, otherwise it could lead to incorrect cardinality estimations and poor execution plans.

    If in your case you've determined that update stats is not necessary, that's fine. It's an educated and informed decision based on your environment. It is not, however, generalisable to all scenarios. Not all DBAs are able to make that decision, either because they're not familiar enough with the applications, don't have the skill or don't have the time to assess the situation. In those cases recommending blanket stats updates is better than the alternative.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • The point I was trying to make was that, as Gail has stated, you should not make blanket statements, like "never reorganize indexes" based on a specific experience. In your specific case, you have determined through much investigation and experimentation, that only rebuilds without statistics maintenance is necessary. In the environments in which I have worked, a combination of intelligent rebuilds and reorganization with statistics maintenance has been acceptable. I have found that there are very few absolutes when dealing with SQL Server because so often the best solution depends on many different factors.

  • Jack Corbett (4/23/2011)


    ...you should not make blanket statements...

    Heh... I've got a blanket statement... as always, "It Depends". 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • crainlee2 (4/23/2011)


    Regarding Paul Randall's quote: "Update statistics for all of the non-indexed columns", it would seem that there is a flaw in that statement, since statistics are only kept on search columns, which by definition, should be indexed, if you want to avoid table scans. I'm not sure I understand his recommendation.

    LC

    LC, I'll try to avoid the rest of this discussion and approach this in particular. The reason you'll have statistics on non-indexed columns is upkeep. A statistic, even with auto-update, will only need to be upkept after a certain thresh-hold (sp?). An index needs to be updated every IUD (ins/upd/del). A statistic is a nice mid-point, allowing for reasonable search expectations without constant interference in the IUD necessity, thus the recommendation. You usually use more statistics then indexes in an OLTP system. In OLAP, you'd index just about everything.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Jack Corbett (4/23/2011)


    I have found that there are very few absolutes when dealing with SQL Server because so often the best solution depends on many different factors.

    I have to agree with you based upon my own experiences. Every server environment is unique and requires at least some tailoring of upkeep and maintenance to be effective. As Jeff said, "It depends."

    I'm currently responsible for 2 servers, one with very high volumes of usage. This particular SQL Server environment has been one of the most challenging of my career. My employer has a problem with the current system implementation paradigm necessitating very high IUD for OLTP support and a very large number of indexes to support heavy reporting requirements. Fortunately, the server is very powerful and able to handle the current workload in spite of these conflicting requirements. Plus, the acquisition of a Compellent SAN produced additional resources to tolerate the conflict.

    After almost 2 years of explaining and demonstrating the conflicting requirements of OLTP and reporting environments to company management and the head of software engineering, I have finally convinced them that the next generation of application software must separate these 2 environments if the company's servers are going to continue to remain viable and adequately responsive to our clients needs. In the meantime, we're living with the conflict, and I have managed, by careful database maintenance, to keep our servers "ahead of the curve".

    Fundamentally, I agree with most of what all of you have said. I can envision situations where index reorganization and statistics maintenance can suffice. But, based upon my exhaustive testing, research, and experimentation, I am reluctant to use them when online index rebuilding is available as an option. That is why I generally recommend to others that they consider the virtues of online index rebuilding (if it is an option) instead of independent index reorganization and statistics updating operations.

    LC

  • 1) There are many places where microsoft recommendations are suboptimal, poor, inappropriate, dead wrong, etc.

    2) MANY resources you can find on the web are OUTDATED. This case is definitely one of those. If you go to Books Online for SQL Server 2005 you will find that DBCC DBREINDEX and DBCC INDEXDEFRAG are both DEPRECATED.

    3) I don't know you from Adam, but you are bucking heads with people I know personally to be VERY VERY qualified at interacting with the SQL Server Relational Engine. I highly recommend you start listening and paying attention instead of selectively quoting online research and going on about your own testing and experience. In the words of Bill Cosby "you might just learn something". 😎

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

Viewing 15 posts - 16 through 30 (of 34 total)

You must be logged in to reply to this topic. Login to reply