Index fragmentation and SSDs

  • Index fragmentation is bad on traditional, spinning disk drives because the read head has to dance all around to gather the scattered data, slowing everything down. With solid state drives there are no moving parts and thus access is much, much faster. Would index fragmentation even be a problem with SSDs? Wouldn't data access be just as fast as unfragmented data?

  • Jonathan Kehayias did a benchmark that showed that SSDs were actually slower at random IO than sequential. But the difference is pretty small.

    But there are other reasons to be concerned about fragmentation, primarily increased IO and especially lower RAM efficiency due to lower average page full percentages. Additional locking overhead is a small increment.

    I note that there is a growing school of thought that we can essentially stop worrying about index fragmentation on most tables, even on rotating rust.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • How is IO increased?

  • Partially full pages, as a fragmented index will have more free space on pages than it needs (due to the page splits that caused the fragmentation. Hence reading more pages than necessary to get the same data, plus less efficient memory usage, larger database, etc.

    Plus fragmentation isn't a case of random access hence slower. It's a case of more IOs for the same data. SQL can read, as a single IO operation, up to 1024 pages (I think) of a table/index, providing they are contiguous. If they're not contiguous, then it'll need to do more IOs to get the same amount of data

    https://www.sqlskills.com/blogs/jonathan/does-index-fragmentation-matter-with-ssds/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Brent Ozar put on several presentations talking about how much time is wasted by doing defrags. Yeah... I was highly skeptical and thought he was drinking bong water with 2 straws. Kendra Little had a slightly different but related post on how stats made more of a difference than indexes.

    Then, I ran into a problem. I'd repaired the index maintenance routines. They were missing a whole lot of indexes that I confirmed became "highly fragmented" over time. Some of these were on some hefty tables that get used all the time. But, every morning after the index maintenance ran, there would be a whole lot of timeouts and blocking, apparently because of the page splits that were occurring on the non-clustered indexes (the clustered indexes are based on a properly narrow, never changing, always increasing, unique IDENTITY column so weren't getting fragged). I didn't want to waste a whole bunch of space by using something like a FILL FACTOR of 80 or 90 and I didn't really want to take the time to figure those out, either.

    So, I did a test that started on Jan 18th, 2016, the day of blocking after the Sunday rebuilds. I turned off the index maintenance job to see what would happen. I was sure that the system would slow down after only a week. I did, however, keep the stats update job running.

    Being as nervous as a politician during a FOX sponsored debate, I checked performance every day. Oddly enough, after a week, the blocking had all but vanished and overall performance of the system increased and average CPU usage actually dropped a couple of percent as did disk IO (which dropped quite a bit). I checked database growth... it didn't "go vertical" or anything like that. In fact, it seemed rather normal (remember, the CIs aren't being heavily fragmented because of the IDENTITY keys many of the tables use).

    To make a much longer story shorter, I've not rebuilt/reorged indexes since that fateful day in January and, contrary to what I would have guess and what we think we all know, performance has only gotten better. It appears that the "natural" FILL FACTOR caused by page splits in the NCIs is just what the doctor ordered. Yep... it takes a little extra space but I'm amazed that average CPU usage across all 16 CPUs has dropped from an average of 18% to an average of about 9%. Disk IO (File Read and Writes) plummeted from an average of about 3.0MBytes/Sec to 0.6MBytes/Sec on a system that had no SSDs.

    As a set of side benefits, Point-in-Time Transaction log backups have gotten a lot smaller than they used to be during index maintenance. That leaves more time and resources available for nightly jobs even though I have the "expensive edition" and the maintenance job would do all rebuilds (those index that needed rebuilds) online.

    Ok... I'm not suggesting that you run right out and kill all of your index maintenance jobs. Taking my or Brent's or Kendra's advice to that extreme is foolish especially since each system and even each database can be so different than the ones I have. I AM, however, suggesting that it's worth the experiment. And don't even think about not updating stats. Make sure your stats job(s) continue to run.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I have done what you did (in doing no index mx and allowing the system to come to something approaching a "steady state fill factor" with just a couple of iterations of reviewing the output logs from Ola Hallengren's index maintenance script. It isn't difficult nor rocket science. I have done this at numerous clients over the years. And in doing that I get sequential IO benefits on any scans of indexes which you now never get. I also avoid that period where page splits are happening a lot and the much longer period where page density is slowing increasing up from 50%. And I don't get that "blizzard" of locking/blocking/latching that happens right after you rebuild everything back up to 100% full. Note that I only do this on indexes where the DML affects the range of values - not things like an ever-increasing date, orderid, etc.

    To each their own though. I am happy that your system is humming along and do not at all feel that this dramatic change in viewpoint doesn't have merit.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Unfortunately we are seeing the exact opposite. We accidentally altered our index maintenance job so since late March we've only been defragging about 20% of our indices. Since then the indices have just gotten more and more fragmented. Performance has degraded to the point where our wait numbers are climbing and we're getting a lot of application time outs. Of course we are now going to fix the index maint job and return to defragging all indices. FWIW, we are using Ola's procedures and are running on SSDs.

  • Mick Opalak (6/8/2016)


    Unfortunately we are seeing the exact opposite. We accidentally altered our index maintenance job so since late March we've only been defragging about 20% of our indices. Since then the indices have just gotten more and more fragmented. Performance has degraded to the point where our wait numbers are climbing and we're getting a lot of application time outs. Of course we are now going to fix the index maint job and return to defragging all indices. FWIW, we are using Ola's procedures and are running on SSDs.

    Is it because of the fragmentation or did you also stop rebuilding stats for the indexes you now skip? 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • No, we're still updating the stat several times per day.

  • Mick Opalak (6/8/2016)


    Unfortunately we are seeing the exact opposite. We accidentally altered our index maintenance job so since late March we've only been defragging about 20% of our indices. Since then the indices have just gotten more and more fragmented. Performance has degraded to the point where our wait numbers are climbing and we're getting a lot of application time outs. Of course we are now going to fix the index maint job and return to defragging all indices. FWIW, we are using Ola's procedures and are running on SSDs.

    This statement (combined with your updating stats multiple times per day) makes me feel that

    a) You are SCANNING indexes quite often. There are a wide range of factors that can cause said scans.

    b) You don't have sufficient memory and the reduced fill factor of the pages is leading to poor memory "effectiveness".

    In any case, I would want proof about the EXACT cause of the timeouts. If you aren't running sp_whoisactive during extended blocking or a blocked process capture then do you REALLY know??

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Memory = 2TB

    Fill Factor = 75 because our system (OLTP) is so active (3500 transactions at any given time) that indices get highly fragmented (like over 90%) daily.

  • Mick Opalak (6/8/2016)


    Memory = 2TB

    Fill Factor = 75 because our system (OLTP) is so active (3500 transactions at any given time) that indices get highly fragmented (like over 90%) daily.

    1) (Nonsequential) GUIDs perhaps?

    2) If you are getting 90% fragmented daily then the DBA isn't doing his/her job - period. Fill Factors should have LONG AGO been adjusted to stop that. It's just plain silly to have that happen. Excuse my bluntness but there is just no other way to put it.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply