Massive IOs!

  • Hi,

    I was contacted by our SAN administrator about massive IOPS from a particular server. This server was attributing up to 50% (FIFTY!!) of the SAN activity. The admin was witnessing several large spikes of IO activity. This particular server was extremely misconfigured when it was put into production before I got here. Its running on a 32bit environment with 4gb of ram. I would expect to see memory buffer pool drop below 90% very often on this server but I have not witnessed this.

    After running initial traces I located a stored procedure that's executed from upwards of 50 workstations every minute. The seemed the likely culprit, so I did some testing against a test environment and saw 9 scans and logical reads of about 18500. (Test environment is very different than the prod environment). The Execution plan suggested implementing a missing index. After implementing the missing index, the scans dropped to 2 and logical reads dropped to 10. We put this into production with the hopes of solving the issue, but sadly no changes were witnessed.

    Back to the drawing board. I closely monitored this system for spikes in activity. I noticed quite regularly that synchronous reads and read ahead reads would greatly spike. Upwards of the 40,000 to even 80,000 I believe. BOL states read ahead reads:

    Read-ahead anticipates the data and index pages needed to fulfill a query execution plan and brings the pages into the buffer cache before they are actually used by the query.

    This led me to consider that if read ahead reads are that high, that pages being brought into the memory buffer, maybe they're being flushed quickly too. So I queried page life expectancy with:

    SELECT [object_name],

    [counter_name],

    [cntr_value]

    FROM sys.dm_os_performance_counters

    WHERE [object_name] LIKE '%Manager%'

    AND [counter_name] = 'Page life expectancy'

    I have seen varying results from 0 to 30 seconds. During spikes PLE doesn't get much higher than 4. I read that you should be expecting 300 on a properly tuned system.

    Is my thought process correct here? Would these read aheads be the culprit of massive IO's. What can be done about low PLE's on this system? Is there anything index wise or query tuning we can accomplish this? Is more RAM the only option?

    Edit: as mentioned, the buffer cache doesn't look abnormal. The plan cache hovers between 35 and 50%.

  • #1 get a copy of prod to your dev/qa machine.

    #2 http://www.sqlskills.com/BLOGS/PAUL/post/Page-Life-Expectancy-isnt-what-you-think.aspx

    #3 Keep tuning with this. Group & Sort by highest reads / writes and tune the worst offenders untill the problem is solved.

    http://www.simple-talk.com/sql/performance/finding-the-causes-of-poor-performance-in-sql-server,-part-1

    http://www.simple-talk.com/sql/performance/finding-the-causes-of-poor-performance-in-sql-server,-part-2

    #4 If this were a 64 bit server I'd suggest maxing out the ram so you go less to disk as ram is cheap this days. But this would require a majog upgrade for you so it's not really a quick fix.

  • Add memory (move to 64 bit if possible, if not make sure AWE is enabled)

    Tune queries. Missing indexes are suggestions, not directives, test on a representative data volume, tuning indexes on a cut-down database is an exercise in frustration.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • thanks. I followed Gail's performance articles when initially researching this issue and grouped by highest total. I tuned the highest offender with the missing index but it unfortunately did nothing to help physical IO witnessed by the SAN admin. I ran another trace after implementing the index but haven't evaluated it yet to compare results.

    I will read the link you sent about NUMA

  • jshurak (10/26/2011)


    thanks. I followed Gail's performance articles when initially researching this issue and grouped by highest total. I tuned the highest offender with the missing index but it unfortunately did nothing to help physical IO witnessed by the SAN admin. I ran another trace after implementing the index but haven't evaluated it yet to compare results.

    I will read the link you sent about NUMA

    It's not really about numa, it's about the figure itself and why should be used now to detect a problem.

    As Gail said, tuning on a <massively> cut down version of the db is just a waste of time.

    Give it another go to tune on a copy of your latest backup of prod locally and see if those changes help.

    Worse comes to worse I can come in for a quick consult to try and figure it out for you.

  • Ninja's_RGR'us (10/26/2011)


    Worse comes to worst I can come in for a quick consult to try and figure it out for you.

    Or I can do a remote perf review, or TheSQLGuru can fly in, or... Lots of us do tuning here.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (10/26/2011)


    Ninja's_RGR'us (10/26/2011)


    Worse comes to worst I can come in for a quick consult to try and figure it out for you.

    Or I can do a remote perf review, or TheSQLGuru can fly in, or... Lots of us do tuning here.

    Well you did write those articles so I figured he could think of you by himself :Whistling:.

  • To the OP: you can spend days or even weeks trying to hunt down performance issues if you aren't trained/skilled/experienced at doing so. Or you can do as others have suggested and get a consultant in to both do a performance review and mentor you on how to do the same. HUGE ROI there if you get the right person.

    To Ninja: I know rocket-science smart people who have NO BUSINESS interacting with actual clients in a consulting gig. So just writing an article doesn't qualify someone to be a consultant. I note that Gail is NOT one of those people, btw! LOL

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • JOKE =>

    Don't you love the smell of backstabbing in the morning? 🙂

    Note to the <non hiring> op. Both these 2 seem to love using knives while I don't need it, so I guess it's a big plus for me.

    :hehe:

    (Gail is a blackbelt and Kevin is trying to pull ahead in the non race to get this consulting gig).

  • Ninja's_RGR'us (10/26/2011)


    GilaMonster (10/26/2011)


    Ninja's_RGR'us (10/26/2011)


    Worse comes to worst I can come in for a quick consult to try and figure it out for you.

    Or I can do a remote perf review, or TheSQLGuru can fly in, or... Lots of us do tuning here.

    Well you did write those articles so I figured he could think of you by himself :Whistling:.

    Someone having written an article doesn't automatically mean they're available as a consultant. When I wrote those I was working full time and could not have done any consulting work.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Ninja's_RGR'us (10/27/2011)


    Don't you love the smell of backstabbing in the morning? 🙂

    Note to the <non hiring> op. Both these 2 seem to love using knives while I don't need it, so I guess it's a big plus for me.

    I can't tell if you're accusing Kevin of backstabbing, or me, or both, but regardless that's uncalled for.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (10/27/2011)


    Ninja's_RGR'us (10/27/2011)


    Don't you love the smell of backstabbing in the morning? 🙂

    Note to the <non hiring> op. Both these 2 seem to love using knives while I don't need it, so I guess it's a big plus for me.

    I can't tell if you're accusing Kevin of backstabbing, or me, or both, but regardless that's uncalled for.

    Not my intention and maybe too much of an inside joke.

    So I guess smiley faces not enough here?

  • Ninja's_RGR'us (10/27/2011)


    GilaMonster (10/27/2011)


    Ninja's_RGR'us (10/27/2011)


    Don't you love the smell of backstabbing in the morning? 🙂

    Note to the <non hiring> op. Both these 2 seem to love using knives while I don't need it, so I guess it's a big plus for me.

    I can't tell if you're accusing Kevin of backstabbing, or me, or both, but regardless that's uncalled for.

    Not my intention and maybe too much of an inside joke.

    So I guess smiley faces not enough here?

    Given the smiley I would call if more of an outside joke. 😛

    In any case I am not actually "in the running" for this non-job consulting gig either. I am (very thankfully) over-booked for months out. 😎 Which is actually a shame. I absolutely LOVE doing this type of stuff and I am certain it would be like every single other similar perf review I have done in 15 years: shooting fish in a barrel! 🙂 Everyone, no matter whether Fortune 100 or mom-and-pop corner store, does most of the same "suboptimal" stuff!!!

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (10/27/2011)


    Ninja's_RGR'us (10/27/2011)


    GilaMonster (10/27/2011)


    Ninja's_RGR'us (10/27/2011)


    Don't you love the smell of backstabbing in the morning? 🙂

    Note to the <non hiring> op. Both these 2 seem to love using knives while I don't need it, so I guess it's a big plus for me.

    I can't tell if you're accusing Kevin of backstabbing, or me, or both, but regardless that's uncalled for.

    Not my intention and maybe too much of an inside joke.

    So I guess smiley faces not enough here?

    Given the smiley I would call if more of an outside joke. 😛

    In any case I am not actually "in the running" for this non-job consulting gig either. I am (very thankfully) over-booked for months out. 😎 Which is actually a shame. I absolutely LOVE doing this type of stuff and I am certain it would be like every single other similar perf review I have done in 15 years: shooting fish in a barrel! 🙂 Everyone, no matter whether Fortune 100 or mom-and-pop corner store, does most of the same "suboptimal" stuff!!!

    As I said earlier Keven, I'm available for some overflow if it can be worked out (apparently you don't get my pms ;-))

  • TheSQLGuru (10/27/2011)


    In any case I am not actually "in the running" for this non-job consulting gig either. I am (very thankfully) over-booked for months out.

    I wish I was. Most companies are starting to wind down for the year. Probably won't see much until Feb or so.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 15 posts - 1 through 15 (of 27 total)

You must be logged in to reply to this topic. Login to reply