Memory Justification

  • I am seeing a couple of our SS 2005 servers with Page Life Expectanices commonly below 300 for an extended period of time. So to me that says I'm low on memory and could benefit almost for sure from adding more. My first question: do you agree?

    What I find puzzling is that my pages/sec is not really high and my buffer cache hit ratio is right near 100. So that means I'm not having a lot of paging, right? And if I'm not getting a lot of paging then I don't really need more memory, right?

    So I'm wondering if which is right: do I need more memory or not?

    Also, if you have any other good memory counters to share and thresholds, I'd appreciate it...

  • Memory paging has very little to do with how much memory SQL Server needs.

    Even with a high buffer cache hit ratio, if your Page Life Expectancy is low, you may need more memory.

  • Can you say a little more about this? I thought that if Sql Server ran out of memory, then it had to page? So if you're not getting paging, then you've got enough in memory, right?

    I'm sure that's wrong, but I'm hoping someone can tell me how I am thinking incorrectly.

    thx.

  • Whisper9999 (4/23/2010)

    ...I thought that if Sql Server ran out of memory, then it had to page? So if you're not getting paging, then you've got enough in memory, right?...

    To answer your questions, No and No

  • Well, if it's data pages, it has to page, right? Now if it's cache (for stored procs, etc.), then it can just flush some of those out. But doesn't it have to store those data pages somewhere and, if there's nothing in memory, it has to go to disk with it, right?

    Thx again.

  • bump.

    Hoping to get some more insight.

  • Here's more insight for you: Low PLE simply means pages of data are being removed from the buffer pool so 'fresher' pages can be put in for processing. If you have low PLE but high buffer cache hit ratio and users aren't complaining about slowness (and you aren't seeing lots of specific related waitstats such as PageIOLatch..) then what is happening is that queries are needing data that isn't in RAM and that data is being put into RAM sufficiently fast because your IO subsystem is keeping up with demand. That is how things are supposed to go.

    However, once your system load gets above a certain point your performance will take a serious nose dive when the IO system can't keep up because you are already at the point where RAM is a bottleneck.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (4/27/2010)


    Here's more insight for you: Low PLE simply means pages of data are being removed from the buffer pool so 'fresher' pages can be put in for processing. If you have low PLE but high buffer cache hit ratio and users aren't complaining about slowness (and you aren't seeing lots of specific related waitstats such as PageIOLatch..) then what is happening is that queries are needing data that isn't in RAM and that data is being put into RAM sufficiently fast because your IO subsystem is keeping up with demand. That is how things are supposed to go.

    However, once your system load gets above a certain point your performance will take a serious nose dive when the IO system can't keep up because you are already at the point where RAM is a bottleneck.

    That makes sense. But then let's say I have a server with a PLE below 300 for literally hours in a day. What kind of buffer cache is a problem? The reason I'm asking is that I've read - can't remember where - that buffer cache below even 99.9 or 99.5 for an extended pd of time is a sign of a problem in a modern system. Would you agree with that?

  • Whisper9999 (4/27/2010)


    TheSQLGuru (4/27/2010)


    Here's more insight for you: Low PLE simply means pages of data are being removed from the buffer pool so 'fresher' pages can be put in for processing. If you have low PLE but high buffer cache hit ratio and users aren't complaining about slowness (and you aren't seeing lots of specific related waitstats such as PageIOLatch..) then what is happening is that queries are needing data that isn't in RAM and that data is being put into RAM sufficiently fast because your IO subsystem is keeping up with demand. That is how things are supposed to go.

    However, once your system load gets above a certain point your performance will take a serious nose dive when the IO system can't keep up because you are already at the point where RAM is a bottleneck.

    That makes sense. But then let's say I have a server with a PLE below 300 for literally hours in a day. What kind of buffer cache is a problem? The reason I'm asking is that I've read - can't remember where - that buffer cache below even 99.9 or 99.5 for an extended pd of time is a sign of a problem in a modern system. Would you agree with that?

    No, I would not agree with that. If users aren't complaining of slow performance then you probably have lots of more important things to be focusing on. 🙂 The thing is you are missing some more detailed pieces of information here, such as those waitstats and fileIO stall stats. Those will be better determinants of actual query performance to be honest with you. Search the web for this white paper if you want to go down the waitstats rabbit hole: SQL Server 2005 Waits and Queues Best Practices Article. Or you can hire a performance tuning professional to give your systems a review and mentor you in how to do the same after he/she leaves. Win-win-win

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (4/27/2010)


    Whisper9999 (4/27/2010)


    TheSQLGuru (4/27/2010)


    Here's more insight for you: Low PLE simply means pages of data are being removed from the buffer pool so 'fresher' pages can be put in for processing. If you have low PLE but high buffer cache hit ratio and users aren't complaining about slowness (and you aren't seeing lots of specific related waitstats such as PageIOLatch..) then what is happening is that queries are needing data that isn't in RAM and that data is being put into RAM sufficiently fast because your IO subsystem is keeping up with demand. That is how things are supposed to go.

    However, once your system load gets above a certain point your performance will take a serious nose dive when the IO system can't keep up because you are already at the point where RAM is a bottleneck.

    That makes sense. But then let's say I have a server with a PLE below 300 for literally hours in a day. What kind of buffer cache is a problem? The reason I'm asking is that I've read - can't remember where - that buffer cache below even 99.9 or 99.5 for an extended pd of time is a sign of a problem in a modern system. Would you agree with that?

    No, I would not agree with that. If users aren't complaining of slow performance then you probably have lots of more important things to be focusing on. 🙂 The thing is you are missing some more detailed pieces of information here, such as those waitstats and fileIO stall stats. Those will be better determinants of actual query performance to be honest with you. Search the web for this white paper if you want to go down the waitstats rabbit hole: SQL Server 2005 Waits and Queues Best Practices Article. Or you can hire a performance tuning professional to give your systems a review and mentor you in how to do the same after he/she leaves. Win-win-win

    Thx for the tips.

  • You should confirm a memory issue by examining the following counters:

    SQLServer:Buffer Manager - Page Life Expectancy

    300 seconds or less may be indicative of low memory

    SQLServer:Buffer Manager - Buffer cache hit ratio

    Less than 98 percent of a hit ratio may indicate insufficient memory

    SQLServer:Buffer Manager- Stolen Pages

    High # relative to the total target pages (DBCC MEMORYSTATUS)

    SQLServer:Buffer Manager- Memory Grants Pending

    Zero and close to zero is best because then the server isn't queuing for memory.

    SQLServer:Buffer Manager- Checkpoint pages/sec

    More frequent checkpoints indicates low memory. If you see a high rate of checkpoints per second compared with normal (do you have a normal baseline?) then this is another indication of low memory

    SQLServer:Buffer Manager- Lazy writes/sec

    Zero and close to zero is best. If you're seeing this > 20 per second then the memory buffer pool isn't large enough

    So you need to validate a memory pressure situation using all of these counters and then you'll better be able to determine whether or not you're under a significant memory pressure.

    As someone suggested above, if you hire a performance guru they'll check your sql server (hardware, queries,etc...) and you'll learn a lot in the process.

    Steve

  • Don't confuse the virtual memory system and its paging file with the SQL buffer pages. Virtual memory pages are 4 KB pieces of the memory used by the SQL Server process (hopefully only code pages), while page buffer pages are 8KB chunks of a SQL Server database data file.

    Ideally the memory allocator for SQL Server is able to lock buffer pages in memory so they cannot be paged out by the operating system. The SQL Server process itself may be paged, but not the page buffer.

    When SQL recognizes memory pressure, it tries to release buffer pages. Buffer pages that have been modified must be written to their mdf file on disk before they can be released. If pages in the page buffer are paged out to the page file, they have to be paged back in from the page file so they can be written to the database file and released. Serious thrashing ensues.

  • Scott Coleman (4/28/2010)


    Don't confuse the virtual memory system and its paging file with the SQL buffer pages. Virtual memory pages are 4 KB pieces of the memory used by the SQL Server process (hopefully only code pages), while page buffer pages are 8KB chunks of a SQL Server database data file.

    Ideally the memory allocator for SQL Server is able to lock buffer pages in memory so they cannot be paged out by the operating system. The SQL Server process itself may be paged, but not the page buffer.

    When SQL recognizes memory pressure, it tries to release buffer pages. Buffer pages that have been modified must be written to their mdf file on disk before they can be released. If pages in the page buffer are paged out to the page file, they have to be paged back in from the page file so they can be written to the database file and released. Serious thrashing ensues.

    1) Locking pages in memory can be the CAUSE of problems and one should do investigations prior to doing this:

    http://support.microsoft.com/kb/918483

    “You should make additional considerations before you assign the ‘Lock pages in memory’ user right. If you assign this user right on systems that are configured incorrectly, the system may become unstable or experience a performance decrease of the whole system. Additionally, event ID 333 may be logged in the event log.”

    2) Does SQL Server actually recognize memory pressure? I don't think it does - I though it simply responds (sometimes) to memory pressure notifications from the OS. The SQL Server Internals books are a great resource for those who want to delve into this corner of the engine.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Guys,

    Thx for the many excellent comments.

  • According to the Microsoft engineers that wrote SQLOS (and were available for questioning at the PASS conference), SQL Server responds to memory pressure by recognizing when available physical RAM drops below a certain threshhold. It does not wait for notification from the OS.

    And while I would not be willing to give any other applications on a SQL Server system the right to lock pages in memory, I can't imagine running SQL Server on a 64-bit system without it. If the system is so "configured incorrectly" that it will be unstable if SQL Server tries to manage the page buffer memory allocation, you shouldn't be trying to run SQL Server on it in the first place.

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply