Buffer cache size much lower than Max Memory config + low PLE

  • I have a virtual server (VMware ESX) with 64GB RAM running a single instance of SQL 2012 SP1. The max memory config is set to 59392 (58GB).

    The Page Life Expectancy for this server has been averaging well under 10 mins for the last few days, according to our monitoring.

    I have been checking the amount of data in the buffer cache periodically during the day with the below query, which seems to show that there is never more than about 10GB of data at any one time, frequently dropping below 5GB:

    SELECT COUNT(*) AS BufferPages,

    CONVERT(decimal(10, 2), COUNT(*) / 128.0) AS BufferMB

    FROM sys.dm_os_buffer_descriptors

    Why would the amount of cached data be so low (and cause so much churn)?

    I am aware that other things will require some of that memory (plan cache etc.) but with Max Mem of 58GB, I would expect there to be a much higher amount of actual cached data at any one time. I did the same checks on another VM with the same amount of RAM/Max Mem setting, and there was 50GB of data in the cache, with PLE measured in hours.

    -----
    JL

  • James Lean (5/22/2014)


    I have a virtual server (VMware ESX) with 64GB RAM running a single instance of SQL 2012 SP1. The max memory config is set to 59392 (58GB).

    The Page Life Expectancy for this server has been averaging well under 10 mins for the last few days, according to our monitoring.

    I have been checking the amount of data in the buffer cache periodically during the day with the below query, which seems to show that there is never more than about 10GB of data at any one time, frequently dropping below 5GB:

    SELECT COUNT(*) AS BufferPages,

    CONVERT(decimal(10, 2), COUNT(*) / 128.0) AS BufferMB

    FROM sys.dm_os_buffer_descriptors

    Why would the amount of cached data be so low (and cause so much churn)?

    I am aware that other things will require some of that memory (plan cache etc.) but with Max Mem of 58GB, I would expect there to be a much higher amount of actual cached data at any one time. I did the same checks on another VM with the same amount of RAM/Max Mem setting, and there was 50GB of data in the cache, with PLE measured in hours.

    1) VMware could be misconfigured. I see this all the time.

    2) There were a number of memory-related bugs in SQL Server 2012 from their complete rewrite of the memory management system. Some of these were memory leaks. What patch level are you on?

    3) SQL Server could be misconfigured.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • What does the Total Server Memory counter look like?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Kevin,

    Build is 11.0.3000 - 2012 SP1.

    The VM is our standard built template, fairly standard config. If there are any specific configuration issues that you think might be a problem I can certainly double check? I did check the memory counters for this VM at the ESX level earlier today and confirmed there was no ballooning happening.

    Gail,

    Total/Target server memory are both at a constant 58GB.

    -----
    JL

  • ballooning, reservations, hard-force memory request.

    Build 3000 is a BAD place to be on SQL 2012. Windows installer can keep running and failing indefinitely, using resources. Worse is that while it is doing that your registry gets massively bloated, eventually hitting a limit and poof goes the machine. There are a BUNCH of important fixes released after SP1.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (5/22/2014)


    Windows installer can keep running and failing indefinitely, using resources.

    I am aware of this issue but have never experienced it personally. I assume you'd see an instance of setup.exe (or similar msi process) in Task Manager on the server?

    Our usual policy is to stay at the latest service pack and not apply CUs unless there is a specific fix we need. I would need to gather more evidence of a specific issue before upgrading.

    The other query I have is from Glenn Berry's diagnostics, showing total memory clerk usage:

    SELECT TOP(10) [type] AS [Memory Clerk Type],

    SUM(pages_kb)/1024 AS [Memory Usage (MB)]

    FROM sys.dm_os_memory_clerks WITH (NOLOCK)

    GROUP BY [type]

    ORDER BY SUM(pages_kb) DESC OPTION (RECOMPILE);

    This is currently showing the following results, confirming the low buffer usage, but there doesn't appear to be anything else using up the available memory?

    CACHESTORE_SQLCP5424

    MEMORYCLERK_SQLBUFFERPOOL1196

    USERSTORE_TOKENPERM626

    CACHESTORE_OBJCP515

    OBJECTSTORE_LOCK_MANAGER390

    OBJECTSTORE_XACT_CACHE348

    MEMORYCLERK_SOSNODE323

    CACHESTORE_PHDR248

    MEMORYCLERK_SQLOPTIMIZER196

    MEMORYCLERK_SQLLOGPOOL148

    -----
    JL

  • Also, on the VM side, the ESX host has 260GB RAM. The total assigned RAM across all VMs is only 160GB. The performance charts show no ballooning is occurring.

    This particular VM has the default resource allocation options, i.e. normal shares, no reservation and no limit.

    -----
    JL

  • 1)

    ...

    Our usual policy is to stay at the latest service pack and not apply CUs unless there is a specific fix we need. I would need to gather more evidence of a specific issue before upgrading.

    Your company policy is doing you NO FAVORS! Microsoft does NOT issue Service Packs like it used to. Some builds are WAY outdated from an SP perspective! They truly have changed by and large to a frequent CU model for SQL Server. If you stick to that policy you need to review EVERY fix released and carefully evaluate if you are/might be exposed to it.

    2) Yes, you would see a process running regularly (trustedinstaller or some such maybe?) for the SP1 bug. You can also take a look at the affected registry hive size too.

    3) Have you reviewed dbcc memorystatus to see if anything stands out? Without access to your system I am still leaning towards some form of memory leak from some application - not necessarily SQL Server but that is probably the most likely culprit.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Kevin,

    Microsoft always used to state that one should only apply a particular CU if one or more issues addressed by that CU were being experienced...pretty much in line with the OP's policy. Have Microsoft officially changed their stance on CUs now?

    Regards

    Lempster

  • Lempster (5/23/2014)


    Kevin,

    Microsoft always used to state that one should only apply a particular CU if one or more issues addressed by that CU were being experienced...pretty much in line with the OP's policy. Have Microsoft officially changed their stance on CUs now?

    Regards

    Lempster

    No idea if they have changed their official policy. I deal with what reality provides me, and that is WAY stretched out SP releases. Think 5, 6, 700 or more days. 2008 And 2008R2 go out of support in July and it has been an unacceptably long time for both. There are just too damn many bug fixes when you are up to 10, 11, 12+ CUs. Besides, how many SPs have come out with bug still in play (or created new ones) - ALL of them. And some have been DISASTROUSLY bad ones.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • I think I've found the culprit!

    We have a database audit running against one of the databases on this server, monitoring access on a single table and writing results to a local audit file. Every 10 minutes a collection process runs on our monitoring server that gets the latest audit data using sys.fn_get_audit_file. Looking at the perfmon counters, I can see every time this runs the PLE and the Database Pages drop right down to almost 0.

    I disabled this collection job earlier and since then the used buffer space and PLE have both been steadily rising (buffer currently up to 30GB).

    Next question is why simply selecting this audit data causes the entire cache to be flushed??

    -----
    JL

  • It is either a bug or "by design" (that happens to be a bad design). Check Connect for similar issues. Post one if don't find it.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • I experienced this same issue a week ago. No ballooning, VMware was fine. The problem was an IO Fusion profiler that was sucking memory from the SQL server. PLE dropped and the performance went down really bad. I requested our VM admin to kill that thing and the problem went away.

    Kevin,

    You posted this:

    Your company policy is doing you NO FAVORS! Microsoft does NOT issue Service Packs like it used to. Some builds are WAY outdated from an SP perspective! They truly have changed by and large to a frequent CU model for SQL Server. If you stick to that policy you need to review EVERY fix released and carefully evaluate if you are/might be exposed to it.

    If you have any official MS link I would like to save it. Like others, our IT department tries to stay away from CU unless there is a compelling reason and problem that force us to apply one. It would be beneficial to know if MS officially acknowledge that so we can revise that policy.

  • Like I said - I have no knowledge of their official stance or policies regarding SP/CU releases. But they have moved to MUCH more rapid releases now with WASD being the first to receive all new stuff, and that stuff isn't necessarily getting SP-level testing and it is also taking up testing/QA resources that could be used for full-blown SP-level testing for new SPs for on-prem software builds. And all you really need to do is simply look at the gaps for each major build between SPs.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (5/24/2014)


    Like I said - I have no knowledge of their official stance or policies regarding SP/CU releases. But they have moved to MUCH more rapid releases now with WASD being the first to receive all new stuff, and that stuff isn't necessarily getting SP-level testing and it is also taking up testing/QA resources that could be used for full-blown SP-level testing for new SPs for on-prem software builds. And all you really need to do is simply look at the gaps for each major build between SPs.

    Then I would be more careful about giving such recommendations to others. While CUs may fix some SQL server issues, they are not extensively tested as Service Packs are (confirmed by some MS engineers when I used to work for HP). In fact, they can introduce bugs that you did not have.

    I personally would not install a CU unless I am experiencing a known issue that was specifically addressed on that CU, and after doing some testing on my Dev environment, of course.

    I had a coworker a couple of years ago that installed a CU for SQL 2008 R2, not looking to fix anything in particular just to stay patched, and ended with a CPU utilization issue that involved MS and some downtime.

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply