August 20, 2009 at 10:45 am
Hi,
We have an A/A/P (active/active/passive) cluster setup. Whenever a failover occurs, I notice the message below in the error log.
A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: 0 seconds. Working set (KB): 38860, committed (KB): 105728, memory utilization: 36%.
I am also getting the alarm below from the Spotlight monitoring tool:
Time Connection Action Details Severity Alarm
08/19/2009 2:11:10 PM ABC Alarm raised The buffer cache page life expectancy is 10 seconds. High Page Life Expectancy Alarm
Please advise.
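The numbers in the error-log message above can be checked with a few lines of Python. This is only a sketch: the regular expression is written against the exact wording quoted in this post, and reading "memory utilization" as working set divided by committed memory (truncated to a whole percent) is an assumption that happens to match the figures here.

```python
import re

# Error-log text as quoted in the post above.
MESSAGE = (
    "A significant part of sql server process memory has been paged out. "
    "This may result in a performance degradation. Duration: 0 seconds. "
    "Working set (KB): 38860, committed (KB): 105728, memory utilization: 36%."
)

PATTERN = re.compile(
    r"Working set \(KB\): (\d+), committed \(KB\): (\d+), "
    r"memory utilization: (\d+)%"
)


def parse_paged_out(message):
    """Return (working_set_kb, committed_kb, reported_pct), or None if no match."""
    match = PATTERN.search(message)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


working_set_kb, committed_kb, reported_pct = parse_paged_out(MESSAGE)
# Assumption: utilization = working set / committed, truncated to a whole percent.
computed_pct = working_set_kb * 100 // committed_kb
print(working_set_kb, committed_kb, reported_pct, computed_pct)
```

Both values are small (roughly 38 MB resident out of about 103 MB committed), which would be consistent with an instance that has only just started on the new node.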
August 20, 2009 at 11:14 am
How many SQL Instances are we talking and what nodes are you failing to/from?
Your PLE (page life expectancy) goes down because the OS is taking memory back from SQL Server.
This happens when the OS experiences memory pressure regardless of whether or not "Lock Pages in Memory" is set.
I think you need more memory to support the specific failover pattern that causes the issue.
~BOT
Craig Outcalt
August 20, 2009 at 11:37 am
Thank you,
How many SQL Instances are we talking and what nodes are you failing to/from?
Node 1 (Active):
Memory: 16 GB
No. of instances: 1 (SQL Server 2005 EE x64 with SP3)
Max memory set to: 12 GB, with 4 GB left for the OS
Min memory set to: default
Lock pages in memory: not set
Node 2 (Passive):
Memory: 16 GB
Node 3 (Active):
Memory: 16 GB
No. of instances: 4 (SQL Server 2005 EE x64 with SP3)
Max memory set to: 3 GB for each instance, with 4 GB left for the OS
Min memory set to: default
Lock pages in memory: not set
When Node 1 fails over to Node 2, or Node 2 fails over to Node 1, we get the messages below:
A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: 0 seconds. Working set (KB): 38860, committed (KB): 105728, memory utilization: 36%.
I am also getting the alarm below from the Spotlight monitoring tool:
Time Connection Action Details Severity Alarm
08/19/2009 2:11:10 PM ABC Alarm raised The buffer cache page life expectancy is 10 seconds. High Page Life Expectancy Alarm
Thank you
kln
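The node configuration listed in this post can be turned into a quick memory-budget check for the passive node. This is a rough sketch using only the figures given above (16 GB per node, 4 GB left for the OS, and the per-instance max memory settings). The function name and the idea of summing max memory per failover scenario are illustrative assumptions, and real usage also includes memory outside the buffer pool.

```python
# Figures taken from the post: physical RAM per node, the OS reserve,
# and the "max server memory" setting (GB) of each instance on each node.
NODE_RAM_GB = 16
OS_RESERVE_GB = 4
MAX_MEMORY_GB = {
    "node1": [12],
    "node3": [3, 3, 3, 3],
}


def headroom_after_failover(failed_nodes):
    """GB left on the passive node once the given nodes have failed over to it.

    A negative result means the configured max memory plus the OS reserve
    exceeds the physical RAM of the target node.
    """
    sql_total = sum(sum(MAX_MEMORY_GB[node]) for node in failed_nodes)
    return NODE_RAM_GB - OS_RESERVE_GB - sql_total


for scenario in (["node1"], ["node3"], ["node1", "node3"]):
    print(scenario, headroom_after_failover(scenario))
```

With these settings a single-node failover leaves no headroom beyond the 4 GB OS reserve, and both active nodes landing on Node 2 at once would be 12 GB over budget.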
August 20, 2009 at 11:45 am
I don't understand how node 2 can be failed over.
Did you mean when nodes 1 or 3 fail to node 2?
Also... is anything else running on node2 when it gets the errors?
~BOT
Craig Outcalt
August 20, 2009 at 1:51 pm
I have also seen these errors while moving resources to different nodes.
August 20, 2009 at 1:54 pm
OK, we had a hardware issue on Node 1 and all the resources were moved to Node 2. I then got the page life expectancy alarm from Spotlight and the "memory has been paged out" message in the error log. After the hardware issue was fixed, I moved the resources back to Node 1 and again got the same alarm and the same paged-out message in the error log while moving the resources.
Thanks
August 20, 2009 at 2:44 pm
I would set the Lock Pages in Memory policy for the SQL Server service accounts and configure the min and max memory settings.
-----------------------------------------------------------------------------------------------------------
"Ya can't make an omelette without breaking just a few eggs" 😉
August 20, 2009 at 3:00 pm
Besides Lock Pages in Memory, you should look at the OS to see what is stealing memory from SQL Server.
From what I understand, the SQLOS memory clerks will release memory to the OS from the SQL internal memory when its external memory areas are under pressure (VAS, worker threads, MPA, etc.), regardless of the 'lock pages in memory' setting.
So when the OS is under memory pressure, the SQL external memory areas are under pressure. The SQLOS memory clerks will then release internal memory back to the OS. The OS will give that memory to whatever asks for it. Maybe SQLOS, maybe SQL internal, maybe Windows Media Player.
The OS isn't taking it, though... SQLOS is giving it. Make sense?
~BOT
Craig Outcalt
August 20, 2009 at 3:04 pm
Is an antivirus client installed and scanning the SQL Server folders, by any chance?
-----------------------------------------------------------------------------------------------------------
"Ya can't make an omelette without breaking just a few eggs" 😉
August 20, 2009 at 3:22 pm
We have antivirus installed on all nodes, though Microsoft recommends not installing antivirus on clusters. How can I prove that antivirus, or something else at the OS level, is causing the memory pressure?
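One low-tech way to gather evidence for the question above is to capture `tasklist /fo csv` output on the node around the time of a failover and rank processes by memory. The sketch below assumes the English-locale column names ("Image Name", "Mem Usage"). The sample rows, including the process name scanner.exe, are made up for illustration and are not data from this cluster.

```python
import csv
import io


def top_memory_processes(tasklist_csv, limit=5):
    """Rank processes by the "Mem Usage" column of `tasklist /fo csv` output."""
    rows = csv.DictReader(io.StringIO(tasklist_csv))
    usage = []
    for row in rows:
        # "Mem Usage" looks like "105,728 K"; keep only the digits.
        kb = int("".join(ch for ch in row["Mem Usage"] if ch.isdigit()))
        usage.append((row["Image Name"], kb))
    return sorted(usage, key=lambda item: item[1], reverse=True)[:limit]


# Hypothetical sample for illustration only; not real measurements.
SAMPLE = '''"Image Name","PID","Session Name","Session#","Mem Usage"
"sqlservr.exe","1001","Services","0","105,728 K"
"scanner.exe","1002","Services","0","850,000 K"
"svchost.exe","1003","Services","0","20,480 K"
'''

for name, kb in top_memory_processes(SAMPLE, limit=3):
    print(name, kb)
```

Comparing snapshots taken before and after a failover would show whether a non-SQL process is growing while the SQL Server working set shrinks.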
August 20, 2009 at 4:06 pm
What antivirus software do you have installed?
-----------------------------------------------------------------------------------------------------------
"Ya can't make an omelette without breaking just a few eggs" 😉
August 20, 2009 at 4:38 pm
The antivirus software is McAfee.
August 21, 2009 at 11:40 am
Yeah, in my experience McAfee software has some nasty memory leak issues. Get your AV version and check the McAfee site for any fixes. Also, stop the AV from scanning the SQL Server data file and exe locations.
-----------------------------------------------------------------------------------------------------------
"Ya can't make an omelette without breaking just a few eggs" 😉