December 29, 2008 at 1:13 pm
My mind is mush after reading numerous papers on monitoring memory. Most of what I found was in regards to what DMVs and perfmon/sysmon counters to check, but very little information on what values are considered bad. So I ask you, what do you check when you suspect a memory bottleneck and does this apply to 64-bit OS? I've always concentrated on paging, but I know that is too simplistic.
Thanks, Dave
December 29, 2008 at 9:09 pm
Dave,
First, what is your servers hardware and software configuration? I see you say it is x64, but how much memory does it have, is it a dedicated SQL server, what version OS is running on it? Do you have the Max Server Memory set for SQL?
Then what is making you think that you have a memory bottleneck? The counter values for x64 are basically the same for x86 and IA64 as to what you should be monitoring and what should be considered acceptable. Have a look at the PAL tool on codeplex:
It can be used to read your counter sets and point out possible problem areas. There is a SQL Server specific template that has the counters and threshold limits defined in it. This was developed by Microsoft so I trust the values contained in its defaults as good baseline numbers.
Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008
My Blog | Twitter | MVP Profile
Training | Consulting | Become a SQLskills Insider
Troubleshooting SQL Server: A Guide for Accidental DBAs[/url]
December 30, 2008 at 2:25 pm
SQL Server is clustered. Each server contains (4) dual core CPUs with 32GB of memory and a GBit NIC. The OS is Windows Server 2003 EE SP2. Pagefile is 10GB. The server is dedicated to SQL, however it does contain the following common applications.
TREND - Anti-virus (not real-time)
CSA - Cisco Security Agent
Double-Take - Replication to Disaster Recovery server
Min Server Memory: 2GB
Max Server Memory: 26GB
I believe a memory bottleneck may exist because there are a lot of sustained page reads and writes against the pagefile. I will take a look at the PAL tool, but I'm still curious what perfmon/sysmon counters I should concentrate on and more importantly what values indicate a memory bottleneck.
In regards to the Pagefile I've always believed you don't need a large pagefile, because with SQL Server if you are seeing a lot of paging and your pagefile grows to several GB you most likely need to add memory. I've had Microsoft engineer's tell me the same thing indicating the only reason I would want a large pagefile is to capture a memory dump. However, I did find KB 889654 which has the following quote.
Note If you are running Microsoft SQL Server together with MSDTC (COM+), the pagefile should be at least 1 percent larger than how much RAM is installed in the system. For example, when you are using 32 GB of RAM, the pagefile should be at least 32.32 GB of RAM.
Thanks, Dave
December 30, 2008 at 2:46 pm
What is the average value of the Memory: Available MB's on the server? If it is higher than 150MB then I don't think that you have a Memory bottleneck so much, but that is not to say that you don't have a problem necessarily. I'd have to ask some other MVP's regarding the page file activity that you are seeing specifically to find out more information as this isn't an area that I am overly familiar with, especially with clustering involved. My page files are always 1.5 times the physical memory as configured by our Server Team, I don't change what they set on my servers. Someone else on here probably can answer this as well, but in case, I'll send an email to a few people and see what they say about it.
Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008
My Blog | Twitter | MVP Profile
Training | Consulting | Become a SQLskills Insider
Troubleshooting SQL Server: A Guide for Accidental DBAs[/url]
December 30, 2008 at 2:48 pm
One other thing:
Is there a specific performance issue that you are having, or are you looking to understand why the counters for the page file are what they are information wise?
Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008
My Blog | Twitter | MVP Profile
Training | Consulting | Become a SQLskills Insider
Troubleshooting SQL Server: A Guide for Accidental DBAs[/url]
December 30, 2008 at 3:49 pm
Available MBytes is 2,622. As I am typing this the pagefile use showed Page Ins up to 10k, although these are spikes that last typically a few seconds and then spike again after a few seconds. Pages found in RAM is currentlly 65% and Read Hit Ratio is 0%.
There are loads of performance problems we are troubleshooting, most caused by poor code. The environment won't be live for another 6 weeks, but I see constant CXPACKET and PageLatch wait times. High disk queuing and Disk Sec/write, indicating a disk bottleneck on our SAN and lately I have seen constant read/write to the pagefile.
December 30, 2008 at 4:00 pm
DBADave (12/30/2008)
TREND - Anti-virus (not real-time)CSA - Cisco Security Agent
Double-Take - Replication to Disaster Recovery server
great, bound to be a few memory leak issues with that little lot installed.
you dont mention the version of sql server, IIRC the problems you are experiencing are common with SQL2005 Standard x64
-----------------------------------------------------------------------------------------------------------
"Ya can't make an omelette without breaking just a few eggs" 😉
December 30, 2008 at 4:13 pm
Dave,
When identifying memory as a bottleneck, I'd first want to confirm that memory pressure is occurring within the SQL Server engine. Memory pressure is when SQL Server needs to contend with one or more processes for system memory.
A couple performance monitor counters I use when trying to identify memory pressure are located within SQLServer:Buffer Manager.
Buffer cache hit ratio - This is the percentage of requested pages found in memory without having to incur a read from disk. You want this to be above 95%, and in most cases in the 98-99.9% area.
Page Life Expectancy - This is how long a page remains in memory before getting paged out. The longer the better. If this value is very low it can indicate memory pressure. Ideally you want this value to be above 300 (5 minutes).
Both of these counters having low values indicates memory pressure on the system, which is commonly caused by not having enough memory, other applications consuming memory, or an incorrect SQL Server memory setting.
Additionally, what edition of SQL Server are you using?
December 30, 2008 at 5:09 pm
I'm running SQL 2005 Enterprise - 64bit - SP2
December 30, 2008 at 11:04 pm
DBADave (12/29/2008)
My mind is mush after reading numerous papers on monitoring memory. Most of what I found was in regards to what DMVs and perfmon/sysmon counters to check, but very little information on what values are considered bad. So I ask you, what do you check when you suspect a memory bottleneck and does this apply to 64-bit OS? I've always concentrated on paging, but I know that is too simplistic.Thanks, Dave
use following script :
DBCC MEMORYSTATUS
------------------
--Host wise
select hostname,count(spid) as Total_Connection,sum(isnull(memusage,0)) as MemUsage from sys.sysprocesses
group by hostname order by count(spid) desc
----------
--Database wise
select db_name(dbid) as DatabaseName,count(spid) as Total_Connection,sum(isnull(memusage,0)) as MemUsage from sys.sysprocesses
group by db_name(dbid) order by count(spid) desc
--------
select
a.name,
connections = (select
count(*)
from
master..sysprocesses b
where
a.dbid = b.dbid),
blocked_users = (select
count(*)
from
master..sysprocesses b
where
a.dbid = b.dbid and
blocked <> 0),
total_memory = isnull((select sum(memusage)
from
master..sysprocesses b
where
a.dbid = b.dbid),0),
total_io = isnull((select sum(physical_io)
from
master..sysprocesses b
where
a.dbid = b.dbid),0),
total_cpu = isnull((select sum(cpu)
from
master..sysprocesses b
where
a.dbid = b.dbid),0),
total_waittime = isnull((select sum(waittime)
from
master..sysprocesses b
where
a.dbid = b.dbid),0),
dbccs = isnull((select count(*)
from
master..sysprocesses b
where
a.dbid = b.dbid and
upper(b.cmd) like '%DBCC%'),0),
bcp_running = isnull((select count(*)
from
master..sysprocesses b
where
a.dbid = b.dbid and
upper(b.cmd) like '%BCP%'),0),
backup_restore_running = isnull((select count(*)
from
master..sysprocesses b
where
a.dbid = b.dbid and
upper(b.cmd) like '%BACKUP%' or
upper(b.cmd) like '%RESTORE%'),0)
from
master.dbo.sysdatabases a
-----------------------------------
select DB = a.instance_name,
'DBCC Logical Scans' = a.cntr_value,
'Transactions/sec' = (select d.cntr_value
from
master..sysperfinfo d
where
d.object_name = a.object_name and
d.instance_name = a.instance_name and
d.counter_name = 'Transactions/sec'),
'Active Transactions' = (select case when i.cntr_value < 0 then 0
else i.cntr_value end
from
master..sysperfinfo i
where
i.object_name = a.object_name and
i.instance_name = a.instance_name and
i.counter_name = 'Active Transactions'),
'Bulk Copy Rows' = (select b.cntr_value
from
master..sysperfinfo b
where
b.object_name = a.object_name and
b.instance_name = a.instance_name and
b.counter_name = 'Bulk Copy Rows/sec'),
'Bulk Copy Throughput'= (select c.cntr_value
from
master..sysperfinfo c
where
c.object_name = a.object_name and
c.instance_name = a.instance_name and
c.counter_name = 'Bulk Copy Throughput/sec'),
'Log Cache Reads' = (select e.cntr_value
from
master..sysperfinfo e
where
e.object_name = a.object_name and
e.instance_name = a.instance_name and
e.counter_name = 'Log Cache Reads/sec'),
'Log Flushes' = (select f.cntr_value
from
master..sysperfinfo f
where
f.object_name = a.object_name and
f.instance_name = a.instance_name and
f.counter_name = 'Log Flushes/sec'),
'Log Growths' = (select g.cntr_value
from
master..sysperfinfo g
where
g.object_name = a.object_name and
g.instance_name = a.instance_name and
g.counter_name = 'Log Growths'),
'Log Shrinks' = (select h.cntr_value
from
master..sysperfinfo h
where
h.object_name = a.object_name and
h.instance_name = a.instance_name and
h.counter_name = 'Log Shrinks')
from
master..sysperfinfo a
where
a.object_name like '%Databases%'
December 30, 2008 at 11:13 pm
Thanks Paresh, but what values are considered bad and point to a memory bottleneck.
Dave
December 31, 2008 at 9:17 am
A couple of observations. That KB article is a bit strange. A well-tuned SQL Server should not page significantly. You may see some ticks in the paging counters, but that is because the backup APIs look like paging to the OS. Try it sometime and you will see the paging counters increment while a SQL backup runs.
Strictly speaking, you can build a cluster without MSDTC, but you won't get very far. It is almost impossible to install a SQL service pack without MSDTC, so everyone just adds it in anyway.
Note that Windows 2008 has some new options on MSDTC and is a much better platform for clustering SQL than Windows 2003.
Personally, I point to DoubleTake. Even though the block-level replication system used by DoubleTake is asynchronous, you still have extra I/O operations and of course memory pressure.
So, on to solutions. A larger paging file won't help. Paging cached data is self-defeating. You may need to lower the upper memory boundary on SQL an d see if that helps. Keep an eye on Page Life Expectancy so you don't squeeze cache too much. Consider rebuilding the cluster on Windows 2008 as it handles complex, large memory configurations much better than Windows 2003. Note that you cannot migrate a Windows 2003 cluster to Windows 2008 in-place. There are too many differences.
December 31, 2008 at 9:31 am
I also suspect Double-Take and have been lowering the max memory, but I'm still hopeful to find some memory counters that I can look at on a regular basis to assess memory performance. If you have any counter recommendations along with their corresponding good and bad values that would be appreciated.
Thanks
April 20, 2009 at 1:57 am
"but I see constant CXPACKET and PageLatch wait times"
...check the paralellism settings. Maybe think about changing it to '1'.
CXPACKET is an indication that the server is waiting for queries that have been broken up to run in parallel on different CPUs/Cores
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply