November 25, 2008 at 11:29 am
This is where I could use some Advanced expertise
Problem
Every 30~40 minutes, the SQL server will generate an error (in the error log) for memory paged out
It has happened for couple weeks, on this Production server
A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: 600 seconds. Working set (KB): 8271680, committed (KB): 20581080, memory utilization: 40%.
Notice below at 12:34 it resets to 0%, and 12:54 1% again (that's only 20 minutes duration)
and Memory Utilization % never goes beyond 51% ever.
Plus, there is still 13GB+ available Physical Memory to use
Notification_timeNotification_typeMemoryUtilization %TotalPhysicalMemory_MBAvailablePhysicalMemory_MB
2008-11-25 12:54:40.363RESOURCE_MEM_STEADY32457013221
2008-11-25 12:54:35.067RESOURCE_MEMPHYSICAL_LOW12457013191
2008-11-25 12:42:53.553RESOURCE_MEM_STEADY482457012854
2008-11-25 12:42:48.537RESOURCE_MEMPHYSICAL_LOW472457012868
2008-11-25 12:41:47.977RESOURCE_MEM_STEADY462457012902
2008-11-25 12:41:42.960RESOURCE_MEMPHYSICAL_LOW462457012926
2008-11-25 12:40:42.397RESOURCE_MEM_STEADY412457013490
2008-11-25 12:40:42.397RESOURCE_MEMPHYSICAL_HIGH412457013490
2008-11-25 12:40:37.367RESOURCE_MEMPHYSICAL_LOW412457013638
2008-11-25 12:39:36.803RESOURCE_MEM_STEADY342457014567
2008-11-25 12:39:31.790RESOURCE_MEMPHYSICAL_LOW342457014624
2008-11-25 12:38:31.227RESOURCE_MEM_STEADY322457014552
2008-11-25 12:38:31.227RESOURCE_MEMPHYSICAL_HIGH322457014552
2008-11-25 12:38:26.147RESOURCE_MEMPHYSICAL_LOW312457014502
2008-11-25 12:37:25.587RESOURCE_MEM_STEADY282457014229
2008-11-25 12:37:20.163RESOURCE_MEMPHYSICAL_LOW282457014146
2008-11-25 12:36:19.603RESOURCE_MEM_STEADY252457013296
2008-11-25 12:36:19.603RESOURCE_MEMPHYSICAL_HIGH252457013296
2008-11-25 12:36:14.570RESOURCE_MEMPHYSICAL_LOW252457013234
2008-11-25 12:35:14.010RESOURCE_MEM_STEADY152457013865
2008-11-25 12:35:08.913RESOURCE_MEMPHYSICAL_LOW152457013822
2008-11-25 12:34:08.353RESOURCE_MEM_STEADY122457013248
2008-11-25 12:34:03.337RESOURCE_MEMPHYSICAL_LOW02457013218
Error Logs
25/11/2008 13:00:08 A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: 333 seconds. Working set (KB): 5653268, committed (KB): 20581208, memory utilization: 27%.
25/11/2008 12:54:35 A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: 0 seconds. Working set (KB): 248476, committed (KB): 20588888, memory utilization: 1%.
25/11/2008 12:39:31 A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: 328 seconds. Working set (KB): 6949908, committed (KB): 20382232, memory utilization: 34%.
25/11/2008 12:34:03 A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: 0 seconds. Working set (KB): 98844, committed (KB): 20384152, memory utilization: 0%.
25/11/2008 12:07:53 A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: 327 seconds. Working set (KB): 7795496, committed (KB): 20382232, memory utilization: 38%.
Server
Microsoft SQL Server 2005 - 9.00.3042.00 (X64) Feb 10 2007 00:59:02
Standard Edition (64-bit) on Windows NT 5.2 (Build 3790: Service Pack 2)
Windows Server 2003 Enterprise x64 SP2
Intel Xeon E5310 @ 1.60 GHz
Processors 8
24GB Physical RAM
Memory 24571 (MB)
Min memory 19456 (MB)
Max memory 19456 (MB)
Advanced settings is set to "Background Service"
Total paging file size for all drives: 20480MB
Disk drive spaces are plenty for DATA, LOG, TempDB drives
Tried
new Broadcome NIC drivers
SSC
From Microsoft - How to reduce paging of buffer pool memory in the 64-bit version of SQL Server 2005
SELECT
CONVERT (varchar(30), GETDATE(), 121) as runtime,
--DATEADD (ms, -1 * ((sys.cpu_ticks / sys.cpu_ticks) - a.[Record Time]), GETDATE()) AS Notification_time,-- 2008
DATEADD (ms, -1 * ((sys.cpu_ticks / sys.cpu_ticks_in_ms) - a.[Record Time]), GETDATE()) AS Notification_time, -- 2005
a.* , sys.ms_ticks AS [Current Time]
FROM
(SELECT x.value('(//Record/ResourceMonitor/Notification)[1]', 'varchar(30)') AS [Notification_type],
x.value('(//Record/MemoryRecord/MemoryUtilization)[1]', 'bigint') AS [MemoryUtilization %],
x.value('(//Record/MemoryRecord/TotalPhysicalMemory)[1]', 'bigint')/1024 AS [TotalPhysicalMemory_MB],
x.value('(//Record/MemoryRecord/AvailablePhysicalMemory)[1]', 'bigint')/1024 AS [AvailablePhysicalMemory_MB],
x.value('(//Record/MemoryRecord/TotalPageFile)[1]', 'bigint')/1024 AS [TotalPageFile_MB],
x.value('(//Record/MemoryRecord/AvailablePageFile)[1]', 'bigint')/1024 AS [AvailablePageFile_MB],
x.value('(//Record/MemoryRecord/TotalVirtualAddressSpace)[1]', 'bigint')/1024 AS [TotalVirtualAddressSpace_MB],
x.value('(//Record/MemoryRecord/AvailableVirtualAddressSpace)[1]', 'bigint')/1024 AS [AvailableVirtualAddressSpace_MB],
x.value('(//Record/MemoryNode/@id)[1]', 'bigint') AS [Node Id],
x.value('(//Record/MemoryNode/ReservedMemory)[1]', 'bigint')/1024 AS [SQL_ReservedMemory_MB],
x.value('(//Record/MemoryNode/CommittedMemory)[1]', 'bigint')/1024 AS [SQL_CommittedMemory_MB],
x.value('(//Record/@id)[1]', 'bigint') AS [Record Id],
x.value('(//Record/@type)[1]', 'varchar(30)') AS [Type],
x.value('(//Record/ResourceMonitor/Indicators)[1]', 'bigint') AS [Indicators],
x.value('(//Record/@time)[1]', 'bigint') AS [Record Time]
FROM (SELECT CAST (record as xml) FROM sys.dm_os_ring_buffers
WHERE ring_buffer_type = 'RING_BUFFER_RESOURCE_MONITOR') AS R(x)) a
CROSS JOIN sys.dm_os_sys_info sys
ORDER BY 2 DESC-- a.[Record Time] ASC
Please see the Query results in the Excel attachment
Any help and insights are welcome!
I can't believe it's happening on a powerful server
I am reading the "Troubleshooting Performance Problems in SQL Server 2005" SQL Server Technical Article to find something... anything
November 25, 2008 at 11:55 am
Check the Total Working Sets counter in perfmon. See if the total working set of all processes decreases or if it remains steady when this happens.
What else is running on that server? You should not have min and max memory set the same. By doing that, SQL cannot release memory if the OS requests that memory be released. If that happens, the OS will have no choice but to page SQL's memory out to swap since the SQL engine is forbidden from releasing memory itself.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 25, 2008 at 1:15 pm
Thanks Gail
I think Total Working Sets dropped when it happened (_Total and sqlservr)
it was 7000000+, and now 1.3e+010 (sql: 1.2818e+010)
peaks were 2.2102e+010 and 2.0922e+010 respectively
I'll keep an eye on it and track it in SQLH2 to confirm it
but what would each option mean? [DECREASE vs STEADY]?
As for MIN/MAX memory, I wanted to change that in the first place
but being Production server I am hesitant, although MIN shouldn't affect it except in this paged out situation, correct?
November 25, 2008 at 1:32 pm
You can change min and max whilst your instance is online.
You'll see its actual impact after issuing a system checkpoint.
Johan
Learn to play, play to learn !
Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:
- How to post Performance Problems
- How to post data/code to get the best help[/url]
- How to prevent a sore throat after hours of presenting ppt
press F1 for solution, press shift+F1 for urgent solution 😀
Need a bit of Powershell? How about this
Who am I ? Sometimes this is me but most of the time this is me
November 25, 2008 at 2:13 pm
Let's see how it goes with MIN 5000 and MAX 17000
Initially it got even more frequent (with just MIN set). Then I decreased the MAX and it hasn't happened since for 10 mins
However sqlservr.exe memory is increasing steadily from 5GB to 7.5GB now, with not much activity (sp_who2 'active' returns 1~3 RUNNABLE only)
Process Explorer is showing ~10% CPU and 10GB Private Bytes for sqlservr.exe
25/11/2008 15:59:59 A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: 328 seconds. Working set (KB): 4546508, committed (KB): 8973064, memory utilization: 50%.
25/11/2008 15:54:31 A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: 0 seconds. Working set (KB): 5650908, committed (KB): 13963096, memory utilization: 40%.
25/11/2008 15:53:25 A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: 0 seconds. Working set (KB): 8430000, committed (KB): 17292376, memory utilization: 48%.
25/11/2008 15:50:03 A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: 328 seconds. Working set (KB): 8731856, committed (KB): 20386136, memory utilization: 42%.
25/11/2008 15:44:35 A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: 0 seconds. Working set (KB): 231864, committed (KB): 20417944, memory utilization: 1%.
25/11/2008 15:31:28 A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: 328 seconds. Working set (KB): 10109416, committed (KB): 20386200, memory utilization: 49%.
25/11/2008 15:26:00 A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: 0 seconds. Working set (KB): 45444, committed (KB): 20408024, memory utilization: 0%.
25/11/2008 15:01:00 A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: 331 seconds. Working set (KB): 6790740, committed (KB): 20385880, memory utilization: 33%.
November 26, 2008 at 7:26 am
Jerry Hung (11/25/2008)
Thanks GailI think Total Working Sets dropped when it happened (_Total and sqlservr)
it was 7000000+, and now 1.3e+010 (sql: 1.2818e+010)
That's not a decrease. 1.3e+010 is scientific notation for 13,000,000,000
I'll keep an eye on it and track it in SQLH2 to confirm it
but what would each option mean? [DECREASE vs STEADY]?
Steady means some other process is using working set and resulted in memory pressure. A sharp decease is usually indicative of a driver problem.
As for MIN/MAX memory, I wanted to change that in the first place
but being Production server I am hesitant, although MIN shouldn't affect it except in this paged out situation, correct?
Change it. It is not recommended that the min and the max be the same. By doing so you have effectively disabled some of SQL's memory management options.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 26, 2008 at 7:38 am
GilaMonster (11/26/2008)
Jerry Hung (11/25/2008)
Thanks GailI think Total Working Sets dropped when it happened (_Total and sqlservr)
it was 7000000+, and now 1.3e+010 (sql: 1.2818e+010)
That's not a decrease. 1.3e+010 is scientific notation for 13,000,000,000
Sorry, I meant sqlservr working set DROPPED significantly when the paged out happens (no surprise), say to 7000000
and then slowly climbing up (say to 1.3e+010), until the next problem
Sigh, I thought this could work, then more bad news this morning
Although it's happening "less" frequently
So I changed the MIN/MAX yesterday 4 pm, MemoryUtilization went to 92% woo-hoo!
Then didn't happen until 6:50 pm, and continues to happen until now
Also saw a big Buffer Cache Hit Ratio drop around 9 am (dropped from 99.xx to 28.xx)
Maybe we do need more memory?
Notification_timeNotification_typeMemoryUtilization %AvailablePhysicalMemory_MB
2008-11-26 09:10:56.713RESOURCE_MEMPHYSICAL_HIGH5217984
2008-11-26 09:10:52.713RESOURCE_MEM_STEADY5217360
2008-11-26 09:10:50.853RESOURCE_MEMPHYSICAL_LOW4416927
2008-11-26 09:09:50.290RESOURCE_MEM_STEADY718433
2008-11-26 09:09:45.230RESOURCE_MEMPHYSICAL_LOW216477
2008-11-26 06:30:01.253RESOURCE_MEM_STEADY8915013
2008-11-26 06:30:00.893RESOURCE_MEMPHYSICAL_LOW8915002
2008-11-26 06:09:40.810RESOURCE_MEM_STEADY2421211
2008-11-26 06:09:35.603RESOURCE_MEMPHYSICAL_LOW2421077
2008-11-26 06:08:35.043RESOURCE_MEM_STEADY2021052
2008-11-26 06:08:29.153RESOURCE_MEMPHYSICAL_LOW1719936
2008-11-26 06:07:28.590RESOURCE_MEM_STEADY718878
2008-11-26 06:07:23.593RESOURCE_MEMPHYSICAL_LOW016961
2008-11-26 04:50:09.730RESOURCE_MEM_STEADY2217372
2008-11-26 04:50:04.183RESOURCE_MEMPHYSICAL_LOW516132
2008-11-26 04:00:01.300RESOURCE_MEM_STEADY9215069
2008-11-26 04:00:01.300RESOURCE_MEMPHYSICAL_HIGH9215069
2008-11-26 04:00:00.503RESOURCE_MEMPHYSICAL_LOW9215052
2008-11-26 03:42:46.303RESOURCE_MEMPHYSICAL_HIGH5119690
2008-11-26 03:42:45.313RESOURCE_MEM_STEADY5119696
2008-11-26 03:42:40.430RESOURCE_MEMPHYSICAL_LOW4818966
2008-11-26 03:41:39.867RESOURCE_MEM_STEADY4518002
2008-11-26 03:41:34.537RESOURCE_MEMPHYSICAL_LOW3817432
2008-11-26 03:40:33.990RESOURCE_MEM_STEADY2117389
2008-11-26 03:40:28.900RESOURCE_MEMPHYSICAL_LOW416890
2008-11-26 02:19:13.633RESOURCE_MEM_STEADY4120224
2008-11-26 02:19:08.350RESOURCE_MEMPHYSICAL_LOW4519796
2008-11-26 02:18:07.787RESOURCE_MEM_STEADY4020267
2008-11-26 02:18:02.773RESOURCE_MEMPHYSICAL_LOW4619622
2008-11-26 02:17:02.210RESOURCE_MEM_STEADY4020308
2008-11-26 02:16:57.193RESOURCE_MEMPHYSICAL_LOW4419905
2008-11-26 02:15:56.633RESOURCE_MEM_STEADY3920348
2008-11-26 02:15:51.553RESOURCE_MEMPHYSICAL_LOW3719907
2008-11-26 02:14:50.993RESOURCE_MEM_STEADY821223
2008-11-26 02:14:45.977RESOURCE_MEMPHYSICAL_LOW1419435
2008-11-26 02:13:45.413RESOURCE_MEM_STEADY618424
2008-11-26 02:13:39.803RESOURCE_MEMPHYSICAL_LOW116281
2008-11-25 22:00:03.523RESOURCE_MEMPHYSICAL_HIGH7017802
2008-11-25 21:59:58.523RESOURCE_MEM_STEADY6218369
2008-11-25 21:59:58.257RESOURCE_MEMPHYSICAL_LOW6218278
2008-11-25 21:54:35.150RESOURCE_MEM_STEADY2920815
2008-11-25 21:54:30.117RESOURCE_MEMPHYSICAL_LOW3220577
2008-11-25 21:53:29.553RESOURCE_MEM_STEADY2820434
2008-11-25 21:53:24.540RESOURCE_MEMPHYSICAL_LOW2519538
2008-11-25 21:52:23.977RESOURCE_MEM_STEADY1818616
2008-11-25 21:52:18.930RESOURCE_MEMPHYSICAL_LOW1717104
2008-11-25 21:51:18.370RESOURCE_MEM_STEADY1016659
2008-11-25 21:51:13.010RESOURCE_MEMPHYSICAL_LOW115178
2008-11-25 19:22:50.347RESOURCE_MEM_STEADY4813499
2008-11-25 19:22:45.333RESOURCE_MEMPHYSICAL_LOW4911664
2008-11-25 19:21:44.770RESOURCE_MEM_STEADY710708
2008-11-25 19:21:39.370RESOURCE_MEMPHYSICAL_LOW08528
2008-11-25 19:16:55.880RESOURCE_MEM_STEADY4811729
2008-11-25 19:16:50.863RESOURCE_MEMPHYSICAL_LOW449250
2008-11-25 19:15:50.303RESOURCE_MEM_STEADY359860
2008-11-25 19:15:45.177RESOURCE_MEMPHYSICAL_LOW308194
2008-11-25 19:14:44.613RESOURCE_MEM_STEADY510870
2008-11-25 19:14:39.430RESOURCE_MEMPHYSICAL_LOW18476
2008-11-25 18:52:01.823RESOURCE_MEM_STEADY4918998
2008-11-25 18:51:56.790RESOURCE_MEMPHYSICAL_LOW5018615
2008-11-25 18:50:56.230RESOURCE_MEM_STEADY4319240
2008-11-25 18:50:51.150RESOURCE_MEMPHYSICAL_LOW4418066
2008-11-25 18:49:50.590RESOURCE_MEM_STEADY1319490
2008-11-25 18:49:45.167RESOURCE_MEMPHYSICAL_LOW217717
2008-11-25 18:25:01.397RESOURCE_MEMPHYSICAL_HIGH9212214
2008-11-25 18:25:01.393RESOURCE_MEM_STEADY9212214
2008-11-25 18:25:00.893RESOURCE_MEMPHYSICAL_LOW9212211
2008-11-25 16:00:00.893RESOURCE_MEM_STEADY5117974
2008-11-25 16:00:00.893RESOURCE_MEMPHYSICAL_HIGH5117974
Working Set of sqlserv (_Total shows the same pattern)
November 26, 2008 at 7:58 am
Hi,
Are you saying that you have around 10 GB Available MB in the OS at the time when the SQL Server working set gets trimmed? That's very odd.
Make sure you are not running into any of the known problems; Bob Dorr has written a blog post which dissects this error and also lists the fixes you want to make sure you have.
http://blogs.msdn.com/psssql/archive/2007/05/31/the-sql-server-working-set-message.aspx.
Make sure you don't have the "Lock Pages in Memory" user right set for the SQL Server startup account.
/Elisabeth
elisabeth@sqlserverland.com
MCITP | MCT
http://sqlblog.com/blogs/elisabeth_redei/
http://linkedin.com/in/elisabethredei
November 26, 2008 at 7:59 am
Hi,
Are you saying that you have around 10 GB Available MB in the OS at the time when the SQL Server working set gets trimmed? That's very odd.
Make sure you are not running into any of the known problems; Bob Dorr has written a blog post which dissects this error and also lists the fixes you want to make sure you have.
http://blogs.msdn.com/psssql/archive/2007/05/31/the-sql-server-working-set-message.aspx.
Make sure you don't have the "Lock Pages in Memory" user right set for the SQL Server startup account.
/Elisabeth
elisabeth@sqlserverland.com
MCITP | MCT
http://sqlblog.com/blogs/elisabeth_redei/
http://linkedin.com/in/elisabethredei
November 26, 2008 at 8:07 am
I'm not asking about SQL's working set. The message says that's getting reduced. I asked about the total working set of all processes on the server. If that's showing a steep drop, then you have a driver issue.
There's a bug in windows where some driver memory requests can lead to a sharp trim in all process' working sets. I'll try and find a reference for you.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 26, 2008 at 8:12 am
Elisabeth Rédei (11/26/2008)
Make sure you don't have the "Lock Pages in Memory" user right set for the SQL Server startup account.
Standard edition ignores that setting.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 26, 2008 at 8:13 am
I know about the Broadcom Gigabit NIC driver, and it's updated as well (I'll re-check it)
We have
2 Broadcom BCM5708C NetXtreme II GigE NIC
2 Intel PRO/1000 PT Dual Port
and this "Primary Team" NIC.... which I think it's teaming for 2 NIC's but I don't know if SQL is using that
And being a Standard Edition, I don't think "Lock Pages In Memory" comes into effect here
I'll read that PSS article....
Pictures are worth thousand words
November 26, 2008 at 8:18 am
Yeah, that looks like the 'falling off a cliff' behaviour that Bob Ward described. I'll hunt for an article and drop him a mail if I can't find it.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 26, 2008 at 10:02 am
Just heard back and that does look like a driver issue. The Broadcom is known to give issues, no ideas about the others. MS won't write articles about other people's drivers (for good reason). Maybe search around see if you can find any other people with the same issue with those drivers.
Unfortunately this is not a SQL problem, it's a windows issue, and nothing can be done on the SQL side. I hear that this is fixed in Windows Server 2008, so if upgrading is an option, you may suggest that to the bosses.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 26, 2008 at 11:28 am
Thanks for the help Gail
Yeah, "fortunately" we are moving away from this server (it's being migrated)
Otherwise I don't know how easy it is to convince others to go on a "bug hunt" on a Production server
2 Broadcom NICs are teamed together, with driver from Feb 2008 (new enough I'd think)
Windows Update does want to install 39 High-priority updates..... wonder if these will magically fix the problem 😛
Viewing 15 posts - 1 through 15 (of 19 total)
You must be logged in to reply to this topic. Login to reply