February 21, 2017 at 9:46 am
Hello All,
Our database server is currently having a strange memory issue.
Our database is about 800G, memory is 120G, and it is a very busy . Before, our buffer cache can went up and stayed around 90G, and PLE can be up to 3000 minutes.
Started from last week, the buffer cache went down between 16G to 40G, PLE went to only 40 minutes, all the others are free pages in sql memory
Bellow is a graph on spotlight shows current memory status, about 43G is buffer cache, but the free pages(yellow line) is 65G.
My question is : Why free pages is so high while database buffer cache and PLE is very low?
One thing had changed since last is I started a daily job to move some some data to archive database then delete those data from production, the job is run at midnight, and runs for about 2 hours every day. Could that be the problem and why?
Please help!!!
February 21, 2017 at 10:00 am
1) You posted this on SQL 2012 forum, but I thought Free Pages went away in that version
2) How is your IO amount and performance compared to before this started happening?
3) Are users complaining or things not happening in an acceptable (or roughly the same) time?
4) Are you on a virtualized server?
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
February 21, 2017 at 10:50 am
Hi Kevin,
Thanks for the quick reply. To answer your question:
1) You posted this on SQL 2012 forum, but I thought Free Pages went away in that version
-----I think you are right. Then question is: the resource manager shows our 128G memory is all full, 6G is for OS, Buffer cache is 44.6G, procedure cache is 7.7G, what does the rest 76G used for?
2) How is your IO amount and performance compared to before this started happening?
--- Our disk IO wait time is much higher than before.
3) Are users complaining or things not happening in an acceptable (or roughly the same) time?
--- Yes, web pages ars slower, more slow queries in our log. And when the buffer was only 16G last Friday, our applications complains have many queries timed out.
4) Are you on a virtualized server?
--- We are on a physical server with 16 CPU and 128G memory
February 21, 2017 at 11:07 am
graciez - Tuesday, February 21, 2017 10:50 AMHi Kevin,Thanks for the quick reply. To answer your question:
1) You posted this on SQL 2012 forum, but I thought Free Pages went away in that version
-----I think you are right. Then question is: the resource manager shows our 128G memory is all full, 6G is for OS, Buffer cache is 44.6G, procedure cache is 7.7G, what does the rest 76G used for?2) How is your IO amount and performance compared to before this started happening?
--- Our disk IO wait time is much higher than before.3) Are users complaining or things not happening in an acceptable (or roughly the same) time?
--- Yes, web pages ars slower, more slow queries in our log. And when the buffer was only 16G last Friday, our applications complains have many queries timed out.4) Are you on a virtualized server?
--- We are on a physical server with 16 CPU and 128G memory
A) I have no idea where your numbers are coming from. Try out dbcc memorystatus. You can find documentation online about it's output. It is helpful to SET NOCOUNT ON first and output to text.
B) What changed? ANYTHING other than your moving data to an archive database? Do users access that at all? If so it is properly indexed? That's a common mistake I see...
C) What else is running on the server and how much memory are those things taking?
D) What build are you on? SQL Server 2012 got a completely rewritten memory management system and there were numerous bugs that were later patched.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
February 21, 2017 at 11:23 am
Thanks Kevin,
A) DBCC MEMORYSTATUS:
Process/System Counts Value
---------------------------------------- --------------------
Available Physical Memory 373526528
Available Virtual Memory 8209852002304
Available Paging File 130824531968
Working Set 131080753152
Percent of Committed Memory in WS 100
Page Faults 2170864460
System physical memory high 1
System physical memory low 0
Process physical memory low 0
Process virtual memory low 0
Memory Manager KB
---------------------------------------- -----------
VM Reserved 571738128
VM Committed 127745612
Locked Pages Allocated 0
Large Pages Allocated 0
Emergency Memory 1024
Emergency Memory In Use 16
Target Committed 127748472
Current Committed 127745616
Pages Allocated 58964568
Pages Reserved 0
Pages Free 63667928
Pages In Use 15877400
Page Alloc Potential 105652560
NUMA Growth Phase 2
Last OOM Factor 0
Last OS Error 0
Memory node Id = 0 KB
---------------------------------------- -----------
VM Reserved 571730960
VM Committed 63874220
Locked Pages Allocated 0
Pages Allocated 45770952
Pages Free 14661696
Target Committed 63874224
Current Committed 63874224
Foreign Committed 1435912
Away Committed 0
Taken Away Committed 0
Memory node Id = 1 KB
---------------------------------------- -----------
VM Reserved 7104
VM Committed 63871372
Locked Pages Allocated 0
Pages Allocated 13193624
Pages Free 49006232
Target Committed 63874224
Current Committed 63871376
Foreign Committed 495180
Away Committed 0
Taken Away Committed 0
B) What changed? ANYTHING other than your moving data to an archive database? Do users access that at all? If so it is properly indexed? That's a common mistake I see...
----No, only the moving data script. Users barley read the archived data and archived data is also indexed link production.
C) What else is running on the server and how much memory are those things taking?
--- Only SQL server is runninng on the server.
D) What build are you on? SQL Server 2012 got a completely rewritten memory management system and there were numerous bugs that were later patched.
Our version is :
--- Enterprise Edition: Core-based Licensing (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1)
February 21, 2017 at 11:47 am
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
February 21, 2017 at 11:54 am
There's a lot more to dbcc memorystatus than what you posted. What did your review of the documentation online reveal?
Also review this:
And go get Glenn Berry's SQL Server Diagnostic Scripts. Lots of Awesomesauce in it.
BTW, you didn't give the build of sql server. You gave the build of the OS. And I will never understand people paying for Enterprise Edition of SQL Server and hobbling it with a paltry 128GB of RAM.
Have you tried rebooting the server? If you have a memory leak (there were several identified in SQL 2012 over the years) that should get you functional again until you can test out upgrading to the most recent build.
Given that this has been going on for days and significant performance problems exist (including user queries timing out), I STRONGLY recommend you hire a performance tuning consultant for a quick review, especially if a reboot didn't help or you cannot do one soon.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
February 21, 2017 at 11:57 am
The archiving is something new. It moves one days of data from production to archive db to keep production only contain 180 days data.
After archiving job completed every night, I run sp_updatestats to update statistics, should that fix the statistics? If not, what can I do with it?
Also, most of our queries is query most recent data(few weeks the most), they all have date range in where clause.
I don't understand who archiving old data can affect low memory usage?
February 21, 2017 at 12:06 pm
OK, so sp_updatestats uses sampling to create the statistics? Do you also do index rebuilds at any point? These create statistics using full scan. You may just be putting your stats out of date using sp_updatestats. Hard to know for sure, but this is a possible culprit, and it's the one thing that you're sure you've changed. If the stats are out of date or incorrect, you can get plans that allocate more memory than you use. Also, since we're talking 2012, you don't have access to the new cardinality estimation engine. This means that new rows that are added to an index fall outside the statistics. In 2012 and prior, the optimizer assumes one row. That usually results in under-, not over-estimates of memory, but this could be adding to your problems.
You need to determine if your execution plans have changed because of the changes you've introduced in the system.
Also, if you're deleting a lot of data, that could be flushing the cache and certainly could be killing PLE (which is a very secondary measurement, don't get hooked on worrying about it). Although, if most of the queries are only after recent data and you're moving old data, this may be a minimal impact. However, it's still likely.
When system behavior changes, and you've changed a process, that's almost always the culprit in one way or another.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
February 21, 2017 at 12:10 pm
TheSQLGuru - Tuesday, February 21, 2017 11:54 AMThere's a lot more to dbcc memorystatus than what you posted. What did your review of the documentation online reveal?Also review this:
And go get Glenn Berry's SQL Server Diagnostic Scripts. Lots of Awesomesauce in it.
BTW, you didn't give the build of sql server. You gave the build of the OS. And I will never understand people paying for Enterprise Edition of SQL Server and hobbling it with a paltry 128GB of RAM.
Have you tried rebooting the server? If you have a memory leak (there were several identified in SQL 2012 over the years) that should get you functional again until you can test out upgrading to the most recent build.
Given that this has been going on for days and significant performance problems exist (including user queries timing out), I STRONGLY recommend you hire a performance tuning consultant for a quick review, especially if a reboot didn't help or you cannot do one soon.
thanks for these information, I will read the document.
Is this build of sql server?
Microsoft SQL Server 2012 (SP2-CU9) (KB3098512) - 11.0.5641.0 (X64)Oct 20 2015 16:40:19Copyright (c) Microsoft CorporationEnterprise Edition: Core-based Licensing (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1)We just move to this server on March 2016. Do you think a reboot is needed?
I had stopped the archiving job and check if buffer cache can go up again.
February 21, 2017 at 12:50 pm
That is the SQL Server build number. You are 14 months behind as CU7 for SP3, build 6579, was released Jan 17, 2017. It is absolutely possible that you have memory (and/or other) issues simply because you are not patched up.
I cannot see that running sp_updatestats will get you out of date stats. It could make them a bit less accurate than the full scan you would get on index-based stats if you happen to do index rebuilds regularly. But in my experience it would be truly exceptional for this to be causing a signficant problem.
It also sounds like you are having performance problems regularly, not just after you do the batch delete, right? If so then if they were the sole cause I would definitely expect your system to bounce back pretty quickly as needed pages were lifted back up into memory as they were accessed.
As Grant says it is still likely that your issues are somehow due to the new process. But it may not be the actual movement that is at fault. Maybe something weird about your table/index structure, like you accidentally put your FILLFACTOR at 1. :w00t:
DOH!! It just dawned on me that the picture in your initial post indicates that you almost certainly have a monitoring system in place! How do all relevant indicators track in the days/weeks leading up to your new process with how they track after it? Also, are there significant variations at any point since the change was made?
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
February 23, 2017 at 12:47 pm
If you can get past the crisis, you may want to look at a good solid jump in the overall RAM on this box. You stated you have an 800 GB database and that it's fairly busy. That just seems extremely under-powered. It may turn out that you were on the edge of a resource shortage, and the change in process just pushed you over that edge. You should probably also get folks to consider some kind of schedule of regular re-boots. If you can get current on your SQL Server version, that may become less important, but until you do, it might be a good shot in the arm. This problem is just kind of "screaming in my ear" that there's a lack of RAM, and there may be multiple culprits. Going over a resource cliff can be a bit of a "rude awakening", and suggests that even if you ultimately determine that you need to bump up the RAM significantly, you may not have the kind of monitoring in place that can alert you to an approaching edge for resource consumption. I wish I had a useful monitoring recommendation, but the best I can do there is echo Grant Fritchey, as he's a RedGate product evangelist, and I've heard a lot of good things about their tools.
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
February 23, 2017 at 1:11 pm
Thank you! We are actually planning move to a more power server.
I had stopped the archive script for 2 days but buffer cache did not go up. It stays around 40G, 2 weeks ago, it was around 60 to 80G.
Another guess, cause we don't have any changes on applications or database schema/data, could it be a hard ware issue? Memory fragmentation?
February 27, 2017 at 2:49 pm
graciez - Thursday, February 23, 2017 1:11 PMThank you! We are actually planning move to a more power server.
I had stopped the archive script for 2 days but buffer cache did not go up. It stays around 40G, 2 weeks ago, it was around 60 to 80G.
Another guess, cause we don't have any changes on applications or database schema/data, could it be a hard ware issue? Memory fragmentation?
So you're saying that you stop running the script and everything is pretty much fine? In my mind, that says resource cliff. Sit on the edge and be careful, and you're fine, but lean too far over to try and see the bottom of the canyon, and you fall over it...
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
February 27, 2017 at 6:11 pm
graciez - Thursday, February 23, 2017 1:11 PMThank you! We are actually planning move to a more power server.
I had stopped the archive script for 2 days but buffer cache did not go up. It stays around 40G, 2 weeks ago, it was around 60 to 80G.
Another guess, cause we don't have any changes on applications or database schema/data, could it be a hard ware issue? Memory fragmentation?
I am sorry to say this, but it is very unlikely that we can do anything more for you via a forum graciez. There are just too many things that could be at play here. I do not however, believe that memory fragmentation would be one of them though.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
Viewing 15 posts - 1 through 15 (of 15 total)
You must be logged in to reply to this topic. Login to reply