February 14, 2007 at 2:32 pm
Hello,
Yesterday we started having problems with a SQL Server 2000 cluster. The server slowed down. I was able to get into EM for a while and look at the logs, then our senior decided to fail it over. In the SQL log we found:
AWE mapping status:
There are 331696 pages in the AWE window
329827 pages have a bstat of 000000
1338 pages have a bstat of 000009
BPool::Map: no remappable address found.
Buffer Distribution: Stolen=329755 Free=289 Procedures=84
Inram=0 Dirty=20382 Kept=0
I/O=1, Latched=841, Other=432049
Buffer Counts: Commited=783401 Target=783401 Hashed=453050
InternalReservation=1351 ExternalReservation=57 Min Free=512 Visible= 331696
--------------------------
I haven't found much info on this BPool error. Most advice says to call PSS, which we don't have (although we could probably get a manager to cough up some cash for a call). A few of the posts I read said to run checks on the RAID hardware and disks, which our server team has done, and they say it all came up clear.
This morning we've been having slowdowns in response. We had to fail over again, rebooted one node, and then failed back over to it. We found an 'insufficient memory' error in the event log for mssql. One of the admins said that SQL was eating up all the memory. Now Task Manager says it's using 100 MB, which doesn't seem like eating to me. The mssql:memory\total server memory counter is at its maximum defined memory (which was set to around 5 GB; we have 6.75 GB on the box). The working set for the sqlservr process is at 120 MB.
I've been watching the performance counters and they seem fine to me. We're kinda stuck on what to do and would appreciate advice.
February 14, 2007 at 2:39 pm
more notes:
AWE is enabled in sql
PAE and 3GB switches are on in boot.ini
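For reference, the SQL Server side of these settings can be checked from Query Analyzer. This is only a sketch, assuming sysadmin rights on the instance; 'awe enabled' and the server memory limits are advanced options, so they are only listed once 'show advanced options' is turned on.

```sql
-- Expose advanced options so the memory settings are listed
EXEC sp_configure 'show advanced options', 1
RECONFIGURE
GO
-- config_value is what has been configured, run_value is what is currently in effect
EXEC sp_configure 'awe enabled'
EXEC sp_configure 'max server memory (MB)'
EXEC sp_configure 'min server memory (MB)'
GO
```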
Some numbers:
Target Server Memory (KB): 6270473
Available MBytes: 191
Pages/sec: 0.051
Free Pages: 54793
Buffer cache hit ratio: 99.8
Page Faults/sec: 213
Working set (sqlserver): 127750000
Disk Reads/sec: 10
Disk Writes/sec: 20
February 15, 2007 at 1:36 am
- If it all started all of a sudden, I would begin by running a SQL trace for, say, one hour. Then examine the results and look for the big consumers.
- How often do you rebuild indexes and run sp_updatestats?
- Did someone perform major cleanups? That can result in more scans than needed. Perform database maintenance and update statistics to renormalize things.
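As a rough sketch of that maintenance on SQL Server 2000 (the database and table names below are made up for illustration, and DBCC DBREINDEX locks the table while it runs, so this belongs in a quiet window):

```sql
USE MyDatabase   -- hypothetical database name
GO
-- Report fragmentation for one table (check Scan Density and Logical Scan Fragmentation)
DBCC SHOWCONTIG ('Orders') WITH TABLERESULTS   -- 'Orders' is a hypothetical table
-- Rebuild all indexes on that table; a fill factor of 0 reuses the original fill factor
DBCC DBREINDEX ('Orders', '', 0)
-- Refresh statistics on the user tables in the current database
EXEC sp_updatestats
GO
```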
Johan
Learn to play, play to learn !
Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:
- How to post Performance Problems
- How to post data/code to get the best help
- How to prevent a sore throat after hours of presenting ppt
press F1 for solution, press shift+F1 for urgent solution 😀
Need a bit of Powershell? How about this
Who am I ? Sometimes this is me but most of the time this is me
February 15, 2007 at 9:00 am
Just a few more diagnostic questions to ask:
Regards
Rudy Komacsar
Senior Database Administrator
"Ave Caesar! - Morituri te salutamus."
February 15, 2007 at 9:18 am
Thanks for your help.
We've found several issues:
Our disks are heavily fragmented
There was a long running query which coincided with a few of the slow downs
The page file is set to 4.5GB on 2 disks. Windows recommends 10GB (6.5GB of memory).
Paging and locking were up to moderate levels after we failed over, but I think this is due to pent-up demand after it was down. Your opinion?
I'll look into your suggestions.
Your help is appreciated very much!
February 16, 2007 at 12:48 am
- When paging occurs, your server will slow down horribly!
- Find out why it is paging, because you should avoid it!
- If your databases are well tuned, then your data access can be optimal and memory consumption can be optimal too.
Johan
February 16, 2007 at 6:14 am
You generally don't need to increase the page file size on a SQL Server box, as SQL Server is designed not to page. You might try removing the /3GB switch. I've found it can cause problems at times because it limits the memory available to the kernel.
One of my key questions for all DBAs and sysadmins is how to see how much memory an application is using when it's using AWE. I will reject any DBA applicant who doesn't know the answer (it's well documented by Microsoft). With AWE you CANNOT use Task Manager to view the memory being used.
It's a bit difficult to make any other suggestions. I see the "reboot fixes everything" mentality is still alive and well then <grin>
The GrumpyOldDBA
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
February 16, 2007 at 6:39 am
I've seen too many things fixed by a reboot not to give it a try when I'm stumped!
February 16, 2007 at 8:45 am
I've read that Total Server Memory is the amount that SQL Server has reserved, and that it will hold on to that amount until another application requests it. Can you tell how much is actually being used within that address space?
February 16, 2007 at 12:08 pm
This is not true with AWE. As far as I know, memory is not swapped out with AWE; once taken, it's kept. Again, this is documented in the Microsoft KB. You can read the memory from sysperfinfo:
-- cntr_value is reported in KB; dividing by 1024 twice gives GB
select counter_name, cntr_value, cast((cntr_value/1024.0)/1024.0 as numeric(8,2)) as Gb
from master.dbo.sysperfinfo where counter_name like '%server_memory%'
The GrumpyOldDBA
February 19, 2007 at 7:20 am
Sam,
Not sure if you've resolved this, and not sure what your service pack level is, but there are a number of known issues with AWE. You might want to look at this link: http://support.microsoft.com/kb/831999, and confirm that you're up to SP4 with the additional AWE hotfix.
HTH
Rgds iwg
February 19, 2007 at 7:42 am
Ok Guys.
The problem you were facing was caused not by paging but by virtual memory pressure on the server. One of my suggestions is to remove the /3GB switch. It is always better not to use it if you are using AWE. Make sure that you are on the latest service pack (SP4 + rollup patch).
Also, to check how much memory SQL Server is using when it is running with AWE, you need to use the DBCC MEMORYSTATUS command.
http://support.microsoft.com/kb/271624/
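A minimal sketch of that check. DBCC MEMORYSTATUS takes no arguments, and the values in its buffer sections are counts of 8 KB pages; the layout of the output varies between builds, so the KB article above is the reference for reading it.

```sql
-- Snapshot of SQL Server's internal memory usage, including AWE-mapped memory
-- that Task Manager does not show
DBCC MEMORYSTATUS
GO
```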
But I bet removing the /3GB switch will help. Also make sure that the indexes are rebuilt periodically and the statistics are updated. This helps prevent bad plans, which might eat up some memory as well.
Looking at the output you pasted, the stolen count was around 2.5 GB, meaning that some command was eating up lots of memory.
The hashed count was around 3.5 GB, meaning that there must have been a large query running that needed to bring a lot of pages into memory.
Try to find out whether there was such a memory-intensive query that you can identify, and try tuning it.
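The GB figures come from multiplying the buffer counts in the error log by the 8 KB page size. A quick way to redo that arithmetic, using the Stolen, Hashed and Commited values from the first post (this is a plain calculation, nothing is read from the server):

```sql
-- Buffer counts in the log are 8 KB pages: pages * 8 = KB, then / 1024 / 1024 = GB
SELECT 'Stolen' AS buffer_type, CAST(329755 * 8.0 / 1024 / 1024 AS numeric(8,2)) AS GB
UNION ALL
SELECT 'Hashed', CAST(453050 * 8.0 / 1024 / 1024 AS numeric(8,2))
UNION ALL
SELECT 'Commited', CAST(783401 * 8.0 / 1024 / 1024 AS numeric(8,2))
```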
And last, but not least, call up PSS. They can definitely help you (I was a part of it some months ago).
February 19, 2007 at 7:54 am
Regarding the /3GB setting, I use the guidelines published at http://www.sql-server-performance.com/awe_memory.asp
Johan
February 19, 2007 at 4:28 pm
iwg - yes, our service packs and hotfixes are up to date
vineet - we identified a new query which was run in production on the days of the problem. We are thinking this is a contributing factor. We have a few slow queries that are going to be tuned within the next few weeks.
ALZDBA - Yes, I read over that and it seemed like we were set up correctly.
Another 6 GB of memory has been added, and SQL Server took advantage of it on Friday, a very light processing day.
We'll see how things go tomorrow - today is a day off, although we do have some online systems which are 24/7.
I'll post our results end of day tomorrow.
As always - thanks for your help everyone.
sam
February 19, 2007 at 5:13 pm
You said there was a correlation between a long running query and the slow down... fix the QUERY. Long running queries are usually IO and CPU hogs. Fix it before you start messing around with the server.
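One way to spot such a query while it is running on SQL Server 2000 is to look at master.dbo.sysprocesses. This is a sketch only; the spid passed to DBCC INPUTBUFFER is a placeholder, to be replaced with one taken from the first result set.

```sql
-- Connections ordered by I/O; cpu and physical_io are cumulative per connection
SELECT spid, status, loginame, hostname, program_name, cmd,
       cpu, physical_io, memusage, blocked, last_batch
FROM master.dbo.sysprocesses
WHERE spid > 50                 -- spids up to 50 are normally system processes
ORDER BY physical_io DESC
GO
-- Show the last statement sent by one of those connections
DBCC INPUTBUFFER (57)           -- 57 is a placeholder spid
GO
```

Because the counters are cumulative, a connection that has been open for days can rank high without being the current culprit; comparing two snapshots taken a minute apart gives a fairer picture.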
--Jeff Moden
Change is inevitable... Change for the better is not.