Re: Intermittent slow database resposne

Question

Post reply

Re: Intermittent slow database resposne

Viewing 13 posts - 16 through 27 (of 27 total)

You must be logged in to reply to this topic. Login to reply

Sam Greene SSCertifiable Points: 5252 More actions · Answer 1

Just for my own information, why did you choose 7168 as the maximum server memory?

7168MB / 1024 = 7GB

You should have at least 1GB for the OS. Gila recommended 1.5.

Any interesting on the performance counters?

jlp3630 Hall of Fame Points: 3243 More actions · Answer 2

DavidB (8/27/2008)
Very sorry about that. The link I was referring to was http://support.microsoft.com/kb/918483.
Not so sure that applies based on the errors that you provided.
Have you installed the performance dashboard as recommended by an earlier response? I would be interested in seeing what you are seeing as your primary wait types.
select * from sys.dm_os_wait_stats order by wait_time_ms desc

I logged into the server this morning, and everything seems to be running ok. In other words, SQL server queries are running fast, and RDC is running at normal speed consistently. I didn't do anything to fix the problem, so I guess it "magically" fixed itself.

Here's a snapshot summary of the current task manager/performance tab:

PF Usage: 8.06GB

Physical Memory (K)

Total: 8383116

Available: 208096

System Cache: 676940

Commit Charge (K)

Total: 8458284

Limit: 11761148

Peak: 11022620

Here's the current wait times from the performance dashboard:

Wait CategoryNumber of WaitsWait Time (sec)% Wait Time

Sleep92258898132896.68651.79%

Parallelism3676062657328.21822.34%

Other20325494333374.99413.01%

Buffer IO 255845424789.5619.66%

Logging30226372843.078 1.11%

Lock843282281.701 0.89%

Latch8692461280.718 0.50%

Scheduler Yield163141311243.343 0.48%

Network IO26328362.1870.14%

Buffer Latch3763003180.3280.07%

Memory50611.6250.00%

Here's the currents results from sys.dm_os_wait_stats:

wait_typewaiting_tasks_countwait_time_msmax_wait_time_mssignal_wait_time_ms

SQLTRACE_BUFFER_FLUSH16216648605784281546

LAZYWRITER_SLEEP668716482268712963078

CXPACKET36751505573036875323901913500

PAGEIOLATCH_SH2515588237599684015132578

BROKER_TASK_STOP27641369584310015500

BROKER_RECEIVE_WAITFOR36748745360000015

IO_COMPLETION5978815658250240610203

WRITELOG293624526696714437116312

OLEDB2012845441756562526870

LCK_M_U830121738359200002328

BACKUPIO1142998162328120001500

LATCH_EX8688841280375906186546

SOS_SCHEDULER_YIELD1604464912371877501234000

SLEEP_TASK9209496811910934371142906

ASYNC_IO_COMPLETION10011475933497180

BACKUPBUFFER22241811168909531890

PAGEIOLATCH_EX395699049372421203

BACKUPTHREAD99981479636446815

Sam Greene SSCertifiable Points: 5252 More actions · Answer 3

I logged into the server this morning, and everything seems to be running ok. In other words, SQL server queries are running fast, and RDC is running at normal speed consistently. I didn't do anything to fix the problem, so I guess it "magically" fixed itself.

Sounds like a good time to dig into looking at memory and disk performance.

Also, what is the wait in the parallelism?

jlp3630 Hall of Fame Points: 3243 More actions · Answer 4

sam (8/27/2008)
I logged into the server this morning, and everything seems to be running ok. In other words, SQL server queries are running fast, and RDC is running at normal speed consistently. I didn't do anything to fix the problem, so I guess it "magically" fixed itself.
Sounds like a good time to dig into looking at memory and disk performance.
Also, what is the wait in the parallelism?

Here's the info on parallelism:

Wait TypeNumber of WaitsWait Time (sec)% Wait timeMax Wait Time (ms)Avg Wait Time (ms)

CXPACKET3684792357496.859100.00%5323901.6

jlp3630 Hall of Fame Points: 3243 More actions · Answer 5

sam (8/26/2008)
http://www.google.com/search?q=troubleshooting+sql+memory+problems&sourceid=ie7&rls=com.microsoft:en-US&ie=utf8&oe=utf8
The first link has a few counters you could look at.

Is this the website that you were referring to regarding performance counters:

http://sqlserver-qa.net/blogs/perftune/archive/2007/06/19/memory-issues.aspx

Thanks again for all your input.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 6

jlp3630 (8/27/2008)

I'm running RAID-0 on my c: drive (contains Windows, SQL server, master, model and msdb databases and log files).
I'm running RAID-50 on my f: drive (tempdb, user databases and log files).

RAID 50? That's the first time I've heard of anyone using that Raid level.

Just curious, why RAID 50?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

jlp3630 Hall of Fame Points: 3243 More actions · Answer 7

GilaMonster (8/27/2008)

jlp3630 (8/27/2008)
I'm running RAID-0 on my c: drive (contains Windows, SQL server, master, model and msdb databases and log files).
I'm running RAID-50 on my f: drive (tempdb, user databases and log files).
RAID 50? That's the first time I've heard of anyone using that Raid level.
Just curious, why RAID 50?

The server was setup before I started working for this client. I have no idea why it's setup that way. When I find out, I'll let you know.

Also, my c: drive is RAID-1 not RAID-0.

Cliff Jones SSChampion Points: 10547 More actions · Answer 8

On our 64 bit servers we set the memory to allow 1.5 - 2 GB for the OS as Gail recommended and set Maxdop at 1 or 2. If you leave Maxdop at 0 or set it to some higher value (than 1) then set the cost threshold up to a higher value (10 or more).

It looks like parallelism might be a problem based on your waitstats (did you reset them?).

jlp3630 Hall of Fame Points: 3243 More actions · Answer 9

Cliff Jones (8/28/2008)
On our 64 bit servers we set the memory to allow 1.5 - 2 GB for the OS as Gail recommended and set Maxdop at 1 or 2. If you leave Maxdop at 0 or set it to some higher value (than 1) then set the cost threshold up to a higher value (10 or more).
It looks like parallelism might be a problem based on your waitstats (did you reset them?).

Cliff,

Just for my own knowledge, could you explain the logic behind these two suggestions:

1. set Maxdop to 1 or 2

2. maxdop at 0 or higher than 1 & cost threshold > 10

I didn't reset my wait stats when I posted these sys.dm_os_wait_stats results:

wait_type waiting_tasks_count wait_time_ms max_wait_time_ms signal_wait_time_ms

SQLTRACE_BUFFER_FLUSH 16216 64860578 4281 546

LAZYWRITER_SLEEP 66871 64822687 1296 3078

CXPACKET 36751505 57303687 532390 1913500

PAGEIOLATCH_SH 2515588 23759968 4015 132578

BROKER_TASK_STOP 2764 13695843 10015 500

BROKER_RECEIVE_WAITFOR 36 7487453 600000 15

IO_COMPLETION 597881 5658250 2406 10203

I just reset them for further monitoring.

Thanks for your input.

Jon

Cliff Jones SSChampion Points: 10547 More actions · Answer 10

We have found on our 64 bit servers (especially our Itanium) that if we leave Maxdop at 0, the server spends a lot of CPU time transferring the load from one CPU to another (context switching). We actually achieved better performance by setting Maxdop to 1 (which was actually what our HP vendor reccommended).

On some of our other non-Itanium servers I found that I was able to home in on a proper value for Maxdop by tinkering with the Maxdop and the threshold values during times of high CPU utilization to home in on a better value. I did this by resetting the waitstats, let the server run for an hour or so and then rechecking the CXPacket waits and reducing the signal wait percentage. Start at Maxdop 1 for a benchmark, then Maxdop 2. At Maxdop = 2 increase the threshold to see if you can improve on the baseline.

Did you install SQL 2005 from a fresh install (rather than an in place upgrade)?

Cliff Jones SSChampion Points: 10547 More actions · Answer 11

I don't remember where I got this but here is the query I use to get the Signal Wait percentages:

create table #waitstats (

waittype varchar(80),

requests numeric(20,1),

waittime numeric (20,1),

signalwaittime numeric(20,1))

insert into

#waitstats (waittype, requests, waittime, signalwaittime)

select

wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms

from

sys.dm_os_wait_stats

declare @totalwait numeric(20,1),

@totalsignalwait numeric(20,1) ,

@endtime datetime,

@begintime datetime

--get the totals

select

@totalwait=sum(waittime), @totalsignalwait=sum(signalwaittime)

from

#waitstats

--- subtract waitfor, sleep, and resource_queue from Total

select

@totalwait = @totalwait - sum(waittime), @totalsignalwait = @totalsignalwait - sum(signalwaittime)

from

#waitstats

where

waittype in ('Waitfor','Sleep','Resource_Queue')

-- insert adjusted totals, rank by percentage descending

insert into #waitstats select '***total***', 0, @totalwait, @totalsignalwait

select

waittype,

waittime,

percentage = cast(100 * waittime / @totalwait as numeric(20,1)),

signalwaittime,

percentagesw = cast(100 * signalwaittime / @totalsignalwait as numeric(20,1))

from

#waitstats

where

waittype not in ('waitfor','sleep','resource_queue','total')

order by percentage desc

select '%signal'= 100 * (@totalsignalwait / @totalwait)

drop table #waitstats

jlp3630 Hall of Fame Points: 3243 More actions · Answer 12

Cliff Jones (8/28/2008)
We have found on our 64 bit servers (especially our Itanium) that if we leave Maxdop at 0, the server spends a lot of CPU time transferring the load from one CPU to another (context switching). We actually achieved better performance by setting Maxdop to 1 (which was actually what our HP vendor reccommended).
On some of our other non-Itanium servers I found that I was able to home in on a proper value for Maxdop by tinkering with the Maxdop and the threshold values during times of high CPU utilization to home in on a better value. I did this by resetting the waitstats, let the server run for an hour or so and then rechecking the CXPacket waits and reducing the signal wait percentage. Start at Maxdop 1 for a benchmark, then Maxdop 2. At Maxdop = 2 increase the threshold to see if you can improve on the baseline.
Did you install SQL 2005 from a fresh install (rather than an in place upgrade)?

Our current database server was loaded via a fresh SQL 2005 install. However, the majority of the databases were migrated from a SQL 2000 database server.

Cliff Jones SSChampion Points: 10547 More actions · Answer 13

That's a good answer. If you upgraded then some of your default values may not be correct (like Max Worker Threads).

My guess is that if you set both your min and max memory to allow 1.5 GB or more to the OS then your problem will go away, as long as you don't have some other application running on the server. When SQL Server gobbles up the memory beyond a certain point the server begins to thrash and CPU hits the roof.

I have seen this several times with our clients servers.