Re: Intermittent slow database resposne

  • Just for my own information, why did you choose 7168 as the maximum server memory?

    7168MB / 1024 = 7GB

    You should have at least 1GB for the OS. Gila recommended 1.5.

    Any interesting on the performance counters?

  • DavidB (8/27/2008)


    Very sorry about that. The link I was referring to was http://support.microsoft.com/kb/918483.

    Not so sure that applies based on the errors that you provided.

    Have you installed the performance dashboard as recommended by an earlier response? I would be interested in seeing what you are seeing as your primary wait types.

    select * from sys.dm_os_wait_stats order by wait_time_ms desc

    I logged into the server this morning, and everything seems to be running ok. In other words, SQL server queries are running fast, and RDC is running at normal speed consistently. I didn't do anything to fix the problem, so I guess it "magically" fixed itself.

    Here's a snapshot summary of the current task manager/performance tab:

    PF Usage: 8.06GB

    Physical Memory (K)

    Total: 8383116

    Available: 208096

    System Cache: 676940

    Commit Charge (K)

    Total: 8458284

    Limit: 11761148

    Peak: 11022620

    Here's the current wait times from the performance dashboard:

    Wait CategoryNumber of WaitsWait Time (sec)% Wait Time

    Sleep92258898132896.68651.79%

    Parallelism3676062657328.21822.34%

    Other20325494333374.99413.01%

    Buffer IO 255845424789.5619.66%

    Logging30226372843.078 1.11%

    Lock843282281.701 0.89%

    Latch8692461280.718 0.50%

    Scheduler Yield163141311243.343 0.48%

    Network IO26328362.1870.14%

    Buffer Latch3763003180.3280.07%

    Memory50611.6250.00%

    Here's the currents results from sys.dm_os_wait_stats:

    wait_typewaiting_tasks_countwait_time_msmax_wait_time_mssignal_wait_time_ms

    SQLTRACE_BUFFER_FLUSH16216648605784281546

    LAZYWRITER_SLEEP668716482268712963078

    CXPACKET36751505573036875323901913500

    PAGEIOLATCH_SH2515588237599684015132578

    BROKER_TASK_STOP27641369584310015500

    BROKER_RECEIVE_WAITFOR36748745360000015

    IO_COMPLETION5978815658250240610203

    WRITELOG293624526696714437116312

    OLEDB2012845441756562526870

    LCK_M_U830121738359200002328

    BACKUPIO1142998162328120001500

    LATCH_EX8688841280375906186546

    SOS_SCHEDULER_YIELD1604464912371877501234000

    SLEEP_TASK9209496811910934371142906

    ASYNC_IO_COMPLETION10011475933497180

    BACKUPBUFFER22241811168909531890

    PAGEIOLATCH_EX395699049372421203

    BACKUPTHREAD99981479636446815

  • I logged into the server this morning, and everything seems to be running ok. In other words, SQL server queries are running fast, and RDC is running at normal speed consistently. I didn't do anything to fix the problem, so I guess it "magically" fixed itself.

    Sounds like a good time to dig into looking at memory and disk performance.

    Also, what is the wait in the parallelism?

  • sam (8/27/2008)


    I logged into the server this morning, and everything seems to be running ok. In other words, SQL server queries are running fast, and RDC is running at normal speed consistently. I didn't do anything to fix the problem, so I guess it "magically" fixed itself.

    Sounds like a good time to dig into looking at memory and disk performance.

    Also, what is the wait in the parallelism?

    Here's the info on parallelism:

    Wait TypeNumber of WaitsWait Time (sec)% Wait timeMax Wait Time (ms)Avg Wait Time (ms)

    CXPACKET3684792357496.859100.00%5323901.6

  • sam (8/26/2008)


    http://www.google.com/search?q=troubleshooting+sql+memory+problems&sourceid=ie7&rls=com.microsoft:en-US&ie=utf8&oe=utf8

    The first link has a few counters you could look at.

    Is this the website that you were referring to regarding performance counters:

    http://sqlserver-qa.net/blogs/perftune/archive/2007/06/19/memory-issues.aspx

    Thanks again for all your input.

  • jlp3630 (8/27/2008)


    I'm running RAID-0 on my c: drive (contains Windows, SQL server, master, model and msdb databases and log files).

    I'm running RAID-50 on my f: drive (tempdb, user databases and log files).

    RAID 50? That's the first time I've heard of anyone using that Raid level.

    Just curious, why RAID 50?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (8/27/2008)


    jlp3630 (8/27/2008)


    I'm running RAID-0 on my c: drive (contains Windows, SQL server, master, model and msdb databases and log files).

    I'm running RAID-50 on my f: drive (tempdb, user databases and log files).

    RAID 50? That's the first time I've heard of anyone using that Raid level.

    Just curious, why RAID 50?

    The server was setup before I started working for this client. I have no idea why it's setup that way. When I find out, I'll let you know.

    Also, my c: drive is RAID-1 not RAID-0.

  • On our 64 bit servers we set the memory to allow 1.5 - 2 GB for the OS as Gail recommended and set Maxdop at 1 or 2. If you leave Maxdop at 0 or set it to some higher value (than 1) then set the cost threshold up to a higher value (10 or more).

    It looks like parallelism might be a problem based on your waitstats (did you reset them?).

  • Cliff Jones (8/28/2008)


    On our 64 bit servers we set the memory to allow 1.5 - 2 GB for the OS as Gail recommended and set Maxdop at 1 or 2. If you leave Maxdop at 0 or set it to some higher value (than 1) then set the cost threshold up to a higher value (10 or more).

    It looks like parallelism might be a problem based on your waitstats (did you reset them?).

    Cliff,

    Just for my own knowledge, could you explain the logic behind these two suggestions:

    1. set Maxdop to 1 or 2

    2. maxdop at 0 or higher than 1 & cost threshold > 10

    I didn't reset my wait stats when I posted these sys.dm_os_wait_stats results:

    wait_type waiting_tasks_count wait_time_ms max_wait_time_ms signal_wait_time_ms

    SQLTRACE_BUFFER_FLUSH 16216 64860578 4281 546

    LAZYWRITER_SLEEP 66871 64822687 1296 3078

    CXPACKET 36751505 57303687 532390 1913500

    PAGEIOLATCH_SH 2515588 23759968 4015 132578

    BROKER_TASK_STOP 2764 13695843 10015 500

    BROKER_RECEIVE_WAITFOR 36 7487453 600000 15

    IO_COMPLETION 597881 5658250 2406 10203

    I just reset them for further monitoring.

    Thanks for your input.

    Jon

  • We have found on our 64 bit servers (especially our Itanium) that if we leave Maxdop at 0, the server spends a lot of CPU time transferring the load from one CPU to another (context switching). We actually achieved better performance by setting Maxdop to 1 (which was actually what our HP vendor reccommended).

    On some of our other non-Itanium servers I found that I was able to home in on a proper value for Maxdop by tinkering with the Maxdop and the threshold values during times of high CPU utilization to home in on a better value. I did this by resetting the waitstats, let the server run for an hour or so and then rechecking the CXPacket waits and reducing the signal wait percentage. Start at Maxdop 1 for a benchmark, then Maxdop 2. At Maxdop = 2 increase the threshold to see if you can improve on the baseline.

    Did you install SQL 2005 from a fresh install (rather than an in place upgrade)?

  • I don't remember where I got this but here is the query I use to get the Signal Wait percentages:

    create table #waitstats (

    waittype varchar(80),

    requests numeric(20,1),

    waittime numeric (20,1),

    signalwaittime numeric(20,1))

    insert into

    #waitstats (waittype, requests, waittime, signalwaittime)

    select

    wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms

    from

    sys.dm_os_wait_stats

    declare @totalwait numeric(20,1),

    @totalsignalwait numeric(20,1) ,

    @endtime datetime,

    @begintime datetime

    --get the totals

    select

    @totalwait=sum(waittime), @totalsignalwait=sum(signalwaittime)

    from

    #waitstats

    --- subtract waitfor, sleep, and resource_queue from Total

    select

    @totalwait = @totalwait - sum(waittime), @totalsignalwait = @totalsignalwait - sum(signalwaittime)

    from

    #waitstats

    where

    waittype in ('Waitfor','Sleep','Resource_Queue')

    -- insert adjusted totals, rank by percentage descending

    insert into #waitstats select '***total***', 0, @totalwait, @totalsignalwait

    select

    waittype,

    waittime,

    percentage = cast(100 * waittime / @totalwait as numeric(20,1)),

    signalwaittime,

    percentagesw = cast(100 * signalwaittime / @totalsignalwait as numeric(20,1))

    from

    #waitstats

    where

    waittype not in ('waitfor','sleep','resource_queue','total')

    order by percentage desc

    select '%signal'= 100 * (@totalsignalwait / @totalwait)

    drop table #waitstats

  • Cliff Jones (8/28/2008)


    We have found on our 64 bit servers (especially our Itanium) that if we leave Maxdop at 0, the server spends a lot of CPU time transferring the load from one CPU to another (context switching). We actually achieved better performance by setting Maxdop to 1 (which was actually what our HP vendor reccommended).

    On some of our other non-Itanium servers I found that I was able to home in on a proper value for Maxdop by tinkering with the Maxdop and the threshold values during times of high CPU utilization to home in on a better value. I did this by resetting the waitstats, let the server run for an hour or so and then rechecking the CXPacket waits and reducing the signal wait percentage. Start at Maxdop 1 for a benchmark, then Maxdop 2. At Maxdop = 2 increase the threshold to see if you can improve on the baseline.

    Did you install SQL 2005 from a fresh install (rather than an in place upgrade)?

    Our current database server was loaded via a fresh SQL 2005 install. However, the majority of the databases were migrated from a SQL 2000 database server.

  • That's a good answer. If you upgraded then some of your default values may not be correct (like Max Worker Threads).

    My guess is that if you set both your min and max memory to allow 1.5 GB or more to the OS then your problem will go away, as long as you don't have some other application running on the server. When SQL Server gobbles up the memory beyond a certain point the server begins to thrash and CPU hits the roof.

    I have seen this several times with our clients servers.

Viewing 13 posts - 16 through 27 (of 27 total)

You must be logged in to reply to this topic. Login to reply