Random lockups

  • Another suggestion it to view the event and sql error logs for errors. I have foud the same issues on servers with hyper-threading enabled and UMS scheduler hanging that was resolved with SQL 2000 hot-fix 2187. I call hotfix build 2187 my "SP5" for SQL, because it is a rollup for post SP4 fixes.


    Regards,

    Jim Higgins

  • You could also try enabling the SQL Server trace flag so that all deadlocks are recorded in the SQL log. This won't require a reboot either.

     

    dbcc traceon (1204, 3605, -1)
    go
    dbcc tracestatus(-1)
    go
  • Two suggestions: Have you thought outside the SQL Box, and do you have more than one database on your disk array?

     

    ****Think outside the SQL Server box****

     

    At my last company, we had highly transactional databases.  As you described, out of nowhere the system would “lockup”.  Also,  sp_who2 active would show lots of active users, traces seemed to show nothing, and nothing within SQL showed the root cause of the problem.

     

    I eventually found the issue by running several traces.  I did a general trace and also traced a few select employees (who were my loudest complainers).  I used the tracing events RPC:Stated, RPC:Completed, SQL:BatchStarting, SQL:BatchCompleted, SQL:StmtStarting, SQL:StmtCompleted.

     

    When the “Freezup” happened, I looked at the very last SQL commands of my traces. From what I remember, they were starting events, but no ending event. I found that a  command was connecting to an application outside of SQL Server.  We checked that system and found that it was bottlenecked.  The result was that there were SQL commands waiting to be processed by that system.  Those waiting sql processes started affecting the response time of other sql commands not using that system and SQL Server would eventually lockup.  Once that system was fixed, we stopped having the lockups.

     

    ****Disks Arrays****

     

    We had a scenario where applications using a certain database would seem to “lockup”.  After doing many traces and using disk counters perfmon, we found that there was another department that was running check runs during these times.  Both databases were on the same disk array.  Once the check runs were over, the system would go back to normal.  It has been my experience that this can happen on both an local disk array or a SAN.

     

    Bill Richards, MCSE, MCDBA

    Senior Database Analyst

  • John,

    I've not been able to get SQLDiag to actually run. All I get, after it processes the logs, while it's "Getting registry information" is this error:

    "Buffer overrun detected."

    I've searched high and low for an explanation to this error and can't find anything.

  • Oops. I just saw your offer for the Kbase # for that buffer overrun error. Sorry, my bad. Yes, that Kbase # would be very helpful.

  • We're running SP4, build 2039. Could that be a problem?

    Also, to answer an earlier question: The server is running Windows Server 2003 R2, Standard x64 Edition, Service Pack 1. It's on an IBM Xseries 366, with a 3.66GHz Xeon MP CPU with 16GB RAM. Storage is a 1.33TB drive array.

    Finally, it just looks like it's every day at 3:00 (give or take 15 minutes) that this starts. However, nothing is scheduled in the Windows Scheduled tasks, and no SQL Server Agent jobs are set to run.

    I'm now trying to set a trace up to show as much as possible prior to when this happens since during these times everything is frozen.

    Thanks for everybody's input.

  • Antivirus software performing a systemscan at that timeframe ?

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • When did this start happening?  Does that coincide with changes in the application or in daily procedures?  Is this server "linked" to other SQL boxes?  If so, are there jobs on those other servers that use data from this server?

Viewing 8 posts - 16 through 22 (of 22 total)

You must be logged in to reply to this topic. Login to reply