SQL Server Locks Up

  • We have a SQL 2000 Enterprise Edition SP4 server dedicated to Dynamix GP.

    Every so often (about once every 10 days or so). The SQL server locks up and nobody can do anything or log into it with any tools isql, osql, em, etc. We can terminal services into the server and do anything but get at the SQL server with it.

    I suspect that there is some kind of large locking contention going on, but without a way to log into the server, I can't confirm it. Does anybody have a suggestion?



    --Mark Tassin
    MCITP - SQL Server DBA
    Proud member of the Anti-RBAR alliance.
    For help with Performance click this link[/url]
    For tips on how to post your problems[/url]

  • I like to keep an idle SA or SysAdmin session open.

    If things seem like they are locking up, try

    Select * from master..sysprocesses with (NoLock)

    and work out from there.

    If that does not work, then your problem is probably not locking.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • You could keep SQL profiler open to see what statements are being executed. Also perhaps use performance monitor to keep things such as number of transactions open, etc. Unfortunately your time frame of roughly 10 days is a long time but you could restart the monitoring each day as the previous X days of monitoring won't be of much use when the freezing occurs. I assume there's nothing in the event log?

  • Nothing in the event log, nothing on the console. The system just suddenly stops accepting new connections.



    --Mark Tassin
    MCITP - SQL Server DBA
    Proud member of the Anti-RBAR alliance.
    For help with Performance click this link[/url]
    For tips on how to post your problems[/url]

  • Check and review the dts packages, jobs. There might be a tsql program that went out of hand. If this happens on particular hours of the day, fire up your monitoring tools, perfmon, sql profiler and pull together what activities are running,..save them to a text file, or to a table. At some point before sql freezes, you may be able to catch which has caused it by identifying which one spikes too high above the usual. Or download an evaluation copy of third party utilities that display graphically real time stats of your sql health (io, mem, cpu, running sql queries, etc) and has the ability to alert and pinpoint resource intensive transactions that go beyond your normal threshold.

    _____________
    Donn Policarpio

  • I used to have this problem a long time ago, had a program that would not close connections to SQL. Over time the number of connections (spids) would be in the tens of thousands causing the server to slowly start to bog down then stop accepting any connections.

    Start off with sp_who2 before the server hits the 10 day mark, note down the number of connections. Have performance monitor running as well, lots of articles on things to watch out for in regards to SQL server.

    Check for maintenance jobs, perhaps there is one that runs every 10 days that does a full DB re-org, that sort of thing.

    SQL just doesn't lock up and stops accepting connections for no reason. Hardware overheating? Bad memory? Are you on the latest SP?

  • I agree that SQL doesn't lock up for no reason. I've never seen this before in my 8 years with SQL server. I'm lucky in that there are no jobs other than backups and integrity checks (which both run off hours) and no DTS to look at.

    I'm doing my best to keep a few tools connected to the SQL server so I can try to look at it when next it freezes up. The last freeze I was temporarily unavailable, and one of the Technicians resolved by reboot. I suspect I had a blocking process and the server itself was actually running, just jammed up.

    It doesn't help that I've been basically told that this must be a bad query that's doing this (I suspect many things, but not that) and to catch said query and report it so it can be fixed.

    So we fire up profiler and leave it running continuously, make sure SQA is connected, EM too... plus Idera's SQLCheck, SQL Spy, and SSMS just to be safe. From multiple computers.

    I was just wondering if anybody knew of a way to force a connection in when next this happens in case the grand conjuction happens and all of my connected tools lose their connections 5 seconds before.



    --Mark Tassin
    MCITP - SQL Server DBA
    Proud member of the Anti-RBAR alliance.
    For help with Performance click this link[/url]
    For tips on how to post your problems[/url]

  • Are you sure there are no 'UMS Scheduler Hung' messages in the SQL errorlog?

    This is the only thing I am aware of which will completely freeze a SQL instance. If this is the cause you are in PSS territory to get a solution for it

    ---------------------------------------------------------------------

  • In the performance monitor one thing you should definately be watching is the number of sql processor id's and connections.

    Have the other technicians keep a query analyzer open with sp_who2 ready to go. I've found that even though it appears to have hung/locked up, if the resource usage is through the roof sp_who2 can take upwards of 10 minutes or more to run.

    - How big is the database in size?

    - How many users accessing info?

    - Specs of the server?

    - Can you get the queries that are being run and throw them in query analyzer to see if the tables are properly indexed.

    - Perform dbcc checkdb's on off hours to ensure there are no corrupted indexes.

  • george sibbald (3/13/2008)


    Are you sure there are no 'UMS Scheduler Hung' messages in the SQL errorlog?

    This is the only thing I am aware of which will completely freeze a SQL instance. If this is the cause you are in PSS territory to get a solution for it

    There is nothing in the errorlog... the last entries in there are that the server successfully backed up a database approx 6 hours prior.



    --Mark Tassin
    MCITP - SQL Server DBA
    Proud member of the Anti-RBAR alliance.
    For help with Performance click this link[/url]
    For tips on how to post your problems[/url]

  • Jason Lennan (3/13/2008)


    - How big is the database in size?

    33 GB

    Jason Lennan (3/13/2008)


    - How many users accessing info?

    About 120 or so

    Jason Lennan (3/13/2008)


    - Specs of the server?

    SQL 2000 Enterprise, Windows 2003 Enterprise Edition SP2

    4 GB Ram

    2 Xeon 3 Ghz processors

    RAID 5 for the datastores 168 GB Free

    Raid 1 for the logs 111 GB Free

    Raid 1 for the server OS 124 GB free

    Jason Lennan (3/13/2008)


    - Can you get the queries that are being run and throw them in query analyzer to see if the tables are properly indexed.

    I've been doing this... their all coming out of Great Plains and seem to be hitting Index Seeks most of the time, and rarely an Index Scan... no Table Scans that I've seen so far.

    Jason Lennan (3/13/2008)


    - Perform dbcc checkdb's on off hours to ensure there are no corrupted indexes.

    Done Nightly.. with no reported errors.



    --Mark Tassin
    MCITP - SQL Server DBA
    Proud member of the Anti-RBAR alliance.
    For help with Performance click this link[/url]
    For tips on how to post your problems[/url]

  • Do you have the latest SP on SQL installed?

  • Jason Lennan (3/13/2008)


    Do you have the latest SP on SQL installed?

    Yes, it is on SQL 2000 EE SP4... sorry for not mentioning that above.



    --Mark Tassin
    MCITP - SQL Server DBA
    Proud member of the Anti-RBAR alliance.
    For help with Performance click this link[/url]
    For tips on how to post your problems[/url]

  • That's alright 🙂

    Run a sp_who2 now and again in an hour, see if the number of connections stays relatively the same. Try this every now and then and keep track.

    What type of server is this and type of Raid controller? Are the BIOS's of the server and Raid controller up to date or have any current known issues?

  • Jason Lennan (3/13/2008)


    That's alright 🙂

    Run a sp_who2 now and again in an hour, see if the number of connections stays relatively the same. Try this every now and then and keep track.

    I've been looking at it more or less once or twice a day.. it fluctuates... but is usually between 120 and 140...

    Jason Lennan (3/13/2008)


    What type of server is this and type of Raid controller? Are the BIOS's of the server and Raid controller up to date or have any current known issues?

    Dell PowerEdge 2650

    Perc 4 Raid Controller

    PowerVault 220 Array

    All of the Firmware is up to date and with the correct driver versions.



    --Mark Tassin
    MCITP - SQL Server DBA
    Proud member of the Anti-RBAR alliance.
    For help with Performance click this link[/url]
    For tips on how to post your problems[/url]

Viewing 15 posts - 1 through 15 (of 24 total)

You must be logged in to reply to this topic. Login to reply