SQL Server Agent failure - service restarted on its own!

  • DBADave (12/29/2008)


    Since it's clustered, the data is on a SAN. I'm guessing there is an equivalent tool to chkdsk, but I'm not sure I understand the connection. What would I be looking for in the chkdsk output?

    Not sure, but I do know that you can run chkdsk on SAN disks (at least from what our Windows SA told me this morning)...

    This could be grasping at straws, but the underlying cause could be disk-related.

    __________________________________________________________________________________
    SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables
    Persisting SQL Server Index-Usage Statistics with MERGE
    Turbocharge Your Database Maintenance With Service Broker: Part 2

  • Is the server you had a problem with also on the SAN?

  • DBADave (12/29/2008)


    Is the server you had a problem with also on the SAN?

    Yup, but it's not clustered.


  • I've had the same error twice now; it seems to happen every couple of months. The first time it happened I contacted Microsoft, who offered no explanation, let alone a fix. The only action I took was to apply a service pack to the .NET Framework, which, as I found out yesterday, made no difference.

  • When it occurs next time, could you grab a copy of the cluster log?



    Shameless self promotion - read my blog http://sirsql.net

  • Sorry, I should have been clearer earlier and described the setup. We are not using clustering but are using a SAN. The SAN is connected via Fibre to the actual SQL Server. All SQL Server system databases are stored on local arrays; only the user database is kept on the SAN. The exact spec is below:

    Windows 2003 Enterprise Edition SP2

    SQL Server 2005 Enterprise Edition SP2

    2 x Dual Intel Xeon @ 2.0GHz

    11GB RAM (6GB is directly allocated to SQL Server)

    3TB storage via IBM SAN

    I have seen a post on another web site stating that this kind of error occurs when byte data is inserted into a char field. None of the processes I have created involve byte data, so I can only assume that, if this is indeed the cause, an internal SQL Server process is the culprit.

  • And there are no dump files in the log directory, or informative entries in the event logs?




  • The only error present is in the Event Log, which states:

    Event Type: Error

    Event Source: .NET Runtime 2.0 Error Reporting

    Event Category: None

    Event ID: 1000

    Date: 10/01/2009

    Time: 10:20:43

    User: N/A

    Computer: DWHRSC3

    Description:

    Faulting application sqlagent90.exe, version 2005.90.3042.0, stamp 45cd6a37, faulting module kernel32.dll, version 5.2.3790.4062, stamp 46264680, debug? 0, fault address 0x0000bee7.

    For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

    There is no other error information available in any of the log files. Not very helpful, is it?
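For anyone collecting several of these events to compare, the description text can be split into fields. The following is a minimal Python sketch, assuming the description always follows the layout of the entry quoted above; the function name `parse_fault` and the field names are made up for illustration and are not part of any Microsoft tool.

```python
import re

# Layout assumed from the single Event ID 1000 description quoted in this thread.
FAULT_RE = re.compile(
    r"Faulting application (?P<app>[^,\s]+), version (?P<app_version>[\d.]+), "
    r"stamp (?P<app_stamp>[0-9a-fA-F]+), faulting module (?P<module>[^,\s]+), "
    r"version (?P<module_version>[\d.]+), stamp (?P<module_stamp>[0-9a-fA-F]+), "
    r"debug\? (?P<debug>\d+), fault address (?P<address>0x[0-9a-fA-F]+)"
)


def parse_fault(description):
    """Return the fields of a 'Faulting application' line as a dict, or None."""
    match = FAULT_RE.search(description)
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = (
        "Faulting application sqlagent90.exe, version 2005.90.3042.0, "
        "stamp 45cd6a37, faulting module kernel32.dll, version 5.2.3790.4062, "
        "stamp 46264680, debug? 0, fault address 0x0000bee7."
    )
    fields = parse_fault(sample)
    # Compare module and fault address across occurrences to see if they repeat.
    print(fields["module"], fields["address"])
```

If the faulting module and fault address are identical across occurrences, that is worth mentioning when the case goes to product support.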

  • I think this might have to go up the chain to Microsoft product support. They will help you capture information about the install and work with you to find a resolution.




  • I'm getting the same thing on two instances of SQL 2005 Enterprise Edition, both 32-bit, build 9.00.3054, on two physical hosts.

    One is mirroring a database to the other. Last week the SQL Agent on one host restarted on its own. This morning the SQL Agent on the other one stopped on its own. It started up OK when I started it manually.

    Please let us know if you find out anything further.

    Thanks

    -J

  • If this happens to us again I will call Microsoft.

  • Here's an update. I opened a case with Microsoft in late January and unfortunately it is still open. They now want me to run a lightweight version of PSSDiag and install a special utility that will create a dump of the SQL Server Agent when the problem next occurs. I'm not sure I like the idea of keeping PSSDiag running for weeks until the problem happens again, even if it is a lightweight version. Microsoft had me update registry settings over the last two months in hopes of producing a dump, but that never worked. In one case their settings caused server problems.

    I did finally find a pattern. The problem occurs while a large number of deadlocks are occurring. We have an alert set up for deadlocks, and the alert starts a SQL job to insert WMI data into a table. I believe the problem is WMI-related and have been trying to convince Microsoft to involve their WMI team on the case. They finally agreed to do so today. I'm curious whether other people experiencing the same SQL Server Agent problem have also noticed deadlocks at the time of the agent restart.

    Dave
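For anyone who wants to check their own servers for the same pattern, one way is to count how many deadlock events fall shortly before each agent restart. The following is a rough Python sketch, assuming you have already exported both sets of timestamps from wherever you record them; the function name, the five-minute window, and the sample timestamps are all hypothetical and do not come from this case.

```python
from datetime import datetime, timedelta


def deadlocks_before_restarts(restarts, deadlocks, window=timedelta(minutes=5)):
    """Map each restart time to the number of deadlocks in the `window` before it."""
    counts = {}
    for restart in restarts:
        counts[restart] = sum(
            1 for deadlock in deadlocks
            if timedelta(0) <= restart - deadlock <= window
        )
    return counts


if __name__ == "__main__":
    # Made-up timestamps purely for illustration.
    restarts = [datetime(2009, 3, 2, 10, 20), datetime(2009, 3, 9, 14, 0)]
    deadlocks = [
        datetime(2009, 3, 2, 10, 17),
        datetime(2009, 3, 2, 10, 18),
        datetime(2009, 3, 2, 10, 19),
        datetime(2009, 3, 5, 8, 0),
    ]
    for restart, count in deadlocks_before_restarts(restarts, deadlocks).items():
        print(restart.isoformat(), count)
```

A restart with many deadlocks just before it fits the pattern described above; a restart with none suggests a different trigger on that server.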

  • Hi Dave

    Thanks for the post.

    An update from my side:

    1) Our SQL Agents have not stopped since my last post (which is good but illustrates that this seems to have mysteriously "gone away" from our side)

    2) I hadn't noticed heavy locking (it may have been there, but I wasn't looking for it; I was alerting only on blocking activity)

    3) We have a couple of Windows/SQL servers now with WMI problems. No issues with the SQL agents but interesting that it started at about the same time as the WMI issues (on different servers).

    Oh well...

    Please keep us all updated with how you go.

    -J

  • I'm curious whether your WMI problems are the same as ours. We have noticed, only on our 64-bit servers, that WMI periodically appears to stop working. We have Spotlight On Windows, by Quest, running all of the time, which is basically using Perfmon data. On occasion we can't retrieve performance information on some of the 64-bit servers. We've found that Perfmon also does not work when Spotlight has issues. The solution is either to reboot the database server or, sometimes, to restart WMI.

  • Hi

    We also found that WMI stopped responding, but on Windows 2000 and 32-bit.

    There is a utility to check if WMI is problematic (let me know if you want it and I'll dig it out).

    The fix worked for us and involved the following few steps:

    http://msmvps.com/blogs/lduncan/articles/20217.aspx

    I recommend restarting the server after the fix. We found a few old SPIDs hanging around from the old, corrupt WMI. A SQL restart would clear them, but I'd recommend freshening up the server with a full restart if you can afford the outage.

    We don't use Spotlight on this particular server; we use Idera diagnostic manager instead.

    Regards

    J

