SQL crash - debug help needed

  • [font="Courier New"]tl;dr: How to debug a crash of the SQL service?

    We are experiencing a SQL Server crash that now seems to recur on a regular schedule. It has crashed twice in the last week, and indications are that it may be leading up to a third crash.

    The SQL Server instance is SQL 2008 SP1 64-bit Standard Edition (10.0.2573.0), running on a VMware virtual server. It serves as the db server for a SharePoint 2010 QA environment.

    Windows Event Logs:

    10/30/2011 10:06PM The MSSQLSERVER service terminated unexpectedly.

    11/02/2011 11:44AM The MSSQLSERVER service terminated unexpectedly.

    In the SQL logs folder, there is an EXCEPTION.log file dated 10/30/2011 10:05PM with the contents:

    10/30/11 22:05:33 spid 0 Exception 0xc0000005 EXCEPTION_ACCESS_VIOLATION writing address 000000000086A800 at 0x00000000009E6BCA

    10/30/11 22:05:43 spid 0 Exception 0xc0000005 EXCEPTION_ACCESS_VIOLATION writing address 0000000001957B70 at 0x00000000009E6BCA

    We did not get one for the second crash.

    In the SQL logs folder, there are SQLDump000x.log files (1-3) and SQLDump000x.mdmp (1-3) dated 10/30/2011 10:05PM relating to the first crash.

    We only have one SQLDump0004.log and mdmp file for the second crash, but also SQLDUMPER_ERRORLOG.log (none such for the first crash, maybe overwritten?).

    There is also a log_1052.trc file dated approximately 6 hours preceding the second crash, and no trace files preceding the first crash. As you can see in the file list below (in date order), trace files are still being written, which leads me to believe that they are counting down to another crash. I cannot open any of the trace files ("Access is Denied"), as if they are in use.

    Where do I start?

    Directory of C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\Log

    10/13/2010 05:14 PM 630 FDLAUNCHERRORLOG.6

    10/14/2010 11:18 AM 630 FDLAUNCHERRORLOG.5

    10/18/2010 01:41 PM 630 FDLAUNCHERRORLOG.4

    10/18/2010 02:47 PM 630 FDLAUNCHERRORLOG.3

    10/19/2010 09:59 AM 630 FDLAUNCHERRORLOG.2

    10/19/2010 09:59 AM 514 FDLAUNCHERRORLOG.1

    10/20/2010 10:26 AM 630 FDLAUNCHERRORLOG

    10/29/2010 09:00 AM 1,572,864 WSS_SicpaPortalQa_Content_01_log.LDF

    02/14/2011 09:53 AM 2,832 SQLAGENT.9

    02/18/2011 03:10 AM 5,950 SQLAGENT.8

    03/18/2011 03:17 AM 5,950 SQLAGENT.7

    06/23/2011 02:13 AM 2,628 SQLAGENT.6

    06/23/2011 02:43 AM 83,404 ERRORLOG.6

    06/23/2011 02:43 AM 4,392 SQLAGENT.5

    07/21/2011 02:08 AM 5,950 SQLAGENT.4

    07/21/2011 02:08 AM 2,579,750 ERRORLOG.5

    09/21/2011 02:10 AM 5,950 SQLAGENT.3

    09/21/2011 02:10 AM 3,223,004 ERRORLOG.4

    10/14/2011 02:08 AM 5,950 SQLAGENT.2

    10/14/2011 02:08 AM 1,572,160 ERRORLOG.3

    10/30/2011 10:05 PM 44,096 SQLDump0001.txt

    10/30/2011 10:05 PM 518 exception.log

    10/30/2011 10:05 PM 65,536 SQLDump0001.log

    10/30/2011 10:05 PM 7,406,619 SQLDump0001.mdmp

    10/30/2011 10:05 PM 44,786 SQLDump0002.txt

    10/30/2011 10:05 PM 65,536 SQLDump0002.log

    10/30/2011 10:05 PM 5,232,805 SQLDump0002.mdmp

    10/30/2011 10:05 PM 46,856 SQLDump0003.txt

    10/30/2011 10:05 PM 5,242,944 SQLDump0003.mdmp

    10/30/2011 10:05 PM 65,536 SQLDump0003.log

    10/30/2011 10:05 PM 849,424 ERRORLOG.2

    10/30/2011 10:06 PM 4,542 SQLAGENT.1

    11/02/2011 06:07 AM 20,971,520 log_1054.trc

    11/02/2011 11:44 AM 46,856 SQLDump0004.txt

    11/02/2011 11:44 AM 4,417,119 SQLDump0004.mdmp

    11/02/2011 11:44 AM 65,536 SQLDump0004.log

    11/02/2011 11:44 AM 27,286 SQLDUMPER_ERRORLOG.log

    11/02/2011 11:44 AM 210,278 ERRORLOG.1

    11/02/2011 11:44 AM 18,696,704 log_1055.trc

    11/02/2011 11:44 AM 4,542 SQLAGENT.OUT

    11/02/2011 10:05 PM 20,971,520 log_1056.trc

    11/03/2011 02:47 AM 153,842 ERRORLOG

    11/03/2011 03:26 AM 20,971,520 log_1057.trc

    11/03/2011 03:26 AM 0 log_1058.trc

    11/03/2011 08:48 AM <DIR> .

    11/03/2011 08:48 AM <DIR> ..

    45 File(s) 114,680,999 bytes

    2 Dir(s) 14,907,080,704 bytes free

    [/font]

  • From my point of view, this looks like a disk issue that is also running into a network problem.

    Is it possible for you to move the system database files to another disk? Also, don't forget to take a full backup of your crucial databases, as you never know when SQL Server will crash again.

    ----------
    Ashish

  • Call Microsoft customer support. They're the ones that have the tools to debug crash dumps. The Exception Access Violation has nothing whatsoever to do with disk or network; it's an error that Windows throws when a process attempts to read or write memory that it is not permitted to access (another process's memory, kernel memory, etc).
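
    The two EXCEPTION.log lines quoted in the first post can be pulled apart with a short script to see which fields they carry. This is only a rough sketch: the regular expression and the field names are assumptions based on those two lines, not a documented format.

```python
import re

# Layout inferred from the two lines quoted in the thread (an assumption,
# not an official description of EXCEPTION.log).
LINE_RE = re.compile(
    r"(?P<date>\d{2}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"spid (?P<spid>\d+) Exception (?P<code>0x[0-9a-fA-F]+) "
    r"(?P<name>\w+) (?P<access>reading|writing) address "
    r"(?P<target>[0-9A-Fa-f]+) at (?P<instruction>0x[0-9A-Fa-f]+)"
)


def parse_exception_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert both hexadecimal addresses to integers so they can be compared.
    fields["target"] = int(fields["target"], 16)
    fields["instruction"] = int(fields["instruction"], 16)
    return fields


# The two entries exactly as posted.
LINES = [
    "10/30/11 22:05:33 spid 0 Exception 0xc0000005 EXCEPTION_ACCESS_VIOLATION "
    "writing address 000000000086A800 at 0x00000000009E6BCA",
    "10/30/11 22:05:43 spid 0 Exception 0xc0000005 EXCEPTION_ACCESS_VIOLATION "
    "writing address 0000000001957B70 at 0x00000000009E6BCA",
]

entries = [parse_exception_line(line) for line in LINES]
same_instruction = len({entry["instruction"] for entry in entries}) == 1

for entry in entries:
    print(entry["time"], entry["access"], hex(entry["target"]),
          "at", hex(entry["instruction"]))
print("same faulting instruction address:", same_instruction)
```

    Both entries are write violations raised at the same instruction address against different target addresses. That narrows down where the fault was raised, not why; the dumps are still needed for that.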

    The trace files belong to the default trace, which is always running, with a maximum of 5 files of 20MB each. Access denied could mean that your account doesn't have permission, or it could be that SQL Server has them locked open (they are the files for an active trace).
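
    To make the rollover concrete, here is a small Python sketch of the "5 files of 20MB" limit. The function name is made up for illustration, and it assumes that each rollover bumps the file number by one and drops the oldest file.

```python
MAX_TRACE_FILES = 5                      # limit stated above
MAX_TRACE_FILE_BYTES = 20 * 1024 * 1024  # 20MB expressed in bytes


def retained_trace_files(latest_number, max_files=MAX_TRACE_FILES):
    """List the trace files expected on disk, oldest first.

    Assumes (for illustration only) that file numbers increase by one per
    rollover and that the oldest file is deleted once the limit is reached.
    """
    oldest = max(1, latest_number - max_files + 1)
    return [f"log_{number}.trc" for number in range(oldest, latest_number + 1)]


print(MAX_TRACE_FILE_BYTES)
print(retained_trace_files(1058))
```

    With log_1058.trc as the newest file, the sketch returns log_1054.trc through log_1058.trc, which are the five trace files in the directory listing. The full files there are 20,971,520 bytes each, which is exactly 20MB.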

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • yeah i know, but i was trying to avoid going through MS support. :crazy:

    funny, i have never noticed those trc files before. what are they doing and why is there a 5 file limit?

    i'll try to update this thread once we get to the bottom of the problem.

  • OLDCHAPPY (11/4/2011)


    yeah i know, but i was trying to avoid going through MS support. :crazy:

    Then download the Windows debugger and open up the dump files (the .mdmp files). I hope you're up to speed with Assembler.

    The point is that CSS has the tools to investigate this kind of thing: they have source code access and access to the private symbols, and we don't. Also, this is generally the result of a bug, either in SQL Server, in something that's running in-process with SQL Server, or in something else on the server.

    funny, i have never noticed those trc files before. what are they doing and why is there a 5 file limit?

    Books Online - Default Trace

    p.s. You should patch that instance to the latest service pack.

    Gail Shaw
  • OLDCHAPPY (11/3/2011)


    Windows Event Logs:

    10/30/2011 10:06PM The MSSQLSERVER service terminated unexpectedly.

    11/02/2011 11:44AM The MSSQLSERVER service terminated unexpectedly.

    In the SQL logs folder, there is an EXCEPTION.log file dated 10/30/2011 10:05PM with the contents:

    10/30/11 22:05:33 spid 0 Exception 0xc0000005 EXCEPTION_ACCESS_VIOLATION writing address 000000000086A800 at 0x00000000009E6BCA

    10/30/11 22:05:43 spid 0 Exception 0xc0000005 EXCEPTION_ACCESS_VIOLATION writing address 0000000001957B70 at 0x00000000009E6BCA

    I could be wrong but this looks like a bad memory chip.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.



  • Jeff Moden (11/4/2011)


    I could be wrong but this looks like a bad memory chip.

    Bad memory is one possibility. Others include a bug in a kernel driver (it could be just about any driver on the server), a memory scribbler, a buggy linked server driver, or a bug in SQL Server. It could even be a fault in the memory controller.

    Gail Shaw
