After sql cluster node started - 4hrs later sql DBI service terminated

Spud312R

SSC-Addicted

Points: 462
July 25, 2014 at 7:22 am

#310309

During an MS patch maintenance window on a 2-node SQL cluster, one node was shut down, patched and rebooted, then the second node was shut down, patched and booted.
During startup all looked fine, then 4 hours later two SQL services crashed (see below).
The SQL services are set to start manually, by the cluster manager.
Any ideas why it would take so long for the services to crash?
Could it have been a SAN issue affecting the SQL error logs?
DB2
6.02.22 system restart
6.09.10 system started
6.06.19 error cluster service did not shutdown properly
after receiving a preshutdown control id 7043
6.10.06 cluster service started
DB3
6.06.45 failover cluster db2 removed from cluster event id 1135
6.29.07 system shutdown
6.32.16 cluster service started
6.32.20 sql reporting service started
6:32:44 system started
.
.
10.39.24 sql DBI service terminated
with service-specific error %%17058 id 7024
Could not open error log file 'Z:\MSSQL10_50.DBI\MSSQL\Log\ERRORLOG'
10.39.39 sql DBI_RS service terminated
with service-specific error %%17058
Could not open error log file 'E:\MSSQL10_50.DBI_RS\MSSQL\Log\ERRORLOG'
10.39.48 sql agent (DBI) failed due to sql server dbi failure
10.39.57 sql agent (DBI_RS) failed due to sql server dbi_rs failure
10.47.08 sql reporting services stopped
10.47.12 sql reporting services started
11:14:41 rebooted db3 again
11:17:27 system up all fine
SQL Startup params:-
DBI_RS on 3p
-dE:\MSSQL10_50.DBI_RS\MSSQL\DATA\master.mdf;
-eE:\MSSQL10_50.DBI_RS\MSSQL\Log\ERRORLOG;
-lE:\MSSQL10_50.DBI_RS\MSSQL\DATA\mastlog.ldf
DBI on 2p
-dZ:\MSSQL10_50.DBI\MSSQL\DATA\master.mdf;
-eZ:\MSSQL10_50.DBI\MSSQL\Log\ERRORLOG;
-lZ:\MSSQL10_50.DBI\MSSQL\DATA\mastlog.ldf
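As a rough illustration of how the startup parameters above relate to the 17058 errors, the sketch below splits a parameter string into its `-d` (master data file), `-e` (error log) and `-l` (master log file) paths and reports which drive the error log lives on. The helper name `parse_startup_params` and the dictionary keys are made up for this example; the parameter string is copied from the DBI instance in the post.

```python
import ntpath

def parse_startup_params(raw: str) -> dict:
    """Split a semicolon-separated startup string into its -d, -e and -l paths.

    Hypothetical helper for illustration only; not part of any SQL Server tooling.
    """
    names = {"d": "master_data", "e": "error_log", "l": "master_log"}
    result = {}
    for part in raw.split(";"):
        part = part.strip()
        # Each option is a dash, one letter, then the path with no separator.
        if len(part) > 2 and part[0] == "-" and part[1] in names:
            result[names[part[1]]] = part[2:]
    return result

# Startup parameters for the DBI instance, as quoted in the post.
DBI_PARAMS = (
    r"-dZ:\MSSQL10_50.DBI\MSSQL\DATA\master.mdf;"
    r"-eZ:\MSSQL10_50.DBI\MSSQL\Log\ERRORLOG;"
    r"-lZ:\MSSQL10_50.DBI\MSSQL\DATA\mastlog.ldf"
)

paths = parse_startup_params(DBI_PARAMS)
# ntpath handles Windows-style paths regardless of the OS running the script.
drive, _ = ntpath.splitdrive(paths["error_log"])
print("error log:", paths["error_log"])
print("drive that must be online for the instance to start:", drive)
```

For DBI the error log sits on Z: and for DBI_RS on E:, so if either disk was unavailable to the node when the service tried to start, a "Could not open error log file" failure would be a plausible outcome.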


arnipetursson SSCertifiable Points: 6557 · Answer 1

Do you have a job that rolls the error log at 10:39?

Did something happen to the permissions on the log folder?
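One way to test the permissions question is to try creating a file in the log folder. The sketch below is only an assumption about how such a check could look: the function name `check_log_folder` is hypothetical, and the result is meaningful only if the script runs under the same account as the SQL Server service.

```python
import os
import tempfile

def check_log_folder(folder: str) -> str:
    """Report whether the current account can create a file in `folder`.

    Hypothetical helper for illustration; run it as the SQL Server
    service account, otherwise it tests the wrong permissions.
    """
    # A missing folder also covers the case where the clustered disk
    # is not online on this node.
    if not os.path.isdir(folder):
        return "missing"
    try:
        # The temporary file is deleted as soon as the block exits.
        with tempfile.TemporaryFile(dir=folder):
            pass
    except OSError as exc:
        return f"denied: {exc}"
    return "ok"

if __name__ == "__main__":
    # Log folders taken from the startup parameters in the question.
    for folder in (r"Z:\MSSQL10_50.DBI\MSSQL\Log",
                   r"E:\MSSQL10_50.DBI_RS\MSSQL\Log"):
        print(folder, "->", check_log_folder(folder))
```

A result of "missing" points toward a disk or SAN problem, while "denied" points toward changed permissions on the folder.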

Spud312R SSC-Addicted Points: 462 · Answer 2

arnipetursson (7/25/2014)
Do you have a job that rolls the error log at 10:39?
Did something happen to the permissions on the log folder?

"Did something happen to the permissions on the log folder?"

I heard that the permissions had changed somehow, but after the second reboot everything was OK.

Rudyx - the Doctor SSC-Forever Points: 43695 · Answer 3

Hmmm ... it sounds like you have a 2-node active/active cluster here.

If these were just OS patches and not SQL Server related, then shutting down SQL on a node, applying the OS patches and rebooting, one node at a time, should have worked just fine.

Did you find any other errors in the Windows logs or the cluster logs?

Regards
Rudy Komacsar
Senior Database Administrator
"Ave Caesar! - Morituri te salutamus."