Multi instance server, randomly unconnectable, no errors anywhere...help

  • Hello

    Apologies in advance, not sure this is the right section, but as my terminology will most likely be poor, I'm thinking you'll cut me some slack, this being the newbie forum 🙂

    My problem is pretty straightforward: randomly (sometimes once a week, sometimes once a month) one SQL Server box (SQL2008) loses connections from all apps connecting to all the instances it holds. I cannot connect via SSMS either (apologies, I forget the error code, but it's the generic unable-to-connect-to-instance message: check the instance is up, no network issues, etc.), but if I log onto the server itself and open SSMS, I can connect fine. I was rebooting the server, but this was time consuming and I really didn't like doing it. So now, although the service shows no errors and a green light, I restart the SQL Server Browser and voilà, all back up and running.

    I've checked each instance's SQL logs, no errors; MS Event logs, no errors. I know SQL is supposed to consume all memory, but just in case, I capped each instance so the server always has 6GB RAM free, so that's def not the issue either.

    I'm at a loss as to what this can be...any pointers on a better way of monitoring or diagnosing this issue?

    Many thanks in advance.

    (p.s, I look after 8 SQL servers, no other server has this issue, 4 of them have the same OS and SQL version)

    Mark

  • In addition, I sometimes get the following on one/some instances

    Login failed for user DOMAIN\ServiceAcc. Reason: Token-based server access validation failed with an infrastructure error. Check previous errors.

    There are no other errors. I think this is simply a result of SQL Server Browser being unable to access, or allow access to, anything non-local.

    Thanks

    Mark

  • Are you running the browser as a domain account or as network service?

  • Local service

  • By default the browser runs under NetworkService, so I'm unsure whether changing it to another account may have adverse effects or not. One to investigate further.

  • Apologies, yes that's what I meant by local, the NT AUTHORITY\LOCAL SERVICE account.

    Can't see that being the problem though, given how intermittently the issue occurs. All my other servers are also using that account with no issue. Maybe I'm wrong, but I installed my first couple of servers using the SQL 2008 MS exam-ready book as a guide. If I remember rightly, it was fine to use this account?

    Thanks

  • It should run as NT AUTHORITY\NetworkService by default, not NT AUTHORITY\LocalService. The two accounts have slightly different permissions.

    Given the login token error you detailed, is there any fault with DNS or AD around the same time the problem happens?

  • Ok, that's odd as I'm pretty sure I left as default...

    No, nothing. And it's not always there and never on all instances, very weird. Although it only takes me seconds to fix, it affects a customer purchasing suite, so not great to find out by users not being able to connect.

    Is there anywhere else I can look for logs, other than where I've already looked? I've thought about running a trace/monitor, but if it takes a month to happen.. :s Bit lost as to what to do.
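One low-cost way to approach the "it might take a month" problem is a small poller that writes a line only when the server flips between reachable and unreachable, so the log stays tiny however long it runs. The sketch below is not from the thread: `watch` is a hypothetical helper, and `check` stands in for whatever connectivity test you choose (it just has to return true or false).

```python
import time
from datetime import datetime


def watch(check, interval=60.0, max_polls=None, sleep=time.sleep, log=print):
    """Poll check() and record only changes of state.

    check     -- hypothetical callable returning True (reachable) or False
    interval  -- seconds to wait between polls
    max_polls -- stop after this many polls (None means run until killed)
    Returns a list of (timestamp, state) transitions.
    """
    transitions = []
    last = None          # unknown before the first poll
    polls = 0
    while max_polls is None or polls < max_polls:
        try:
            state = bool(check())
        except Exception:
            # A check that blows up counts as "unreachable".
            state = False
        if state != last:
            stamp = datetime.now().isoformat(timespec="seconds")
            transitions.append((stamp, state))
            log(f"{stamp} {'UP' if state else 'DOWN'}")
            last = state
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(interval)
    return transitions
```

Run from a client machine rather than on the server itself, the timestamp of each DOWN line gives an exact time to line up against the SQL, Windows, DNS and AD logs.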

  • Can you connect using IP address,port instead of instance name? That would tell you whether the problem is directly related to the browser, or whether the browser lost connectivity to the instance. It's a good idea to leave enough memory for the server itself; if the problem is some other process on the server, it could be very hard to debug.
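The distinction in this suggestion can be checked from a client machine. A name such as `172.20.2.200\instanceA` still relies on the SQL Server Browser (UDP 1434) to look up the instance's port, whereas `IP,port` goes straight to the instance over TCP. The sketch below is not from the thread, and the function names are hypothetical. The request and reply layout follows the SQL Server Resolution Protocol as I understand it (request byte 0x04 plus a null-terminated instance name; reply byte 0x05, a 2-byte little-endian length, then a `key;value;` string), so treat those details as assumptions to verify.

```python
import socket

SSRP_PORT = 1434          # UDP port the SQL Server Browser listens on
CLNT_UCAST_INST = 0x04    # assumed SSRP request type: describe one instance
SVR_RESP = 0x05           # assumed SSRP response type


def parse_browser_reply(data: bytes) -> dict:
    """Turn a Browser reply into a dict, e.g. {'InstanceName': ..., 'tcp': ...}."""
    if len(data) < 3 or data[0] != SVR_RESP:
        raise ValueError("not an SSRP response")
    size = int.from_bytes(data[1:3], "little")
    fields = data[3:3 + size].decode("ascii", errors="replace").split(";")
    # The payload is a flat key;value;key;value;... list ending in ';;'.
    return {k: v for k, v in zip(fields[0::2], fields[1::2]) if k}


def ask_browser(host: str, instance: str, timeout: float = 3.0):
    """Ask the Browser about one instance; None means it did not answer."""
    request = bytes([CLNT_UCAST_INST]) + instance.encode("ascii") + b"\x00"
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(request, (host, SSRP_PORT))
            data, _ = sock.recvfrom(4096)
        except OSError:          # covers timeouts and unreachable hosts
            return None
    return parse_browser_reply(data)


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """True if a TCP connection to host:port succeeds (bypasses the Browser)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

During an outage, `ask_browser(...)` returning `None` while `tcp_port_open(...)` on the instance's port returns `True` would point at the Browser rather than at the instance itself.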

  • I'm going to show my amateurness now...can you connect to an instance using IP? For example 172.20.2.200\instanceA? If so, no, but good idea, I'll happily give that a go.

    Ok thank you, thought as much...nightmare.com

  • Ignore above^ just tested IP/instance and it worked fine. I'll give that a shot.

    If I can, do you think it's some DNS issue? Just seems too random and intermittent to be DNS..

  • Apologies, ignore the above^ I just tested and IP\instance works fine.

    If I can connect the next time, do you suspect DNS? I'd be surprised myself tbh.

  • Oops, what a noob, thought my post was chewed up...didn't see page 2.

  • P.S. Just checked 4 servers, all have SQL Server Browser running as NT AUTHORITY\LOCAL SERVICE and I'm not changing them. It's def the default.
