Rebooting for a Reason

  • Comments posted to this topic are about the item Rebooting for a Reason

  • I have always believed that a periodic reboot is a good thing. It can handle hidden issues that have existed for a while even if they have not YET caused a problem. At my last employer it was almost verboten to reboot, especially during the week, but also even on weekends. Such things were supposed to be scheduled with the support people ahead of time. It always seemed to me that even scheduled tasks could be made to wait and retry, or even wait until the next cycle, assuming of course they are correctly and safely designed. Even batch and system tasks should be able to handle scheduled reboots at a specified time.

    The cleanup aspect sort of reminds me of how I handle even manual tasks around my home. During projects of any significant duration, I usually pause at various points to do some cleanup and reorganization, storing tools and supplies that are no longer needed and are in the way. If I get to a point where a wait is required for some event, the cleanup helps keep things organized, and I expect that known reboots encourage better analysis, planning, and organizing of events.

    But of course, user support folks always knew better...even if they were at home on the weekend.

    We always got the feedback that we should not take the chance of interrupting interactive users, but I argued that a couple-minute pause for a reboot is much less disruptive than something like coffee breaks or lunch hours. But to no avail...

    At the moment, I'm waiting (have been waiting since yesterday afternoon) to log into my credit card account. I'm getting a message that they are 'experiencing difficulties'. So I wait, big hairy deal.

     

    • This reply was modified 2 years, 4 months ago by  skeleton567.

    Rick
    Disaster Recovery = Backup ( Backup ( Your Backup ) )

  • I tend to agree. I've managed SQL Servers that ran for a year without rebooting in the past, but these days, I'd want to at least ensure we had some patching done a couple times a year, if not at least quarterly. Not sure about monthly, but I do know that I don't want to have too many patches stack up, either. Then I have more to test, and more to debug if things go south.

    I tend to reboot my desktop rarely, usually less than every other month and it runs fine. I do, however, get patches applied to things regularly and reboot if needed.

  • This is a very interesting article! So many ideas were brought up here, but I want to focus on one from the blog post you linked to, "The unreasonable effectiveness of turning computers off and on again". If I understand what Keunwoo Lee (the blogger) said, then I've had problems with forensic analysis vs. repair, too. Over my career, whenever anything went seriously bad, there has very rarely been any attempt at forensic analysis of why the condition occurred that brought about the crash, hang, etc. Instead, the action has been to restart the system ASAP. For most system failures where I wasn't involved in analyzing what happened, I don't know whether there was ever any post-event analysis of the root cause of the problem. My guess is that either no attempt was made to understand what happened or people's opinions were used to declare what went wrong. In fairness, it is likely that even if people just expressed their opinions as to the root cause, they were often correct. My point is that, if analysis like this was done, then the people who did it were the furthest away from the problem, so it was an educated guess at best.

    Anyway, it's my experience that most companies/organizations have a greater urgency to get things going again as fast as possible, rather than try to identify the root cause.

    But in fairness to companies/organizations root cause analysis can take a long time. How to strike a balance between getting up and running again vs. figuring out what failed in the first place, to me is more art than science.

    Kindest Regards, Rod Connect with me on LinkedIn.

  • Rod at work wrote:

    But in fairness to companies/organizations root cause analysis can take a long time. How to strike a balance between getting up and running again vs. figuring out what failed in the first place, to me is more art than science.

    Tend to agree. I usually want to try and gather some info for an RCA, but only for a limited time. Need to get things working, and if it's possible this is a one-off situation, an RCA might not be worth the effort.

    I think the talented people strike this balance better than others.

  • Rod, my experience too. Damn the torpedoes, full speed ahead (and hope it doesn't happen again).

    Rick
    Disaster Recovery = Backup ( Backup ( Your Backup ) )

  • A timely article.  I woke up this morning to discover my home office router had locked up overnight.  Turning it off and then on again restored functionality (seems I have to do this every 9-12 months).  Funny that it had never really sunk in that what I was doing with a power cycle was returning it to a known state.

  • skeleton567 wrote:

    Rod, my experience too. Damn the torpedoes, full speed ahead (and hope it doesn't happen again).

    HAHAHA! Yep, I can totally relate. And more than once I've hoped the disaster didn't come back. Sometimes it does, but I've been fortunate enough not to have it happen often.

    Kindest Regards, Rod Connect with me on LinkedIn.

  • I think the ubiquitous advice of rebooting has mostly to do with the economics of customer support. Attempting to provide a remote customer with step-by-step instructions on how to apply an update or work around a specific technical issue is time-consuming and also unreliable. If rebooting has been proven to resolve the issue in the past, then why not just provide that simple advice, so you can then move on to the next customer in the queue?

    I actually have a timer on my home WiFi router's power supply, so it reboots itself each evening. So, if I'm out of town and the app for my security system reports that it's lost internet connection, then I at least know it will be resolved without my intervention within the next few hours.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

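As a footnote to the first reply's point that scheduled tasks "could be made to wait and retry" around a planned reboot, here is a minimal Python sketch of that idea. Nothing in it comes from the thread itself: the function names (`in_reboot_window`, `run_with_retry`), the attempt count, and the wait time are all hypothetical choices made for illustration, and a real job scheduler would likely offer its own retry settings.

```python
import time
from datetime import datetime, time as dtime


def in_reboot_window(now, start, end):
    """Return True if `now` falls inside the daily reboot window [start, end).

    `start` and `end` are datetime.time values (hypothetical window bounds).
    """
    t = now.time()
    if start <= end:
        return start <= t < end
    # The window wraps past midnight, e.g. 23:30 to 00:30.
    return t >= start or t < end


def run_with_retry(task, attempts=3, wait_seconds=60.0, sleep=time.sleep):
    """Call `task`; if it raises, wait and try again.

    The last failure is re-raised once `attempts` tries are used up.
    `sleep` is injectable so the wait can be skipped in tests.
    """
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == attempts:
                raise
            sleep(wait_seconds)
```

A task could check `in_reboot_window(datetime.now(), dtime(2, 0), dtime(2, 30))` before starting and simply defer to its next cycle if the check is true, while `run_with_retry` covers the case where the server goes down partway through. This only helps if the task itself is safe to run more than once, which is the "correctly and safely designed" caveat from the post.
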