Blog Post

Powershell, when were my servers rebooted?

,

One of my servers wasn’t rebooted when it should have been and I never realized it until after the outage was over.  The big deal wasn’t that it didn’t reboot, it was that I never realized it.  These are the things that bug me, so I used it as an excuse to learn a bit of PowerShell.

A little backstory to give the details a better chance of making sense to everyone…  Instead of having a couple overworked sys admins doing windows updates for every server, we each take care of our own with a sys admin checking that all updates were applied.  It’s the first I’ve heard of it being done this way, but I like it.

To keep things consistent, the database team has a policy of rebooting all database servers regardless of WSUS requiring it.  One month most of the servers required a reboot and bounced automatically, except for one that didn’t require a reboot.  After the first 10 or 15 rebooted, I got lazy and started just verifying the servers were up, SQL was responding, all updates were applied, and there was no pending reboot.

It wasn’t until I was reading the logs the next business day I realized that that server was missing all of the chatter from SQL Services starting up.  Like I said, the biggest issue was me being frustrated that it snuck by me, and I didn’t want that to happen again.  I wanted my solution to be independent of SQL Server because not all of the computers I do updates on have a SQL engine.

Wanting an excuse to learn PowerShell better, I talked to our Exchange guy (they know their PowerShell) and got a basic script to determine a computer’s boot time given its name.  I expanded that to be an entire AD group, then a list of AD groups, and finally using wildcards to find AD groups.  Wildcards work best for me because I’m reorganizing my SUS groups and all of mine have “SQL” in the groups name, even for the app servers I’m responsible for.

That resulted in the following code.  Now I’m absolutely certain that a PowerShell or AD guru can poke a ton of holes in my logic, but that’s what we have gurus for!  To be fair, I’d do the same to their T-SQL.  If any of those gurus are out there reading this DBA’s blog, the comment section below is how you fix my code and stop me from spreading bad code like a virus.

Import-Module ActiveDirectory
#either method works for getting a list of groups.  You can type in all of your groups or make a query to find them all.  I prefer the query if possible.
#$groups = "Test SQL Servers", "Prod SQL Servers 1", "Prod SQL Servers 2", "SQL Cluster Servers 1", "SQL Cluster Servers 2"
$groups = Get-ADGroup -Filter {name -like "*SQL *"} | where-object {$_.distinguishedname -like "*OU=SUS Group*"}
ForEach ($group in $groups) {
    $computerlist = Get-ADGroupMember $group -Recursive | SELECT name 
    #If there were any computers in the list, do this.  It skips empty groups.
    if ($computerlist) {
        $computerlistcount = $computerlist.Count
        #It returns an object instead of an array if there was only one, so count would be null
        if (!$computerlistcount) {
            $computerlistcount = 1
        }
        Write-Host ""
        Write-Host "$computerlistcount server(s) in" $group.name
    }
    ForEach ($computer in $computerlist) {
        try {
            Get-WmiObject win32_operatingsystem -ComputerName $computer.name -ErrorAction Stop | select csname, @{LABEL='LastBootUpTime';EXPRESSION={$_.ConverttoDateTime($_.lastbootuptime)}}
        }
        catch {
            #This logic is lacking.  I don't care what error you got, just say you couldn't connect and we'll move on.
            #For this script, it probably means the server is in the middle of a reboot
            Write-Host "Could not connect to" $computer.name
            Continue
        }
    }
}
 

There are relatively equivalent things you can do in SQL Server, but that’s assuming you’re running a SQL Server engine on every server you care about.  Even from the viewpoint of a DBA running the SUS updates on my own servers, this doesn’t hold up.  I still care about my monitoring application server and several others.

That being said, here’s what SQL can do for you:

SELECT create_date FROM sys.databases WHERE database_id = 2
exec xp_cmdshell 'wmic os get lastbootuptime'

There’s also some more bad news with these beyond requiring the SQL Server engine.

The create date of TempDB is useful, but it’s created on startup of SQL Services, not the computer.  This means if you restart SQL Services then you get a new date, even if you specifically state that you want to restart every computer during your SUS outage whether it’s required or not.

The WMIC call gets around that and tells you when the server came up, but it’s relying on xp_cmdshell.  This extended procedure has a bad rep for being a security loophole, even though by default to run the thing you need enough permissions to be able to turn it on.  Because of that, it’s turned off in many environments and flipping it on for a second can be frowned upon.

Original post (opens in new tab)
View comments in original post (opens in new tab)

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating