Windows patching DBA list

  • Hi,

    When the infrastructure team patches windows server and reboots, I would like to have a solid checklist of things which should be done to say 'sql server and databases are running fine'. I usually connect via ssms and connect to multiple instances which were on the patched windows server and run jobs and confirm databases are doing fine, check services are running. However, I know this is not enough so I was wondering is there any checklist or tips which a DBA should follow after infrastructure team patches windows servers.

    Secondly, team repairs sql server because of the SSIS jobs failing after windows patching. Now surprisingly after SQL repair the jobs do notfail. Possible bug who knows. please advise and guide

    Thank you!

     

     

  • For the first one, you ought to have a list of checks, and automate these. These are smoke tests, and would be what you'd want to check each time you update your database for an application. In terms of does SQL Server work, I would assume you have some monitoring in place of production instances. This should be enough. I've never done more than that here. Windows updates shouldn't affect SQL Server itself. Might alter the network stack or IO, but not the way the database works.

     

    For repairs, not sure what you mean.

  • Thank you for your advise. Could you please elaborate on Smoke tests.

    Yes, we do have SQLcerntry one client for Prod instance ONLY so that does show CPU usage etc.

     

  • Smoke Tests - http://softwaretestingfundamentals.com/smoke-testing/

    The idea is that you decide what things need to be working for your application to be stable. This might be

    • The service is running
    • Any user can connect to the SQL Server
    • A specific user can connect to a specific database
    • A user can connect and run some query that is important
    • Some load script can run to warm cache
    • Anything else

    You decide what's important for your app to run and script that out. Run this with error handling each time to system reboots or a deployment completes. You would do this to ensure that things are working correctly.

    You won't know everything you need all at once, so script this to be easily enhanced over time. If this is just a successful connection to SQL Server, start there. As you need to check if your system works, add checks that can run as a group.

     

  • sizal0234 wrote:

    When the infrastructure team patches windows server and reboots, I would like to have a solid checklist of things which should be done to say 'sql server and databases are running fine'. I usually connect via ssms and connect to multiple instances which were on the patched windows server and run jobs and confirm databases are doing fine, check services are running. However, I know this is not enough so I was wondering is there any checklist or tips which a DBA should follow after infrastructure team patches windows servers. 

    Steve's observations about "Smoke Testing", which is kind of what you're already doing, are spot on.  However, they are just "Smoke Tests".  Unless you do "Full Regession Testing", which may require days of downtime, you'll never catch it all.  We've had a couple of things (usually having to do with 3rd party software) fail consistently after Windows Updates.  Fortunately, we've been lucky enough to be able to identify which patch is causing the problem and just roll that patch back.  Of course, we contact the vendor to tell them what they need to patch but that sometimes takes months and so, when it happens, we add the required rollback to the list of things to do with Windows and SQL Server updates until the vendor can make a patch.

    Sometimes, it's MS we have to tell because they're not perfect, either.

    This almost never affects us in production.  We always wait for a week (to see if others run into a problem) after "Windows Tuesdays" and then do the updates to only the Dev boxes.  If nothing craps out after a week, we deploy to the QA/UAT boxes and wait another week.  If nothing craps out there, then and only then do we deploy to prod.

    sizal0234 wrote:

    Secondly, team repairs sql server because of the SSIS jobs failing after windows patching. Now surprisingly after SQL repair the jobs do notfail. Possible bug who knows. please advise and guide

    You wouldn't like my advice, which would be to stop using SSIS, like I did.  In fact, I avoided old DTS like the plague and have never used SSIS to do anything and, in a previous job, I was instrumental in replacing all the things they were using SSIS for because of complaints of performance and a raft of other problems 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thank you All..It helps and will plan and have custom list for my environment.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply