Operational Acceptance Testing

  • As production DBAs supporting new projects (particularly the database back-end subsystem), we are often asked to draw up an Operational Acceptance Testing (OAT) plan. Do folks have standard OAT plans or checklists of their own? And is there any mileage in setting up a testing forum or workspace to compare our test plans and debate what SQL Server system or operations acceptance test standards should include or exclude?

    To get the ball rolling I include a task list below:

    Tasks

    Backup testing:

    * Check that backups are being created as specified

    * Check that the maintenance plans creating backups are scripted and documented

    * Check that the scripts written to push backups from the database server to the backup server work [1]

    * Test restoring DB to point in time
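
    The "backups are being created as specified" check above can be partly automated. What follows is only a minimal sketch, not anyone's standard tooling: the function name, the sample databases and the 24-hour threshold are all hypothetical. It assumes you have already pulled the latest backup finish time per database from somewhere (on SQL Server that would typically be the backup_finish_date column of msdb.dbo.backupset) into a plain dictionary.

```python
from datetime import datetime, timedelta


def stale_backups(latest_backup, expected, now, max_age):
    """Return the expected databases whose latest backup is missing or too old.

    latest_backup: dict of database name -> datetime the last backup finished
    expected:      names of the databases the backup spec says must be covered
    now:           reference time for the check
    max_age:       timedelta, the oldest acceptable backup
    """
    stale = []
    for db in expected:
        finished = latest_backup.get(db)
        # A database with no recorded backup fails the check outright.
        if finished is None or now - finished > max_age:
            stale.append(db)
    return sorted(stale)


# Hypothetical sample data, for illustration only.
now = datetime(2024, 1, 2, 6, 0)
latest = {
    "Sales": datetime(2024, 1, 2, 1, 0),   # 5 hours old: within policy
    "HR": datetime(2023, 12, 30, 1, 0),    # over 3 days old: stale
}
print(stale_backups(latest, ["Sales", "HR", "Audit"], now, timedelta(hours=24)))
# prints ['Audit', 'HR']
```

    Driving the check from the list of expected databases, and not from whatever backup history happens to exist, means a database that was never backed up shows up as a failure too.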

    Log Shipping:

    * Check log shipping setup scripts (& associated blueprint)

    * Test log shipping failure and associated System Management[2] alerts

    * Test stopping and restarting log shipping

    Log Shipping Failover testing:

    * Test failover processes

    * Test re-prime processes

    * Test System Management[2] task failover

    Clustering:

    * Check cluster install blueprint[3]

    * Test cluster heartbeat failure recovery

    * Test quorum drive failure recovery

    * Test mount point failure

    * Test node failure

    * Test cluster failover alerting

    * Test cluster failover effect on client application(s)

    DR testing:

    * Check scripted database rebuild (& associated blueprint)

    * Check data restore from offline backup

    * System test DR'd system

    DB Performance OAT: [4]

    * Record table hits, read/write ratios

    * Record average, peak and trough TPS

    * Profile the worst-performing and most frequent SQL
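
    As a rough illustration of the figures listed above, here is a small sketch that reduces a run of throughput samples to average, peak and trough TPS and works out a read/write ratio. The function names and sample numbers are hypothetical; it assumes the samples were already collected at a fixed interval by whatever monitoring tool you use.

```python
def tps_summary(samples):
    """Return (average, peak, trough) for a non-empty list of TPS samples."""
    if not samples:
        raise ValueError("need at least one TPS sample")
    return sum(samples) / len(samples), max(samples), min(samples)


def read_write_ratio(reads, writes):
    """Return reads per write, or None when there were no writes."""
    if writes == 0:
        return None
    return reads / writes


# Hypothetical samples, one per measurement interval.
samples = [120, 340, 80, 260]
average, peak, trough = tps_summary(samples)
print(average, peak, trough)          # prints 200.0 340 80
print(read_write_ratio(9000, 3000))   # prints 3.0
```

    Recording these numbers against production-sized data volumes gives you a baseline to compare with after go-live.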

    Standard Monitoring and Maintenance testing:

    * Check that the standard monitoring alerts, maintenance jobs and utilities are installed and are the latest certified versions

    * Test standard alerts coming through to corporate 24/7 alerting framework

    Notes:

    [1] Only appropriate where tape backups are made from a server other than the database server

    [2] Whichever automated alerting and system management framework is used in your shop, be it Microsoft's SMS, IBM's Tivoli, or whatever

    [3] Roll on Win2003 clustering where we can script the install!

    [4] Granted, performance is a whole separate topic from OAT, but there's no point signing off a system to go live if it's going to fall over 3 hours after launch because the developers have been working with a mere 10 records in every table. So a little prophylactic performance checking seems necessary before sign-off.

  • Here we have lots of servers and are always getting new apps.

    HOWEVER, they are mostly small and of little impact, and most people don't do the testing. We've tried to change that, but quite often the decision to move an app into production has been made far above me, so we work with what we have and try to help things limp along.

    You've got a great list, and perhaps we should set up some type of forum for this. Or, if you get permission, maybe you want to write up a case study of what you did, and we can then get debates going in a forum for that.

    Steve Jones

    sjones@sqlservercentral.com

    http://www.sqlservercentral.com/columnists/sjones

    http://www.dkranch.net
