Change Management
One of the many core tasks of the DBA is that of change control management. This article discusses the processes I use from day to day and follows the cycle of change from development, test then into production. The core topics will include:
a) formalising the process
b) script management
c) developer security privileges in extreme programming environments
d) going live with change
e) managing ad-hoc (hot fix) changes
Environment Overview
With any serious, mission critical applications development, we should always have three to five core environments in which the team is operating. They include:
a) development
a. rarely rebuilt, busy server in which the database reflects any number of change controls, some of which never get to test and others go all the way through.
b) test
a. refreshed from production of a regular basis and in sync with a "batch" of change controls that are going to production within a defined change control window.
b. ongoing user acceptance testing
c. database security privileges reflect what will (or is) in production
c) production support
a. mirror of production at a point in time for user testing and the testing of fixes or debugging of critical problems rather than working in production.
d) pre-production
a. mirror of production
b. used when "compiling code" into production and the final pre-testing of production changes
e) production
The cycle of change is shown in the diagram below through some of these servers:
We will discuss each element of the change window cycle throughout this article.
The whole change management system, be it in-house built or a third party product has seen a distinct shift to the whole CRM (relationship management) experience, tying in a variety of processes to form (where possible) this:
This ties in a variety of policy and procedures to provide end-to-end service delivery for the customer. The "IR database" shown in the previous diagram doesn't quite get meet all requirements, but covers resource planning, IR and task management, and subsequent change window planning. With good policy and practice, paper based processes to document server configuration and application components assist in other areas of the services delivery and maintenance framework.
Pre-change window resource meeting
Every fortnight the team leaders, DBA's and the development manager discuss new and existing work over the next two weeks. The existing team of 20 contract programmers works on a variety of tasks, from new development projects extending current application functionality (long term projects and mini projects) to standard bug (incident request) fixing and system enhancements. All of this is tracked in a small SQL Server database with an Access front end, known as the "IR (incident reporting)" system.
The system tracks all new developments (3 month max cycle), mini projects (5-10 days), long term projects (measured and managed in 3 month blocks) and other enhancements and system bugs. This forms the heart and sole of the team in terms of task management and task tracking. As such, it also drives the change control windows and what of the tasks will be rolled into production each week (we have a scheduled downtime of 2 hours each Wednesday for change controls).
The resource meeting identifies and deals with issues within the environments, tasks to be completed or nearing completion and the work schedule over the next two weeks. The Manager will not dictate the content of the change window but guide resourcing and task allocation issues. The team leaders and the development staff will allocate their tasks to a change control window with a simple incrementing number representing the next change window. This number and associated change information in the IR database is linked to a single report that the DBA will use to on Tuesday afternoon to "lock" the change control away and use it to prepare for a production rollout.
Visual Source Safe (VSS)
The key item underpinning any development project is source control software. There is a variety on the market but from all clients sites I have visited to date all use Microsoft VSS. Personally, I can't stand the product; with its outdated interface, lacking functionality and unintuitive design, its something most tend to put up with. Even so, a well managed and secured VSS database is critical to ongoing source management.
Consider these items when using VSS:
a) Spend time looking around for better change manage front-ends that leaver off the VSS API / automation object model, if possible, web-based applications that allow remote development is a handy feature.
b) Consider separate root project folders for each environment
a. $/
i. development
ii. test (unit test)
iii. production
c) Understand what labeling and pinning mean in detail, along with the process f sharing files and repining. These fundamentals are often ignored and people simply make complete separate copies for each working folder or worse still, have a single working folder for dev, test and production source code (ie. 1 copy of the source).
d) All developers should check in files before leaving for the day to ensure backups cover all project files
e) Take time to review the VSS security features and allocation permissions accordingly
f) If pinning, labeling, branching etc is all too complex, get back to basics with either three separate VSS databases covering off development, test and production source code, or three project folders. Either way the development staff needs to be disciplined in their approach to source control management.
g) Apply latest service packs
Managing Servers
There are not a lot of development teams that I have come across that have their own server administrators. It is also rare that the servers fall under any SOE or contractual agreement in terms of their ongoing administration on the LAN and responsibility of the IT department. As such, the DBA should take the lead and be responsible for all server activities where possible, covering:
a) server backups - including a basic 20 tape cycle (daily full backups) and associated audit log, try and get the tapes off site where possible and keep security in mind.
b) software installed - the DBA should log all installations and de-installations of software on the server. The process should be documented and proactively tracked. This is essential for the future rollout of application components in production and for server rebuilds.
c) licensing and terminal server administration
d) any changes to active-directory (where applicable)
e) user management and password expiration
f) administrator account access
On the Development and Test servers I allow Administrator access to simplify the whole process. Before going live, security is locked down on the application and its OS access to mimic production as best we can. If need be, we will contact the companies systems administrators to review work done and recommend changes.
In terms of server specifications, aim for these at a minimum:
a) RAID-1 or RAID-5 for all disks - I had 4 disks fail on my development and test servers over a one year period, these servers really take a beating at times and contractor downtime is an expensive business.
b) 1+ Gb RAM minimum with expansion to 4+Gb
c) Dual PIII 800Mhz CPU box as a minimum
Allowing administrative access to any server usually raises hairs of the back of peoples necks, but in a managed environment with strict adherence of responsibilities and procedure, this sort of flexibility with staff is appreciated and works well with the team.
Development Server
The DBA maintains a "database change control form", separate from the IR management system and any other change management documentation. The form includes the three core server environments (dev, test and prod) and associated areas for developers to sign in order for generated scripts from dev to make their way between server environments. This form is shown below:
In terms of security and database source management, the developers are fully aware of:
a) naming conventions for all stored procedures and views
b) the DBA is the only person to make any database change
c) database roles to be used by the application database components
d) DBO owns all objects and roles security will be verified and re-checked before code is prompted to test
e) Developers are responsible for utilising visual source safe for stored procedure and view management
f) the DBA manages and is responsible for all aspects of database auditing via triggers and their associated audit tables
g) production server administrators must be contacted when concerned with file security and associated proxy user accounts setup to run COM+ components, ftp access, and security shares and remove virtual directory connections via IIS used by the application.
h) strict NTFS security privileges
With this in mind, I am quite lenient with the server and database environment, giving the following privileges. Be aware that I am a change control nut and refuse to move any code into production unless the above is
adhered to and standard practices are met throughout the server change cycle. There are no exceptions.
a) Server
a. Administrator access is given via terminal services to manage any portion of the application
b. DBA is responsible for server backups to tape (including OS, file system objects applicable to the application and the
databases)
b) Database
a. ddl_admin access - to add, delete or alter stored procedures, views, user defined functions.
b. db_securityadmin access - to deny/revoke security as need be to their stored procedures and views.
No user has db_owner or sysadmin access.
Database changes are scripted and the scripts stored in visual source safe. The form is updated with the script and its run order or associated pre-post manual tasks to be performed. To generate the scripts, I am relatively lazy. I alter all structures via the diagrammer, generate the script, and alter accordingly to cover off issues with triggers or very large tables that can be better scripted. This method (with good naming conventions) is simple and relatively fail-safe, and may I say, very quick. All scripts are stored in VSS.
The database is refreshed on "quite" times from production. This may only be a data refreshed but when possible (based on the status of changes between servers), a full database replacement from a production database backup is done. The timeline varies, but on average a data refresh occurs every 3-5 months and a complete replacement every 8-12 months.
Test Server
The test server database configuration in relation to security, user accounts, OS privileges, database settings are close to production as we can get them. Even so, its difficult to mimic the environment in its entirety as many production systems include web farms, clusters, disk arrays etc that are too expensive to replicate in test.
Here the DBA will apply scripts generated from complete change control forms that alter database structure, namely tables, triggers, schema bound views, full-text indexing, user defined data types and changes in security. The developers will ask the DBA to move up stored procedures and views from development into test as need be to complete UAT (user acceptance testing).
The DBA will "refresh" the test server database on a regular basis from production. This tends to coincide with a production change control window rollout. On completion of the refresh, the DBA might need to re-apply database change control forms still "in test".
All scripts are sourced from VSS.
Production Support
The production server box is similar to that of test, but is controlled by the person who is packaging up the next production release of scripts and other source code ready for production. This server is used for:
a) production support - restoring the production database to it at a point in time and debugging critical application errors, or pre-running end of month/quarter jobs.
b) pre-production testing - final test before going live with code, especially handy when we have many DLL's with interdependencies and binary compatibilities issues.
All database privileges are locked down along with the server itself.
Production
The big question here is, "who has access to the production servers and databases?". Depending on your SLA's, this can be wide and varied, from all access to the development team via internally managed processes all the way to having no idea where the servers are let alone getting access to it. I will take the later approach with some mention of more stricter access management.
If the development team has access, it's typically under the guise of a network/server administration team that oversee all servers, their SOE configuration and network connectivity, OS/server security and more importantly, OS backups and virus scanning. From here, the environment is "handed over" to the apps team for application configuration, set-up, final testing and "go live".
In this scenario, a single person within the development team should manage change control in this environment. This tends to be the application architect or the DBA.
When rolling out changes into production:
a) webserver is shutdown
b) MSDTC is stopped
c) Crystal reports and other batch routines scheduled to run are closed and/or disabled during the upgrade
d) prepare staging area "c:\appbuild" to store incoming CC window files
e) backup all components being replaced, "c:\appatches\<system>\YYYYMMDD"
a. I tend to include entire virtual directories (even if only 2 files are being altered)
b. COM+ DLL's are exported and the DLL itself is also copied just in case the export is corrupt
f) full backup of the database is done if any scripts are being run
g) consider a system state backup and registry backup, emergency disks are a must and should always be kept up to date.
Take care with service packs of any software. The change (upgrade or downgrade) of MDAC, and the slight changes in system stored procedures and system catalogs with each SQL Server update can grind parts (or all) of your application to a halt.
Hot Fixes
Unless you are running a mission critical system, there will always be minor system bugs that result in hot fixes in production. The procedure is relatively simple but far from ideal in critical systems.
a) Warn all core users of the downtime, pre-empt with a summary of the errors being caused and how to differentiate the error from other system messages.
b) If possible, re-test the hot fix on the support server
c) Bring down the application in an orderly fashion (eg. web-server, component services, sql-agent, database etc).
d) Backup all core components being replaced/altered
Database hot fixes, namely statements rolling back the last change windows work is tricky. Do not plan to kick users off if possible, but at the same time, careful testing is critical to prevent having to do point in time recovery if this get bad to worse.
Finally, any hotfix should end with a 1/2 page summary of the reasons why the change was made, this is documented in the monthly production system report. Accountability is of key importance in any environment.
Smarten up your applications (Autonomic Computing)
Autonomic computing "is an approach to self-managed computing systems with a minimum of human interference" (IBM). In other words, self repairing, reporting, managing systems that look after the whole off the computing environment. So what has this got to do with change management? everything actually.
The whole change management process is about customers and the service we provide them as IT professionals. To assist in problem detection and ideally, resolution system architects of any application should consider either:
a) API for monitoring software to plug in error trapping/correct capability
b) Application consists of single entry point for all system messages (errors, warning, information) related to daily activity
c) The logging system is relatively fault tolerant itself, ie. if it cant write messages to a database it will try a file system or event log.
d) Where possible, pre-allocate range of codes with a knowledge base description, resolution and rollback scenario if appropriate. Take care that number allocates don't impose on sysmessages (and its ADO errors) and other OS related error codes as you don't want to skew the actual errors being returned.
A simplistic approach we have taken is shown below; its far self healing but meets some of the basic criteria so we can expand in the future:
MRAC Principal of IR/Task Completion
This is going off track a little in terms of change control but I felt its worth sharing with you. The MRAC (Mange, Resource, Approve, Complete) principal is a micro guide to task management for mini projects and incident requests spanning other teams/people over a short period of time. The idea here is the get developers who own the task to engage in basic project management procedures. This not only assists in documenting their desired outcome, but communicating this to others involved and engaging the resources required to see the entire task through to its completion.
The process is simple enough as shown in the table below. The development manager may request this at any time based on the IR's complexity. The developer is expected to meet with the appropriate resources and drive the task and its processes accordingly. This is not used in larger projects in which a skilled project manager will take control and responsibility of the process.
Task or deliverable
Planned Completion Date
Managed by
Resourced to
Approved by
Completed by
Requirements
Design
Build
Test
Implement
The tasks of course will vary, but rarely sway from the standard requirements, design, build, test, implement life-cycle. Some of the key definitions related to the process are as follows:
Managed
Each task or deliverable is managed by the person who is given the responsibility of ensuring that it is completed
Resourced
The person or persons who are to undertake a task or prepare a deliverable
Accepted
The recorded decision that a product or part of a product has satisfied the requirements and may be delivered to the Client or used in the next part of the process.
Approved
The recorded decision that the product or part of the product has satisfied the quality standards.
Authorised
The recorded decision that the record or product has been cleared for use or action.
Variation
A formal process for identifying changes to the Support Release or its deliverables and ensuring appropriate control over variations to the Support Release scope, budget and schedule. It may be associated with one or more Service Requests.
This simple but effective process allows developers and associated management to better track change and its interdependencies throughout its lifecycle.
Summary
No matter the procedures and policies in place, you still need commitment from development managers, project leaders/manager and the senior developers to drive the change management process. Accountability and strict adherence to the defined processes is critical to avoid the nightmare of any project, that being a source code versions that we can never re-create, or a production environment in which we don't have the source for.
Failure to lay down the law with development staff (including the DBA) is a task easily put in the too hard basket. It is not easy, but you need to start somewhere. This article has presented a variety of ideas on the topic that may prompt you to take further action in this realm.
The 21st century DBA, aka Technical Consultant, needs to focus on a variety of skills, not only database change but change management processes as a complete picture.