Summary
Rebuilding a Windows Server 2008 R2 Cluster/SQL Server 2012 Availability Group during development, testing and production DR scenarios might involve destroying, recreating and reconfiguring the cluster and Availability Group. Performing these actions manually using Wizards will be slower and more error prone than using the same set of PowerShell (PoSh) scripts used during the original construction. The objective of this article is to highlight how much quicker and easier it is to use PowerShell after making the initial invest identifying the PoSh cmdlet's and syntax required.
A sequence of actions is required to rebuild a SQL Server Availability Group (AG), databases should be removed from the AG before the AG is deleted. The AG should be deleted before the Cluster is destroyed and so on. Following a scripted process will be less error prone than following a set of screen prints in operational documentation. Scripts don't get interrupted by colleagues or distracted by the arrival of the sandwich van outside your office. Scripts can be searched for specifics such as DNS names that would not be found in word documents containing screen prints and scripts will not mis-type information such as IP addresses.
The PoSh cmdlet's used in this article are as vanilla as possible, rather than declaring and assigning values to variables, values are hardcoded. Programming times were measured as faithfully as possible to provide a realistic measure of the break even point in terms of time invested in scripting. Adding error handling and so on would have increased the development times and unfairly skewed the breakeven point away from PoSh usage. My SQL Server and Windows Server skills are pretty good but I have only really started using PowerShell in the last few months., a few people will be slower developing cmdlet's similiar to those used in this article, many people will be much quicker.
Lab Architecture.
The test lab below was built using virtual machines in Hyper-V following the clustering and Availability Group suggestions in this whitepaper : http://sqlcat.com/sqlcat/b/whitepapers/archive/2012/06/22/alwayson-architecture-guide-building-a-high-availability-and-disaster-recovery-solution-by-using-failover-cluster-instances-and-availability-groups.aspx
- S3-SQL and S4-SQL are clustered with a file share witness on the domain controller
- Each cluster node hosts a non-clustered default instance of SQL Server 2012.
- The Availability Group and it's Listener are SQL and Cluster Service components.
- S1-SQL is used to execute all PowerShell scripts using the remote administration features.
Shaded objects in the AG_Lab diagram below are deleted and recreated by the PowerShell scripts at the end of this article.
Windows Cluster & SQL Server AG Rebuild - The Steps
I have categorized the rebuild into 7 discreet steps:
- Step 1 - Remove the database from AG - This also involves deleted the database on the standby
- Step 2 - Delete the AG - Disable 'Always On' and restarted SQL Services on primary and secondary
- Step 3 - Destroy the Cluster - Computer Objects related to the cluster should also be removed from Active Directory
- Step 4 - Create & Configure the Cluster - This includes configuring a file share for Quorum and validating the cluster
- Step 5 - Prepare Environment - This involves granting the cluster computer object permissions to create other computer objects (for the AG Listener) and enabling AG for the SQL Services which are also restarted.
- Step 6 - Create the AG - This involves backing up/restoring databases and creating an AG Listener databases
- Step 7 - Test Connectivity - This is a basic AG validation.
The steps above were performed manually, then Posh scripts were developed and executed, timings for each of these stages were recording at each step. Times are in seconds and approximations but are as faithful as possible.
Stage 1 - Manual Cluster Rebuild Times | Stage 2 - Command Line Times | ||
Script Development | Script Execution | ||
Step 1 - Remove DB from AG | 82 | 180 | 48 |
Step 2 - Delete AG | 40 | 120 | 25 |
Step 3 - Destroy Cluster | 145 | 180 | 23 |
Step 4 - Create & Configure Cluster | 220 | 180 | 117 |
Step 5 - Prepare Environment | 105 | 30 | 25 |
Step 6 - Create AG | 125 | 180 | 35 |
Step 7 - Test Connectivity | 30 | 60 | 10 |
Total - | 12.5 minutes | 15.5 minutes | 4 minutes |
- A full, manual rebuild of the cluster in my lab environment takes about 12.5 minutes. This will not change much as time is always wasted switching between machines, windows and waiting for wizards to prompt for values.
- The development of PoSh scripts to perform the rebuild automatically took about 15.5 minutes, this is where times will vary according to skill level and experience with PowerShell.
- To execute a full rebuild automatically using the scripts developed in Stage 2 takes about 4 minutes in my lan environment. Some operations such as creating clusters and restarting SQL Services were slow and times will vary according to hardware specifications.
Each execution of the rebuild using the scripts is saving about 8 minutes over the manual process so the 15 minutes spent developing the scripts is redeemed by the second rebuild.
Availability Group Rebuild - Manual versus Posh Automated
Let's look in more detail at the differences between manual and PoSh automated rebuilds.
Manual Step 1 - Remove the database from AG
This also involves deleting the database on the standby. SQL Server Management Studio was started and the database right clicked/removed from the AG. The database on the standby server then has to be deleted.
PoSh Step 1 - Remove the database from AG
This also involves deleted the database on the standby
# Copy modules from here : C:\Program Files (x86)\Microsoft SQL Server\110\Tools\PowerShell\Modules
# To here : C:\Users\administrator.BREWERP\Documents\WindowsPowerShell\Modules
# to import SQLPS module
#Set-ExecutionPolicy unrestricted
Import-Module SQLPS
# Remove AG DB on Primary, also remove DB from any secondary
Remove-SqlAvailabilityDatabase -Path SQLSERVER:\SQL\S4-SQL\DEFAULT\AvailabilityGroups\AGL_BrewerP\AvailabilityDatabases\db_workspace
# drop DB
$MyDB = Get-Item "SQLSERVER:\SQL\S3-SQL\Default\Databases\db_workspace"
$MyDB.Drop()
Manual Step 2 - Delete the AG
Disable 'Always On' and restarted SQL Services on primary and secondary. This starts by deleting the Availability Group.
Disable the Availability Group on both primary and secondary servers and restart the SQL Services.
PoSh Step 2 - Delete the AG
Disable 'Always On' and restarted SQL Services on primary and secondary
# Connect to the primary and delete AG, restart sql services
Remove-SqlAvailabilityGroup SQLSERVER:\SQL\S4-SQL\DEFAULT\AvailabilityGroups\AGL_BrewerP
Disable-SqlAlwaysOn -Path SQLSERVER:\SQL\S4-SQL\DEFAULT -NoServiceRestart
Disable-SqlAlwaysOn -Path SQLSERVER:\SQL\S3-SQL\DEFAULT -NoServiceRestart
Stop-Service -InputObject $(Get-Service -ComputerName S3-SQL.brewerp.local -Name MSSQLSERVER) -Force
Stop-Service -InputObject $(Get-Service -ComputerName S4-SQL.brewerp.local -Name MSSQLSERVER) -Force
Start-Service -InputObject $(Get-Service -ComputerName S3-SQL.brewerp.local -Name MSSQLSERVER)
Start-Service -InputObject $(Get-Service -ComputerName S4-SQL.brewerp.local -Name MSSQLSERVER)
Start-Service -InputObject $(Get-Service -ComputerName S3-SQL.brewerp.local -Name SQLSERVERAGENT)
Start-Service -InputObject $(Get-Service -ComputerName S4-SQL.brewerp.local -Name SQLSERVERAGENT)
Manual Step 3 - Destroy the Cluster
Computer Objects related to the cluster should also be removed from Active Directory.
Start the Windows Failover Cluster Manager and destroy the cluster
Delete Cluster and AG Listener computer objects from Active Directory
PoSh Step 3 - Destroy the Cluster
Computer Objects related to the cluster should also be removed from Active Directory
# Remote to a node where Cluster PoSh modules are available
Import-Module FailoverClusters
Remove-Cluster -Cluster "CL_BrewerP" -Force
# Install AD web service on domain controller if AD OS < 2008 R2
# http://www.microsoft.com/en-us/download/details.aspx?id=2852
# Makue sure (netstat -ano) port 9383 on AD Firewall is open
# Add remote Administration - AD PowerShell feature in windows to local machine
# Tidy old cluster computer account
Import-Module ActiveDirectory
Remove-ADComputer CL_BrewerP
Remove-ADComputer AGL_BrewerP
Manual Step 4 - Create & Configure the Cluster
This includes configuring a file share for Quorum and validating the cluster. First, create the cluster.
Configure Quorum settings
PoSh Step 4- Create & Configure the Cluster
This includes configuring a file share for Quorum and validating the cluster
#Create, configure and start a new cluster
Import-Module servermanager
Add-WindowsFeature RSAT-Clustering
Import-Module FailoverClusters
Test-Cluster -Node S3-SQL,S4-SQL -Ignore Inventory
New-Cluster -Name CL_BrewerP -Node S3-SQL, S4-SQL -StaticAddress 192.168.1.85
Start-Cluster -Name CL_BrewerP
Set-ClusterQuorum -Cluster CL_BrewerP -NodeAndFileShareMajority \\s1-sql\ClusterQuorum
Manual Step 5 - Prepare Environment
This involves granting the cluster computer object permissions to create other computer objects (for the AG Listener) and enabling AG for the SQL Services which are also restarted.
Enable Always On and restart SQL Service on primary and secondary servers.
Grant the Cluster computer object permissions to create other computer objects (AG Listener)
PoSh Step 5 - Prepare Environment
This involves granting the cluster computer object permissions to create other computer objects (for the AG Listener) and enabling AG for the SQL Services which are also restarted.
# Enable 'Availability Group' and restart SQL Services on S3-SQL & S4-SQL
Enable-SqlAlwaysOn -Path SQLSERVER:\SQL\S4-SQL\DEFAULT -NoServiceRestart
Enable-SqlAlwaysOn -Path SQLSERVER:\SQL\S3-SQL\DEFAULT -NoServiceRestart
Stop-Service -InputObject $(Get-Service -ComputerName S3-SQL.brewerp.local -Name MSSQLSERVER) -Force
Stop-Service -InputObject $(Get-Service -ComputerName S4-SQL.brewerp.local -Name MSSQLSERVER) -Force
Start-Service -InputObject $(Get-Service -ComputerName S3-SQL.brewerp.local -Name MSSQLSERVER)
Start-Service -InputObject $(Get-Service -ComputerName S4-SQL.brewerp.local -Name MSSQLSERVER)
Start-Service -InputObject $(Get-Service -ComputerName S3-SQL.brewerp.local -Name SQLSERVERAGENT)
Start-Service -InputObject $(Get-Service -ComputerName S4-SQL.brewerp.local -Name SQLSERVERAGENT)
# Grant cluster computer object the permissions needed to create an AG listener
#Import-Module ActiveDirectory
$Computer = Get-ADComputer -Filter 'SamAccountName -like "CL_*"'
Add-ADPrincipalGroupMembership -Identity $Computer -Memberof "Administrators"
Manual Step 6 - Create the AG
This involves backing up/restoring databases and creating an AG Listener databases
Create the AG Listener
PoSh Step 6 - Create the AG
This involves backing up/restoring databases and creating an AG Listener databases.
#Create Availability Group Cluster
#Import-Module SQLPS
$primaryServer = Get-Item "SQLSERVER:\SQL\S4-SQL\DEFAULT"
$secondaryServer = Get-Item "SQLSERVER:\SQL\S3-SQL\DEFAULT"
$primaryReplica = New-SqlAvailabilityReplica `
-Name "S4-SQL" `
-EndpointUrl "TCP://S4-SQL.brewerp.local:5022" `
-FailoverMode "Automatic" `
-AvailabilityMode "SynchronousCommit" `
-AsTemplate `
-Version ($primaryServer.Version)
# Create the initial secondary replica
$secondaryReplica = New-SqlAvailabilityReplica `
-Name "S3-SQL" `
-EndpointUrl "TCP://S3-SQL.brewerp.local:5022" `
-FailoverMode "Automatic" `
-AvailabilityMode "SynchronousCommit" `
-AsTemplate `
-Version ($secondaryServer.Version)
# Create the Availability Group.
New-SqlAvailabilityGroup `
-Name "AG_BrewerP" `
-Path "SQLSERVER:\SQL\S4-SQL\DEFAULT" `
-AvailabilityReplica @($primaryReplica,$secondaryReplica) `
-Database "db_workspace"
Join-SqlAvailabilityGroup -Path "SQLSERVER:\SQL\S3-SQL\DEFAULT" -Name "AG_BrewerP"
# Restore the database and log on the secondary (using NO RECOVERY)
# T-SQL Backup
# BACKUP DATABASE [db_workspace] TO DISK = N'\\S4-SQL\Backups\db_workspace_FULL.bak' WITH CHECKSUM, NO_COMPRESSION, INIT
# BACKUP LOG [db_workspace] TO DISK = N'\\S4-SQL\Backups\db_workspace_LOG.trn' WITH CHECKSUM, NO_COMPRESSION , INIT
# GO
# sp_RestoreScriptGenie @Database = 'db_workspace'Generates T-SQL resstore commands below
#;RESTORE DATABASE [db_workspace] FROM DISK = '\\S4-SQL\Backups\db_workspace_FULL.bak' WITH REPLACE, FILE = 1,CHECKSUM,NORECOVERY, STATS=10, MOVE 'db_workspace' TO 'x:\data\db_workspace.mdf', MOVE 'db_workspace_log' TO 'x:\data\db_workspace_log.ldf'
#, MOVE 'db_workspace_FG1' TO 'x:\data\db_workspace_FG1.ndf'
#, MOVE 'db_workspace_FG2' TO 'x:\data\db_workspace_FG2.ndf'
#;RESTORE LOG [db_workspace] FROM DISK = '\\S4-SQL\Backups\db_workspace_LOG.trn' WITH NORECOVERY, CHECKSUM,FILE = 1
Restore-SqlDatabase `
-Database "db_workspace" `
-BackupFile "\\S4-SQL\Backups\db_workspace_FULL.bak" `
-ServerInstance "S3-SQL"`
-NoRecovery
Restore-SqlDatabase `
-Database "db_workspace" `
-BackupFile "\\S4-SQL\Backups\db_workspace_LOG.trn" `
-ServerInstance "S3-SQL" `
-RestoreAction Log `
-NoRecovery
Add-SqlAvailabilityDatabase -Path "SQLSERVER:\SQL\S3-SQL\DEFAULT\AvailabilityGroups\AG_BrewerP" -Database "db_workspace"
# Create Availability Group Listener
New-SqlAvailabilityGroupListener -Name AGL_BrewerP `
-StaticIp '192.168.1.186/255.255.255.0' `
-Path SQLSERVER:\SQL\S4-SQL\DEFAULT\AvailabilityGroups\AG_BrewerP `
-Port 1433
Manual Step 7 - Test Connectivity
Test failover.
PoSh Step 7 - Test Connectivity
#Test Availability Group
#Import-Module SQLPS
Test-SqlAvailabilityGroup -Path SQLSERVER:\SQL\S4-SQL\DEFAULT\AvailabilityGroups\AG_BrewerP
Conclusion
In addition to the PoSh advantages suggested in the summary section of this article, two additional benefits of script development include:
- The case for automation becomes stronger the more it is done, as proficiency grows, script development times fall. Many scripts developed for one purpose are reusable in different scenario's which further improves productivity.
- PoSh can be used to manage SQL Server features and Windows Server features with various modules providing different cmdlet's for each. T-SQL scripts and procedures are confined to SQL Server so processes such as this cluster/AG rebuild that affect the OS are beyond it's capabilities.
At the start of this article I was slightly sceptical about the blanket statement 'the best DBA's automate everything', at the end of the it my opinion is different and I agree.
References
- http://msdn.microsoft.com/en-us/library/gg492181.aspx <- Create SQL Availability Groups using PowerShell
- Books - SQL Server 2008 Administration with Windows PowerShell - ISBN 978-0-470-47728-1
- Course - Automating Administration with Windows PowerShell 2.0 - Microsoft Official Course 10325A
- http://blogs.msdn.com/b/clustering/archive/2010/03/26/9985790.aspx
- https://www.simple-talk.com/sql/backup-and-recovery/backup-and-restore-sql-server-with-the-sql-server-2012-powershell-cmdlets/
- http://www.sqlservercentral.com/blogs/discussionofsqlserver/2013/02/11/a-month-of-powershell-day-11-smo-databases/#codesyntax_30
- http://technet.microsoft.com/en-gb/library/ff878259.aspx#PScmd2Procedure