SQLServerCentral Article

Set Up a Windows Server Failover Cluster (As a Precursor to High Availability in Standard Edition)

In this walkthrough, we will set up a WSFC (Windows Server Failover Cluster) in Windows Server. A WSFC is a group of Windows servers that work together to ensure high availability of applications and services. Any WSFC (whether for FCI or AG) comprises the following components:

Component | Definition
Node | Any server that participates in the cluster (it is preferable to have an odd number of nodes, as one will perform the role of tiebreaker).
Cluster resource | A physical or logical entity that can be owned by a node, brought online and taken offline, moved between nodes, and managed as a cluster object. A cluster resource can be owned by only a single node at any point in time.
Role | A collection of cluster resources managed as a single cluster object to provide specific functionality.
Network name resource | A logical server name that is managed as a cluster resource. A network name resource must be used with an IP address resource. These entries may require objects in Active Directory Domain Services and/or DNS.
Resource dependency | A resource on which another resource depends. If resource A depends on resource B, then B is a dependency of A. Resource A will not be able to start without resource B.
Preferred owner | A node on which a resource group prefers to run. Each resource group is associated with a list of preferred owners sorted in order of preference. During automatic failover, the resource group is moved to the next preferred node in the preferred owner list.
Possible owner | A secondary node on which a resource can run. Each resource group is associated with a list of possible owners. Roles can fail over only to nodes that are listed as possible owners.
Quorum mode | The quorum configuration in a failover cluster that determines the number of node failures that the cluster can sustain.
Force quorum | The process to start the cluster even though only a minority of the elements that are required for quorum are in communication.
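To illustrate why an odd number of votes is preferred, here is a minimal PowerShell sketch of simple-majority arithmetic. The function name Get-QuorumMath is hypothetical, and the calculation deliberately ignores dynamic quorum and any other adjustments that Windows Server may make at run time.

```powershell
# Hypothetical helper: plain majority arithmetic only.
# It does not query a cluster and ignores dynamic quorum behaviour.
function Get-QuorumMath {
    param(
        [Parameter(Mandatory)]
        [int]$Votes   # total votes (nodes plus witness, if any)
    )

    # A majority is more than half of the votes.
    $majority = [int]([math]::Floor($Votes / 2) + 1)

    [pscustomobject]@{
        Votes             = $Votes
        MajorityNeeded    = $majority
        ToleratedFailures = $Votes - $majority
    }
}

# Compare 2, 3, 4 and 5 votes side by side.
2..5 | ForEach-Object { Get-QuorumMath -Votes $_ }
```

With this arithmetic, two votes tolerate no failures while three votes tolerate one, which is why a two-node cluster benefits from a third, tiebreaking vote. Going from three votes to four adds no extra tolerance.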

The WSFC provides infrastructure features that support the high-availability and disaster recovery scenarios of hosted server applications.

The nodes in a WSFC work together to collectively provide these types of capabilities:

  • Distributed metadata and notifications
  • Resource management
  • Health monitoring
  • Failover coordination

List of acronyms

Acronym | Description
WSFC | Windows Server Failover Cluster (a group of connected and interdependent servers used for reliability and availability of an environment)
HA | High Availability
AG | Availability Groups
BAG | Basic Availability Groups
HADR | High Availability and Disaster Recovery
FCI | Failover Cluster Instances
BADR | Basic Availability and Disaster Recovery
AD | Active Directory
DS | Domain Services
OU | Organisational Unit
CAU | Cluster-Aware Updating

Prerequisites

Here are a few things to check before beginning the process of creating the WSFC.

  • Make sure that all servers that you want to add as cluster nodes are running the same version of Windows Server.
  • Review the hardware requirements to make sure that your configuration is supported.
  • Make sure that all servers that you want to add as cluster nodes are joined to the same Active Directory domain.
  • Optionally, create an organizational unit (OU) and move the computer accounts for the servers that you want to add as cluster nodes into the OU. As a best practice, we recommend that you place failover clusters in their own OU in AD DS. This can help you better control which Group Policy settings or security template settings affect the cluster nodes. Isolating clusters in their own OU also helps prevent accidental deletion of cluster computer objects.

Additionally, verify the following account requirements:

  • Make sure that the account you want to use to create the cluster is a domain user who has administrator rights on all servers that you want to add as cluster nodes.
  • Make sure that either of the following is true:
    • The user who creates the cluster has the Create Computer objects permission to the OU or the container where the servers that will form the cluster reside.
    • If the user does not have the Create Computer objects permission, ask a domain administrator to prestage a cluster computer object for the cluster.
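Some of these checks can be sketched in PowerShell. The example below is a rough illustration, not a full prerequisite audit: it assumes PowerShell remoting is available, uses the lab server names from this article (replace them with your own), and only compares the OS version and domain membership of each server.

```powershell
# Assumption: these are the lab node names used in this article.
$nodes = 'DBServ01', 'DBServ02'

# Collect OS and domain details from each prospective node.
Invoke-Command -ComputerName $nodes -ScriptBlock {
    $os = Get-CimInstance -ClassName Win32_OperatingSystem
    $cs = Get-CimInstance -ClassName Win32_ComputerSystem

    [pscustomobject]@{
        OSCaption    = $os.Caption
        OSVersion    = $os.Version
        Domain       = $cs.Domain
        PartOfDomain = $cs.PartOfDomain
    }
} |
    Sort-Object PSComputerName |
    Format-Table PSComputerName, OSCaption, OSVersion, Domain, PartOfDomain
```

All rows should show the same OS version and the same domain before you continue.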

Now we need to do the actual Clustering work

To showcase this, I have the following setup in my environment:

  • Host: Windows Server 2022 Standard with Hyper-V
  • Virtuals:
    • Active Directory Domain Services: domain controller and DNS services (in the real world, these would be separate servers)
    • DB Server 1
    • DB Server 2
    • File Server (this will be the tiebreaker instance)

Figure: List of servers in Hyper-V Manager

Step 1: Install Failover Cluster feature on all cluster servers

Add the feature to your Windows Server install using the Server Manager. Do this on DBServ01, DBServ02, and FileServ.

Figure: How to install Failover Clustering on all nodes
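As an alternative to Server Manager, the feature can also be installed with PowerShell. This is a minimal sketch that assumes PowerShell remoting is enabled and that it is run from an account with administrator rights on the three lab servers named in this article.

```powershell
# Assumption: lab server names from this article; adjust for your environment.
$servers = 'DBServ01', 'DBServ02', 'FileServ'

# Install the Failover Clustering feature plus its management tools on each server.
Invoke-Command -ComputerName $servers -ScriptBlock {
    Install-WindowsFeature -Name Failover-Clustering -IncludeManagementTools
}
```

The output for each server indicates whether the installation succeeded and whether a restart is needed.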

Step 2: Validate the configuration

Before you create the failover cluster, we strongly recommend that you validate the configuration to make sure that the hardware and hardware settings are compatible with failover clustering. Microsoft supports a cluster solution only if the complete configuration passes all validation tests and if all hardware is certified for the version of Windows Server that the cluster nodes are running.

The demo environment above meets these requirements, so we continue.

Step 3: Run cluster validation tests

On a computer that has the Failover Cluster Management Tools installed from the Remote Server Administration Tools or on a server where you installed the Failover Clustering feature, start Failover Cluster Manager. To do this on a server, start Server Manager and then on the Tools menu, select Failover Cluster Manager.

Figure: Starting the cluster validation tests in Failover Cluster Manager

In the Failover Cluster Manager pane under Management, select Validate Configuration.

Figure: The first page of the Cluster Validation Test Wizard

On the Before You Begin page, select Next.

On the Select Servers or a Cluster page, in the Enter name box, enter the NetBIOS name or the fully qualified domain name of a server that you plan to add as a failover cluster node and then select Add. Repeat this step for each server that you want to add. To add multiple servers at the same time, separate the names by a comma or by a semicolon. For example, enter the names in the format server1.contoso.com, server2.contoso.com. When you are finished, select Next.

Figure: Enter the full list of servers that will form part of this cluster

On the Testing Options page, select Run all tests (recommended) and then select Next.

Figure: If the recommended action (i.e. Run all tests) has been selected, the Confirmation page will reflect the full scope of the tests

On the Confirmation page, select Next. The Validating page displays the status of the running tests.

Figure: The validation screen shows the progress of tests and their status

On the Summary page, do either of the following:

If the results indicate that the tests completed successfully and the configuration is suited for clustering, and you want to create the cluster immediately, make sure that the Create the cluster now using the validated nodes check box is selected and then select Finish. Then continue with Step 4: Create the failover cluster.

Figure: Once validation has completed, the Summary screen shows the outcomes of the tests. All tests should be successful (there might be a warning or two).

If the results indicate that there were warnings or failures, select View Report to view the details and determine which issues must be corrected. Note that a warning for a particular validation test indicates that this aspect of the failover cluster can be supported but might not meet the recommended best practices.
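The same validation can be run from PowerShell with the Test-Cluster cmdlet from the FailoverClusters module. The sketch below assumes the two database servers from this lab are the intended cluster nodes; adjust the list to match the servers you entered in the wizard.

```powershell
# Assumption: the servers being validated are the lab's two database servers.
$nodes = 'DBServ01', 'DBServ02'

Import-Module FailoverClusters

# Runs the validation tests against the listed servers and writes a
# validation report, comparable to choosing View Report in the wizard.
Test-Cluster -Node $nodes
```

Review the generated report for warnings and failures before creating the cluster.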

Step 4: Create the failover cluster

To complete this step, make sure that the user account that you log on as meets the requirements that are outlined in the Prerequisites section of this article.

Start Server Manager. On the Tools menu, select Failover Cluster Manager. In the Failover Cluster Manager pane, under Management, select Create Cluster. The Create Cluster Wizard opens.

Figure: The Access Point for Administering the Cluster screen. Here is where you will assign the name and IP address to be associated with the cluster

On the Before You Begin page, select Next. If the Select Servers page appears, in the Enter name box enter the NetBIOS name or the fully qualified domain name of a server that you plan to add as a failover cluster node and then select Add. Repeat this step for each server that you want to add. To add multiple servers at the same time, separate the names by a comma or a semicolon. For example, enter the names in the format server1.contoso.com; server2.contoso.com. When you are finished, select Next.

Note: If you chose to create the cluster immediately after running validation, you will not see the Select Servers page. The nodes that were validated are automatically added to the Create Cluster Wizard so that you do not have to enter them again.

If you skipped validation earlier, the Validation Warning page appears. We strongly recommend that you run cluster validation. Only clusters that pass all validation tests are supported by Microsoft. To run the validation tests, select Yes, and then select Next. Complete the Validate a Configuration Wizard as described in Step 3: Run cluster validation tests.

On the Access Point for Administering the Cluster page, do the following:

In the Cluster Name box, enter the name that you want to use to administer the cluster. Before you do, review the following information:

    1. During cluster creation, this name is registered as the cluster computer object (also known as the cluster name object or CNO) in AD DS. If you specify a NetBIOS name for the cluster, the CNO is created in the same location where the computer objects for the cluster nodes reside. This can be either the default Computers container or an OU.
    2. To specify a different location for the CNO, you can enter the distinguished name of an OU in the Cluster Name box. For example: CN=ClusterName, OU=Clusters, DC=Contoso, DC=com.
    3. If a domain administrator has pre-staged the CNO in a different OU than where the cluster nodes reside, specify the distinguished name that the domain administrator provides.

If the server does not have a network adapter that is configured to use DHCP, you must configure one or more static IP addresses for the failover cluster. Select the check box next to each network that you want to use for cluster management. Select the Address field next to a selected network and then enter the IP address that you want to assign to the cluster. This IP address (or addresses) will be associated with the cluster name in Domain Name System (DNS).

Figure: The Access Point for Administering the Cluster screen

When you are finished, select Next.

On the Confirmation page, review the settings. By default, the Add all eligible storage to the cluster check box is selected. Clear this check box if you want to do either of the following:

  • You want to configure storage later.
  • You plan to create clustered storage spaces through Failover Cluster Manager or through the Failover Clustering Windows PowerShell cmdlets and have not yet created storage spaces in File and Storage Services. For more information, see Deploy Clustered Storage Spaces.

Figure: The Confirmation screen. Make sure all entries are correct here

Select Next to create the failover cluster.

Figure: Once the cluster has been created, the Summary screen provides a detailed summary of all work done

On the Summary page, confirm that the failover cluster was successfully created. If there were any warnings or errors, view the summary output or select View Report to view the full report. Select Finish.
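For reference, the wizard steps above correspond roughly to a single New-Cluster call. This is a sketch under stated assumptions: the cluster name SQLCluster01 and the address 192.168.1.50 are hypothetical placeholders (the article does not state the values used in the lab), and -NoStorage mirrors clearing the Add all eligible storage to the cluster check box.

```powershell
# Assumptions: 'SQLCluster01' and '192.168.1.50' are hypothetical values;
# substitute the cluster name and static IP address for your own network.
$nodes = 'DBServ01', 'DBServ02'

New-Cluster -Name 'SQLCluster01' `
            -Node $nodes `
            -StaticAddress '192.168.1.50' `
            -NoStorage
```

Omit -NoStorage if you want eligible storage added automatically, and omit -StaticAddress if the cluster network uses DHCP.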

To confirm that the cluster was created, verify that the cluster name is listed under Failover Cluster Manager in the navigation tree. You can expand the cluster name and then select items under Nodes, Storage, or Networks to view the associated resources.

Understand that it may take some time for the cluster name to successfully replicate in DNS. After successful DNS registration and replication, if you select All Servers in Server Manager, the cluster name should be listed as a server with a Manageability status of Online.

After the cluster is created, you can do things such as verify cluster quorum configuration, and optionally, create Cluster Shared Volumes (CSV). You will see the cluster in your Active Directory Users and Computers.

Figure: The Active Directory Computers screen showing the newly created cluster as a virtual network node

You should also see the cluster in the Failover Cluster Manager.

Figure: Failover Cluster Manager reflecting the newly created cluster
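The same confirmation can be sketched in PowerShell. The commands below are run on one of the cluster nodes and use only read-only cmdlets from the FailoverClusters module; the property selections are just one reasonable choice.

```powershell
Import-Module FailoverClusters

# Cluster identity.
Get-Cluster | Format-List Name, Domain

# Node membership and state (each node should report Up).
Get-ClusterNode | Format-Table Name, State

# Cluster networks and core resources.
Get-ClusterNetwork  | Format-Table Name, State, Address
Get-ClusterResource | Format-Table Name, State, OwnerGroup, ResourceType
```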

Step 5: Create a Witness

This section shows how to create a witness.

In Failover Cluster Manager, locate the new cluster, right-click it, and select More Actions, then Configure Cluster Quorum Settings.

Figure: The Before You Begin page of the Configure Cluster Quorum Wizard

Choose “Select the quorum witness”.

Figure: The Select Quorum Configuration Option page. This is where you tell the wizard which option you will be implementing

There are various options here. Choose what is right for your setup. In my lab, I chose Configure a file share witness.

Figure: The Select Quorum Witness page. Based on the option selected in the previous screen, this is where you select the quorum witness

Choose Configure a file share witness and enter the file share path.

Figure: The Configure File Share Witness page is where you enter the path to the file share location

Next comes the Confirmation screen. Verify that the settings are correct.

Figure: The Confirmation screen is used to check that all settings are correct before initiating the change

If all is in order, press Next.

Figure: Failover Cluster Manager now shows the details of the witness

If all works according to plan, the cluster quorum witness will be created and added to the list of objects.
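The file share witness can also be configured from PowerShell with Set-ClusterQuorum. In this sketch the share path \\FileServ\ClusterWitness is a hypothetical example (the article does not give the path used in the lab); the share must already exist and be accessible to the cluster.

```powershell
# Assumption: '\\FileServ\ClusterWitness' is a hypothetical share path.
# Run this on a cluster node, or add -Cluster <name> to target a remote cluster.
Set-ClusterQuorum -FileShareWitness '\\FileServ\ClusterWitness'

# Show the resulting quorum configuration.
Get-ClusterQuorum | Format-List *
```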

Final Outcomes

The local server properties now reflect the following for DBServ01.

Figure: The Local Server Properties page now reflects the details of the cluster

Here is DBServ02.

Figure: The Local Server Properties for DBServ02

Here is the FileServ.

Figure: The Local Server Properties for FileServ

Now that we have a fully witnessed WSFC set up, any additional setup (e.g. FCI, HADR, BADR) is much simpler.

One last thing – Setting up Cluster-Aware Updating

Now that WSFC has been set up, we need to carefully consider how to apply updates. Do not just update haphazardly. Care needs to be exercised. To this end, once the Clustering Management tool has been installed, Cluster-Aware Updating should be applied.

Feature description

Cluster-Aware Updating is an automated feature that enables you to update servers in a failover cluster with little or no loss in availability during the update process. During an Updating Run, Cluster-Aware Updating transparently performs the following tasks:

  1. Puts each node of the cluster into node maintenance mode.
  2. Moves the clustered roles off the node.
  3. Installs the updates and any dependent updates.
  4. Performs a restart if necessary.
  5. Brings the node out of maintenance mode.
  6. Restores the clustered roles on the node.
  7. Moves to update the next node.

For many clustered roles in the cluster, the automatic update process triggers a planned failover. This can cause a transient service interruption for connected clients. However, in the case of continuously available workloads, such as Hyper-V with live migration or file server with SMB Transparent Failover, Cluster-Aware Updating can coordinate cluster updates with no impact to the service availability.

Practical applications

  • CAU reduces service outages in clustered services, reduces the need for manual updating workarounds, and makes the end-to-end cluster updating process more reliable for the administrator. When the CAU feature is used in conjunction with continuously available cluster workloads, such as continuously available file servers (file server workload with SMB Transparent Failover) or Hyper-V, the cluster updates can be performed with zero impact to service availability for clients.
  • CAU facilitates the adoption of consistent IT processes across the enterprise. Updating Run Profiles can be created for different classes of failover clusters and then managed centrally on a file share to ensure that CAU deployments throughout the IT organization apply updates consistently, even if the clusters are managed by different lines-of-business or administrators.
  • CAU can schedule Updating Runs on regular daily, weekly, or monthly intervals to help coordinate cluster updates with other IT management processes.
  • CAU provides an extensible architecture to update the cluster software inventory in a cluster-aware fashion. This can be used by publishers to coordinate the installation of software updates that are not published to Windows Update or Microsoft Update or that are not available from Microsoft, for example, updates for non-Microsoft device drivers.
  • CAU self-updating mode enables a "cluster in a box" appliance (a set of clustered physical machines, typically packaged in one chassis) to update itself. Typically, such appliances are deployed in branch offices with minimal local IT support to manage the clusters. Self-updating mode offers great value in these deployment scenarios.

Important functionality

The following is a description of important Cluster-Aware Updating functionality:

  • A user interface (UI), the Cluster-Aware Updating window, and a set of cmdlets that you can use to preview, apply, monitor, and report on the updates
  • An end-to-end automation of the cluster-updating operation (an Updating Run), orchestrated by one or more Update Coordinator computers
  • A default plug-in that integrates with the existing Windows Update Agent (WUA) and Windows Server Update Services (WSUS) infrastructure in Windows Server to apply important Microsoft updates
  • A second plug-in that can be used to apply Microsoft hotfixes, and that can be customized to apply non-Microsoft updates
  • Updating Run Profiles that you configure with settings for Updating Run options, such as the maximum number of times that the update will be retried per node. Updating Run Profiles enable you to rapidly reuse the same settings across Updating Runs and easily share the update settings with other failover clusters.
  • An extensible architecture that supports new plug-in development to coordinate other node-updating tools across the cluster, such as custom software installers, BIOS updating tools, and network adapter or host bus adapter (HBA) updating tools.

Cluster-Aware Updating can coordinate the complete cluster updating operation in two modes:

  • Self-updating mode: For this mode, the CAU clustered role is configured as a workload on the failover cluster that is to be updated, and an associated update schedule is defined. The cluster updates itself at scheduled times by using a default or custom Updating Run profile. During the Updating Run, the CAU Update Coordinator process starts on the node that currently owns the CAU clustered role, and the process sequentially performs updates on each cluster node. To update the current cluster node, the CAU clustered role fails over to another cluster node, and a new Update Coordinator process on that node assumes control of the Updating Run. In self-updating mode, CAU can update the failover cluster by using a fully automated, end-to-end updating process. An administrator can also trigger updates on-demand in this mode, or simply use the remote-updating approach if desired. In self-updating mode, an administrator can get summary information about an Updating Run in progress by connecting to the cluster and running the Get-CauRun Windows PowerShell cmdlet.
  • Remote-updating mode: For this mode, a remote computer, which is called an Update Coordinator, is configured with the CAU tools. The Update Coordinator is not a member of the cluster that is updated during the Updating Run. From the remote computer, the administrator triggers an on-demand Updating Run by using a default or custom Updating Run profile. Remote-updating mode is useful for monitoring real-time progress during the Updating Run, and for clusters that are running on Server Core installations.
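The two modes map onto different CAU cmdlets. The sketch below is illustrative only: the cluster name SQLCluster01 and the schedule (second Sunday of the month) are hypothetical choices, and the retry count is an arbitrary example rather than a recommendation.

```powershell
# Assumption: 'SQLCluster01' is a hypothetical cluster name.
$cluster = 'SQLCluster01'

# Remote-updating mode: run from an Update Coordinator computer that is
# NOT a member of the cluster, to trigger an on-demand Updating Run.
Invoke-CauRun -ClusterName $cluster `
              -CauPluginName 'Microsoft.WindowsUpdatePlugin' `
              -MaxRetriesPerNode 3 `
              -RequireAllNodesOnline `
              -EnableFirewallRules `
              -Force

# Self-updating mode: add the CAU clustered role with a schedule
# (here, the second Sunday of each month, as an example).
Add-CauClusterRole -ClusterName $cluster `
                   -DaysOfWeek Sunday `
                   -WeeksOfMonth 2 `
                   -MaxRetriesPerNode 3 `
                   -EnableFirewallRules `
                   -Force

# Summary information about an Updating Run in progress.
Get-CauRun -ClusterName $cluster
```

In practice you would pick one mode per cluster as the routine approach; the two commands are shown together only for comparison.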

Configure the nodes for remote management

To use Cluster-Aware Updating, all nodes of the cluster must be configured for remote management. By default, the only task you must perform to configure the nodes for remote management is to enable a firewall rule to allow automatic restarts. The following table lists the complete remote management requirements, in case your environment diverges from the defaults. These requirements are in addition to installing the Failover Clustering feature and the Failover Clustering Tools, and to the general clustering requirements described in the previous sections of this article.

Requirement | Default state | Self-updating mode | Remote-updating mode
Enable a firewall rule to allow automatic restarts | Disabled | Required on all cluster nodes if a firewall is in use | Required on all cluster nodes if a firewall is in use
Enable Windows Management Instrumentation | Enabled | Required on all cluster nodes | Required on all cluster nodes
Enable Windows PowerShell 3.0 or 4.0 and Windows PowerShell remoting | Enabled | Required on all cluster nodes | Required on all cluster nodes to run the following: the Save-CauDebugTrace cmdlet; PowerShell pre-update and post-update scripts during an Updating Run; tests of cluster updating readiness using the Cluster-Aware Updating window or the Test-CauSetup Windows PowerShell cmdlet
Install .NET Framework 4.6 or 4.5 | Enabled | Required on all cluster nodes | Required on all cluster nodes to run the following: the Save-CauDebugTrace cmdlet; PowerShell pre-update and post-update scripts during an Updating Run; tests of cluster updating readiness using the Cluster-Aware Updating window or the Test-CauSetup Windows PowerShell cmdlet
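The readiness tests mentioned in the table can be run directly with the Test-CauSetup cmdlet. This is a minimal sketch; the cluster name SQLCluster01 is a hypothetical placeholder.

```powershell
# Assumption: 'SQLCluster01' is a hypothetical cluster name.
# Runs the CAU best-practice readiness tests against every node of the cluster.
Test-CauSetup -ClusterName 'SQLCluster01'
```

Each test in the output reports whether it passed, which helps identify the remote management requirement that still needs attention.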

Enable a firewall rule to allow automatic restarts

To allow automatic restarts after updates are applied (if the installation of an update requires a restart), if Windows Firewall or a non-Microsoft firewall is in use on the cluster nodes, a firewall rule must be enabled on each node that allows the following traffic:

  • Protocol: TCP
  • Direction: inbound
  • Program: wininit.exe
  • Ports: RPC Dynamic Ports
  • Profile: Domain

If Windows Firewall is used on the cluster nodes, you can do this by enabling the Remote Shutdown Windows Firewall rule group on each cluster node. When you use the Cluster-Aware Updating window to apply updates and to configure self-updating options, the Remote Shutdown Windows Firewall rule group is automatically enabled on each cluster node.

Note

The Remote Shutdown Windows Firewall rule group cannot be enabled when it will conflict with Group Policy settings that are configured for Windows Firewall. The Remote Shutdown firewall rule group is also enabled by specifying the -EnableFirewallRules parameter when running the following CAU cmdlets: Add-CauClusterRole, Invoke-CauRun, and Set-CauClusterRole.

The following PowerShell example shows an additional method to enable automatic restarts on a cluster node.

Set-NetFirewallRule -Group "@firewallapi.dll,-36751" -Profile Domain -Enabled true
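To confirm the state of the rule group on every node, a read-only check along the following lines can help. It is a sketch that assumes PowerShell remoting is enabled and reuses the lab node names from this article; the group identifier is the same one used in the command above.

```powershell
# Assumption: lab node names from this article; adjust for your environment.
$nodes = 'DBServ01', 'DBServ02'

# List the Remote Shutdown firewall rules and whether they are enabled.
Invoke-Command -ComputerName $nodes -ScriptBlock {
    Get-NetFirewallRule -Group '@firewallapi.dll,-36751' |
        Select-Object DisplayName, Enabled, Profile, Direction
} | Format-Table PSComputerName, DisplayName, Enabled, Profile, Direction
```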

Enable Windows Management Instrumentation (WMI)

All cluster nodes must be configured for remote management using Windows Management Instrumentation (WMI). This is enabled by default.

To manually enable remote management, do the following:

  • In the Services console, start the Windows Remote Management service and set the startup type to Automatic.
  • Run the Set-WSManQuickConfig cmdlet, or run the following command from an elevated command prompt:
winrm quickconfig -q

To support WMI remoting, if Windows Firewall is in use on the cluster nodes, the inbound firewall rule for Windows Remote Management (HTTP-In) must be enabled on each node. By default, this rule is enabled.

Enable Windows PowerShell and Windows PowerShell remoting

To enable self-updating mode and certain CAU features in remote-updating mode, PowerShell must be installed and enabled to run remote commands on all cluster nodes. By default, PowerShell is installed and enabled for remoting. To enable PowerShell remoting, use one of the following methods:

  • Run the Enable-PSRemoting cmdlet.
  • Configure a domain-level Group Policy setting for Windows Remote Management (WinRM).

For more information about enabling PowerShell remoting, see About Remote Requirements.
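A quick way to check whether remoting is reachable on each node is Test-WSMan. The loop below is a sketch that reuses the lab node names from this article; it only reports reachability of the WinRM service and does not change any configuration.

```powershell
# Assumption: lab node names from this article; adjust for your environment.
$nodes = 'DBServ01', 'DBServ02'

foreach ($node in $nodes) {
    try {
        # Test-WSMan throws (with -ErrorAction Stop) if WinRM cannot be reached.
        Test-WSMan -ComputerName $node -ErrorAction Stop | Out-Null
        "${node}: WinRM reachable"
    }
    catch {
        "${node}: WinRM not reachable ($($_.Exception.Message))"
    }
}
```

If a node is reported as not reachable, run Enable-PSRemoting on that node and check the Windows Remote Management firewall rule.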

Install .NET Framework 4.6 or 4.5

To enable self-updating mode and certain CAU features in remote-updating mode, .NET Framework 4.6 or .NET Framework 4.5 (on Windows Server 2012 R2) must be installed on all cluster nodes. By default, .NET Framework is installed.

To install .NET Framework 4.6 (or 4.5) using PowerShell if it's not already installed, use the following command:

Install-WindowsFeature -Name NET-Framework-45-Core

Best practices recommendations for using Cluster-Aware Updating

We recommend that when you begin to use CAU to apply updates with the default Microsoft.WindowsUpdatePlugin plug-in on a cluster, you stop using other methods to install software updates from Microsoft on the cluster nodes.

Caution

Combining CAU with methods that update individual nodes automatically (on a fixed time schedule) can cause unpredictable results, including interruptions in service and unplanned downtime. For optimal results, we recommend that you disable settings on the cluster nodes for automatic updating, for example, through the Automatic Updates settings in Control Panel, or in settings that are configured using Group Policy.

Caution

Automatic installation of updates on the cluster nodes can interfere with installation of updates by CAU and can cause CAU failures. If they are needed, the following Automatic Updates settings are compatible with CAU, because the administrator can control the timing of update installation:

  • Settings to notify before downloading updates and to notify before installation
  • Settings to automatically download updates and to notify before installation

However, if Automatic Updates is downloading updates at the same time as a CAU Updating Run, the Updating Run might take longer to complete.

Do not configure an update system such as Windows Server Update Services (WSUS) to apply updates automatically (on a fixed time schedule) to cluster nodes. All cluster nodes should be uniformly configured to use the same update source, for example, a WSUS server, Windows Update, or Microsoft Update.

If you use a configuration management system to apply software updates to computers on the network, exclude cluster nodes from all required or automatic updates. Examples of configuration management systems include Microsoft Endpoint Configuration Manager and Microsoft System Center Virtual Machine Manager 2008.

If internal software distribution servers (for example, WSUS servers) are used to contain and deploy the updates, ensure that those servers correctly identify the approved updates for the cluster nodes.
