SQL Server Distributed Availability Groups and Kubernetes

A while back I wrote about how to use a Cross Platform (or Clusterless) Availability Group to seed a database from a Windows SQL instance into a pod in Kubernetes.

I was talking with a colleague last week and they asked, “What if the existing Windows instance is already in an Availability Group?”

This is a fair question, as it’s fairly rare (in my experience) to run a standalone SQL instance in production…most instances are in some form of HA setup, be it a Failover Cluster Instance or an Availability Group.

Failover Cluster Instances will work with a clusterless Availability Group but it’s a different story when it comes to existing Availability Groups.

A Linux node cannot be added to an existing Windows Availability Group (trust me, I tried for longer than I’m going to admit) so the only way to do it is to use a Distributed Availability Group.

So let’s run through the process!

Here is the existing Windows Availability Group: –

Just a standard, 2 node AG with one database already synchronized across the nodes. It’s that database we are going to seed over to the pod running on the Kubernetes cluster using a Distributed Availability Group.

So here’s the Kubernetes cluster: –

kubectl get nodes

Four nodes, one control plane node and three worker nodes.

OK, so first thing to do is deploy a statefulset running one SQL Server pod (using a file called sqlserver-statefulset.yaml): –

kubectl apply -f .sqlserver-statefulset.yaml

Here’s the manifest of the statefulset: –

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mssql-statefulset
spec:
  serviceName: "mssql"
  replicas: 1
  podManagementPolicy: Parallel
  selector:
    matchLabels:
      name: mssql-pod
  template:
    metadata:
      labels:
        name: mssql-pod
      annotations:
        stork.libopenstorage.org/disableHyperconvergence: "true"
    spec:
      securityContext:
        fsGroup: 10001
      hostAliases:
      - ip: "10.225.115.129"
        hostnames:
        - "z-ap-sql-10"
      containers:
        - name: mssql-container
          image: mcr.microsoft.com/mssql/server:2022-CU15-ubuntu-20.04
          ports:
            - containerPort: 1433
              name: mssql-port
          env:
            - name: MSSQL_PID
              value: "Developer"
            - name: ACCEPT_EULA
              value: "Y"
            - name: MSSQL_AGENT_ENABLED
              value: "1"
            - name: MSSQL_ENABLE_HADR
              value: "1"
            - name: MSSQL_SA_PASSWORD
              value: "Testing1122"
          volumeMounts:
            - name: sqlsystem
              mountPath: /var/opt/mssql
            - name: sqldata
              mountPath: /var/opt/sqlserver/data
  volumeClaimTemplates:
    - metadata:
        name: sqlsystem
      spec:
        accessModes:
         - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
        storageClassName: mssql-sc
    - metadata:
        name: sqldata
      spec:
        accessModes:
         - ReadWriteOnce
        resources:
          requests:
            storage: 25Gi
        storageClassName: mssql-sc

Like my last post, this is pretty stripped down. No resources limits, tolerations etc. It has two persistent volumes, one for the system databases and one for the user databases from a storage class already configured in the cluster.

One thing to note: –

hostAliases:
- ip: "10.225.115.129"
  hostnames:
  - "z-ap-sql-10"

Here an entry in the pod’s hosts file is being created for the listener of the Windows Availability Group.

Next thing to do is deploy two services, one so that we can connect to the SQL instance (on port 1433) and one for the AG (port 5022): –

kubectl apply -f .sqlserver-services.yaml

Here’s the manifest for the services: –

apiVersion: v1
kind: Service
metadata:
  name: mssql-service
spec:
  ports:
  - name: mssql-ports
    port: 1433
    targetPort: 1433
  selector:
    name: mssql-pod
  type: LoadBalancer
---
apiVersion: v1
kind: Service
metadata:
  name: mssql-ha-service
spec:
  ports:
  - name: mssql-ha-ports
    port: 5022
    targetPort: 5022
  selector:
    name: mssql-pod
  type: LoadBalancer

NOTE – We could use just one service with multiple ports configured but I’m keeping them separate here to try and keep things as clear as possible.

Check that everything looks OK: –

kubectl get all

Now we need to create master key, login, and user in all instances: –

CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<C0m9L3xP@55w0rd!>';
CREATE LOGIN [dbm_login] WITH PASSWORD = '<C0m9L3xP@55w0rd!>';
CREATE USER dbm_user FOR LOGIN dbm_login;

Then create a certificate in the SQL instance in the pod: –

CREATE CERTIFICATE dbm_certificate WITH SUBJECT = 'Mirroring_certificate', EXPIRY_DATE = '20301031'

Backup that certificate: –

BACKUP CERTIFICATE dbm_certificate
TO FILE = '/var/opt/mssql/data/dbm_certificate.cer'
WITH PRIVATE KEY (
        FILE = '/var/opt/mssql/data/dbm_certificate.pvk',
        ENCRYPTION BY PASSWORD = '<C0m9L3xP@55w0rd!>'
    );

Copy the certificate locally: –

kubectl cp mssql-statefulset-0:var/opt/mssql/data/dbm_certificate.cer ./dbm_certificate.cer -n prod
kubectl cp mssql-statefulset-0:var/opt/mssql/data/dbm_certificate.pvk ./dbm_certificate.pvk -n prod

And then copy the files to the Windows boxes: –

Copy-Item dbm_certificate.cer \z-ap-sql-02E$SQLBackup1 -Force
Copy-Item dbm_certificate.pvk \z-ap-sql-02E$SQLBackup1 -Force
Copy-Item dbm_certificate.cer \z-ap-sql-03E$SQLBackup1 -Force
Copy-Item dbm_certificate.pvk \z-ap-sql-03E$SQLBackup1 -Force

Once the files are on the Windows boxes, we can create the certificate in each Windows SQL instance: –

CREATE CERTIFICATE dbm_certificate
    AUTHORIZATION dbm_user
    FROM FILE = 'E:SQLBackup1dbm_certificate.cer'
    WITH PRIVATE KEY (
    FILE = 'E:SQLBackup1dbm_certificate.pvk',
    DECRYPTION BY PASSWORD = ''
)

OK, great! Now we need to create a mirroring endpoint in the SQL instance in the pod: –

CREATE ENDPOINT [Hadr_endpoint]
    STATE = STARTED
    AS TCP (   
        LISTENER_PORT = 5022,
        LISTENER_IP = ALL)
    FOR DATA_MIRRORING (
        ROLE = ALL,
        AUTHENTICATION = WINDOWS CERTIFICATE [dbm_certificate],
        ENCRYPTION = REQUIRED ALGORITHM AES
        );
ALTER ENDPOINT [Hadr_endpoint] STATE = STARTED;
GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [dbm_login];

There are already endpoints in the Windows instances, but we need to update them to use the certificate for authentication: –

ALTER ENDPOINT [Hadr_endpoint]
    STATE = STARTED
    AS TCP (   
        LISTENER_PORT = 5022,
        LISTENER_IP = ALL)
    FOR DATABASE_MIRRORING (
    AUTHENTICATION = WINDOWS CERTIFICATE [dbm_certificate]
    , ENCRYPTION = REQUIRED ALGORITHM AES
    );
GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [dbm_login];

Now we can create a one node Clusterless Availability Group in the SQL instance in the pod: –

CREATE AVAILABILITY GROUP [AG2]
   WITH (CLUSTER_TYPE=NONE)
   FOR
   REPLICA ON   
      'mssql-statefulset-0' WITH   
         (  
         ENDPOINT_URL = 'TCP://mssql-statefulset-0.com:5022',   
 FAILOVER_MODE = MANUAL
,AVAILABILITY_MODE = SYNCHRONOUS_COMMIT
,BACKUP_PRIORITY = 50
,SEEDING_MODE = AUTOMATIC
,SECONDARY_ROLE(ALLOW_CONNECTIONS = NO)
)

No listener here, we are going to use the mssql-ha-service as the endpoint for the Distributed Availability Group.

OK, so on the primary node of the Windows Availability Group, we can create the Distributed Availability Group: –

CREATE AVAILABILITY GROUP [DistributedAG]  
   WITH (DISTRIBUTED)   
   AVAILABILITY GROUP ON  
      'AG1' WITH    
      (   
         LISTENER_URL = 'tcp://Z-AP-SQL-10:5022',    
         AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,   
         FAILOVER_MODE = MANUAL,   
         SEEDING_MODE = AUTOMATIC   
      ),   
      'AG2' WITH    
      (   
         LISTENER_URL = 'tcp://10.225.115.131:5022',   
         AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,   
         FAILOVER_MODE = MANUAL,   
         SEEDING_MODE = AUTOMATIC   
      );

We could use a host file entry for the URL in AG2 (I did that in the previous post) but here we’ll just use the IP address of the mssql-ha-service.

OK, nearly there! We now have to join the Availability Group in the SQL instance in the pod: –

ALTER AVAILABILITY GROUP [DistributedAG]  
   JOIN   
   AVAILABILITY GROUP ON  
      'AG1' WITH    
      (   
         LISTENER_URL = 'tcp://Z-AP-SQL-10:5022',    
         AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,   
         FAILOVER_MODE = MANUAL,   
         SEEDING_MODE = AUTOMATIC   
      ),   
      'AG2' WITH    
      (   
         LISTENER_URL = 'tcp://10.225.115.131:5022',   
         AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,   
         FAILOVER_MODE = MANUAL,   
         SEEDING_MODE = AUTOMATIC   
      );

And that should be it! If we now connect to the SQL instance in the pod…the database is there!

There it is! OK, one thing I haven’t gone through here is how to get auto-seeding working from Windows into a Linux SQL instance…but I went through the process in my previous post here.

So that’s how to seed a database from a SQL instance that is in a Windows Availability Group into a SQL instance running in a pod in a Kubernetes cluster using a Distributed Availability Group!

Phew! Thanks for reading!

SQL Server Distributed Availability Groups and Kubernetes

Rate

Share

Share

Rate