SQL AlwaysOn FCI in Azure IaaS Cloud with StarWind Virtual SAN Solution

1 – Introduction

This article explains how to deploy a SQL AlwaysOn Failover Cluster Instance (FCI) in the Azure IaaS cloud in SAN-less mode (without shared storage such as a SAN or NAS) using the StarWind Virtual SAN solution.

StarWind Virtual SAN can present a shared volume to a WSFC cluster from two or more nodes without a physical shared storage solution (SAN, NAS …). The DAS storage (physical or virtual) of each node is used to create a clustered volume managed by Virtual SAN. It can be used for Hyper-V clusters, SQL clusters, VMware clusters …

For SQL, the advantage of this solution is that it makes it possible to deploy a SQL AlwaysOn Failover Cluster Instance (which only requires SQL Server Standard licenses) instead of a SQL AlwaysOn Availability Group cluster (which requires SQL Server Enterprise licenses, so it is more expensive).

Links:

StarWind Virtual SAN: https://www.starwindsoftware.com/starwind-virtual-san/fr

StarWind Resource Library: https://www.starwindsoftware.com/resource-library

Azure Marketplace VM: https://azure.microsoft.com/en-us/marketplace/partners/starwind/starwindvirtualsan-starwindbyol/

Overview of Architecture:

Schema - Azure - SQL AlwaysOn FCI & Virtual SAN - Overview

I will deploy three VMs in Azure:

  • One AD DC/DNS Server used to create a forest.
  • Two SQL Server nodes in cluster (with SQL Server 2014 STANDARD Edition).

Note:

  • In Azure, you can also directly use the Azure SQL Database service (based on AlwaysOn); in that case the infrastructure is managed by Azure (PaaS mode).
  • You can also deploy a SQL AlwaysOn Availability Group, but this feature requires SQL Server ENTERPRISE licenses (more expensive).

SQL VM Prerequisites:

To deploy StarWind Virtual SAN Solution, each VM must have:

  • Two NICs (a minimum of two subnets is required: one for data synchronization and one for heartbeat)
  • One dedicated VHDX (this is not strictly required, as you could create the shared volume on the system volume (C:\) of each VM, but that is not a best practice).

StarWind Virtual SAN overview:

Virtual SAN must be installed on all nodes that will participate in the cluster.

On each node a Virtual SAN volume is created (file extension: SWDSK); Virtual SAN replicates this volume between the two nodes (synchronous replication). The Virtual SAN volume is then presented to the cluster nodes through the iSCSI protocol, using MPIO.

Note about cache: different cache modes can be applied to a Virtual SAN volume. This part is not covered in this article, but for example you can configure a cache on an SSD disk on each node to increase your IOPS.

You can also configure a Virtual SAN volume in Thick or Thin-provisioned mode.

Several Virtual SAN volumes can be created on the same underlying volume; this is just a question of performance and usage.

Configuration:

In this article, each SQL VM will be configured with two VHDX (one for the system and one to host the Virtual SAN volumes). I will configure two Virtual SAN volumes: one dedicated to the cluster quorum and one dedicated to the SQL FCI data (DB + LOG).

Overview of the Virtual SAN clustered disks and iSCSI configuration:

Schema - Azure - SQL AlwaysOn FCI & Virtual SAN - iSCSI

1.1 – Azure environment preparation

The environment will be composed of:

  • 1x Azure subscription (as a reminder, you can create a trial account with €150 of credit available for 30 days).
  • 1x Azure Resource Group:

A Resource Group (RG) is a logical container used to group the Azure resources associated with an application. It provides centralized management and monitoring of these resources (lifecycle, cost calculation, provisioning, access control …).

name | type | location
RG-TCLAB1 | Resource Group | West Europe
  • 1x Azure Storage Account (required to host VM VHDX):

name | type | resource group | account type
tclab1storage | Storage Account | RG-TCLAB1 | Standard-LRS (Locally Redundant)
  • 1x Virtual Network (VNET) with three subnets:

name | type | resource group | address space
tc-lab1-lan | Virtual Network | RG-TCLAB1 | 172.16.0.0/16

subnet | address space | description
Prod | 172.16.0.0/24 | PROD subnet
Gateway (Azure) | 172.16.1.0/29 | Used for VPN (P2S or S2S)
Heartbeat | 172.16.10.0/24 | Cluster / Virtual SAN heartbeat
  • 2x Cloud Services. As a reminder, all VMs in a Cloud Service must have the same number of NICs; with two Cloud Services, I don't need to create the AD DC VM with the Heartbeat subnet. In addition, Cloud Services provide scalability options.

name | type | resource group | description
tc-lab1-cs | Cloud Service | RG-TCLAB1 | Used for basic servers (AD DC …)
tc-lab1-cs-sqlsrv | Cloud Service | RG-TCLAB1 | Used for SQL servers
  • 3x Virtual Machines

name | type | resource group | dns name | private IP | size | description
l1-dc-1 | Virtual Machine | RG-TCLAB1 | tc-lab1-cs.cloudapp.net | 172.16.0.4 | Basic A0 (0.25 core, 0.75 GB) | AD DC / DNS server
l1-sqlfci-1 | Virtual Machine | RG-TCLAB1 | tc-lab1-cs-sqlsrv.cloudapp.net | 172.16.0.5 | Standard A3 (4 cores, 7 GB) | SQL AlwaysOn FCI node 1
l1-sqlfci-2 | Virtual Machine | RG-TCLAB1 | tc-lab1-cs-sqlsrv.cloudapp.net | 172.16.0.6 | Standard A3 (4 cores, 7 GB) | SQL AlwaysOn FCI node 2
  • The two SQL nodes will be created with two vNICs and two VHDX each.

If you are new to Azure, read my detailed article, which explains all the steps to create the Azure environment (VM creation, Virtual Network configuration …): How to create an Azure Cloud environment.
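
If you prefer scripting, here is a minimal sketch of how one SQL node described above could be created with the classic (ASM) Azure PowerShell module. Only the names from the tables above come from this lab; the subscription name, image selection, admin credentials and data-disk size are placeholders to adapt.

# Sketch (classic/ASM Azure PowerShell) - subscription, image, credentials and disk size are placeholders
Add-AzureAccount
Select-AzureSubscription -SubscriptionName "MySubscription"

# Storage account used to host the VHD files
New-AzureStorageAccount -StorageAccountName "tclab1storage" -Location "West Europe" -Type "Standard_LRS"
Set-AzureSubscription -SubscriptionName "MySubscription" -CurrentStorageAccountName "tclab1storage"

# Cloud Service dedicated to the SQL nodes
New-AzureService -ServiceName "tc-lab1-cs-sqlsrv" -Location "West Europe"

# Pick an image (placeholder: latest Windows Server 2012 R2 Datacenter)
$image = (Get-AzureVMImage | Where-Object { $_.ImageFamily -eq "Windows Server 2012 R2 Datacenter" } |
          Sort-Object PublishedDate -Descending | Select-Object -First 1).ImageName

# First SQL node: Prod NIC (static IP), Heartbeat NIC and a dedicated data disk for StarWind
# "Large" = Standard A3 (4 cores, 7 GB); this size supports a second NIC
$vm = New-AzureVMConfig -Name "l1-sqlfci-1" -InstanceSize "Large" -ImageName $image |
      Add-AzureProvisioningConfig -Windows -AdminUsername "labadmin" -Password "<password>" |
      Set-AzureSubnet -SubnetNames "Prod" |
      Set-AzureStaticVNetIP -IPAddress "172.16.0.5" |
      Add-AzureNetworkInterfaceConfig -Name "NIC-Heartbeat" -SubnetName "Heartbeat" -StaticVNetIPAddress "172.16.10.5" |
      Add-AzureDataDisk -CreateNew -DiskSizeInGB 128 -DiskLabel "StarWind" -LUN 0

New-AzureVM -ServiceName "tc-lab1-cs-sqlsrv" -VNetName "tc-lab1-lan" -VMs $vm

Repeat with the second node's names and IPs (l1-sqlfci-2, 172.16.0.6 / 172.16.10.6) to build its twin.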

2 – Installation of StarWind Virtual SAN

In this part I will configure two virtual volumes (replicated between the two SQL nodes). At the end, both volumes can be added to the WSFC cluster.

You can use the procedure below to configure CSV disks for a Hyper-V cluster or any other WSFC cluster.

Note: you can install only the “Virtual SAN” component on the servers that will participate in the replication, and install the “Management Console” component on an administration server or client.

On the first SQL Server, launch the Setup, select “StarWind Virtual SAN Service” and check the “Management Console”:

Enter your license:

https://www.starwindsoftware.com/registration-starwind-virtual-san

Click Finish:

Note that a Firewall Rule is created:

During the first start, the Management Console asks you to configure the storage pool for the virtual disks:

Select the disk previously prepared:

Repeat the operation on the second SQL Server and close the console.

2.1 – Virtual Disk “Quorum” – Creation

In the Management Console, select the first SQL server and click “Connect”:

Click the “Add Device (advanced)“:


Select “Hard Disk Device”:


Select Virtual disk and click “Next“.


Check the virtual disk location, change the name and specify the size:


Select “Thick-provisioned”:

Configure the cache policy and specify the cache size.

Note: StarWind recommends using 1 GB of cache per 1 TB of storage.



Define the L2 cache policy if needed.

Note: StarWind recommends using SSDs for the L2 cache; if an L2 cache is used, the formula becomes 1 GB of cache (L1 + L2 combined) per 1 TB of storage. For example, a 2 TB Virtual SAN volume would get a total of 2 GB of L1 + L2 cache.

Enter a “Target Alias”, if you want you can change the Target Name:


Click “Create”:

Wait for completion and click Close:

Note that on the disk, two files (.img and .swdsk) are created:

2.2 – Virtual Disk “Quorum” – Configure Replication

Select the device you just created and click “Replication Manager“.


In the “Replication Manager” Window click “Add Replica“. Select “Synchronous two-way replication” and click “Next“:

Enter the FQDN of the second SQL server:

Select the target disk for the second SQL Server:

Same disk as the first SQL Server:

Choose “Create new Partner Device” and click “Next”.

Leave the default options (check the drive letter).

Click “Change network settings“:


Select one network for “Synchronization and HA” and the other network for “Heartbeat”:

Note: if you have more than two networks, you can configure several networks for the Synchronization/HA traffic (or for the Heartbeat).


You can also modify the ALUA (Asymmetric Logical Unit Access, the multipathing method) settings, but it's recommended to keep “ALUA Optimized” for both targets.


Click “Create Replica”:

Click “Close”:

Wait for the end of Synchronization and click “Close”:

In the Management Console, add the second SQL Server so that it can be managed too:

Click “Add Server”:

Enter its FQDN:

Click “Connect”:

Now we can see the status of our replicated virtual disk. You can see that the second SQL server's priority is set to “Second”, and you can retrieve the local path of the virtual disk:

Note that the “Image1” name has changed to “HAImage1”.

2.3 – Virtual Disk “SQL Data” – Creation & Replication Configuration

Repeat the procedure to create the disk for “SQL DATA”:

Then configure Replication:

Wait for Synchronization and click “Close”:

Now the two disks are ready; after the iSCSI configuration we will be able to add them to the WSFC cluster:

3 – Enable iSCSI MPIO

To allow iSCSI multipathing (the configuration of several paths between the iSCSI initiator and the target), we must configure the MPIO driver (installed previously) on both nodes.

Start the MPIO Console: mpiocpl.exe

Go to “Discover Multi-Paths”, select “Add support for iSCSI devices” and click “Add”:

Restart the computer.

After the restart, re-run “mpiocpl” and validate that “MSFT2005iSCSIBusType_0x9” has been added to “Devices”:
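
If you prefer to script this step, here is a minimal sketch using the built-in MPIO PowerShell cmdlets, assuming the nodes run Windows Server 2012 or later (run it on both SQL nodes; a reboot is still required):

# Install the MPIO feature
Install-WindowsFeature -Name Multipath-IO

# Equivalent of checking "Add support for iSCSI devices" in mpiocpl.exe
Enable-MSDSMAutomaticClaim -BusType iSCSI

Restart-Computer

# After the reboot, the claimed hardware ID (vendor MSFT2005, product iSCSIBusType_0x9) should be listed:
Get-MSDSMSupportedHW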

4 – Configure iSCSI

The last step is to configure the iSCSI initiator on both nodes to present the two target disks in multipath mode.

On each SQL node, the following targets will be connected: 2x iSCSI targets for the Quorum disk + 2x iSCSI targets for the SQL-Data disk.

Reminder:

Hostname | IP Prod | IP Heartbeat
l1-sqlfci-1 | 172.16.0.5 | 172.16.10.5
l1-sqlfci-2 | 172.16.0.6 | 172.16.10.6
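
Before configuring iSCSI, you can quickly check that both subnets are reachable between the nodes. A small sketch run from l1-sqlfci-1, assuming the nodes run Windows Server 2012 R2 (Test-NetConnection) and that RDP (TCP 3389) is enabled in the guest, which is the default on Azure VMs:

# Both the Prod and Heartbeat IPs of the partner node should answer
Test-NetConnection -ComputerName 172.16.0.6 -Port 3389    # Prod subnet
Test-NetConnection -ComputerName 172.16.10.6 -Port 3389   # Heartbeat subnet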

4.1 – Present Disks to SQL Node 1

On the first SQL Server (L1-SQLFCI-1), start the “iSCSI initiator” configuration: iscsicpl

Configure Discovery

Go to “Discovery” tab and click “Discover Portal…”:

First add the host itself: enter the loopback address and click “Advanced”:

Select “Microsoft iSCSI Initiator” and set the Initiator IP to “Default”:

Next repeat the procedure to add the other SQL Server:

Enter the IP address of the second SQL Server, click “Advanced” and set the Initiator IP to the local IP of the server.

For more high availability, you can also add the Heartbeat network:

Connect Targets

Go to the “Targets” tab. You can see that the two disks are listed (two different paths) and that the connections are “Inactive”.

Select the first target on the local server itself and click “Connect”:

Check “Enable multi-path”, click “Advanced”, then set the Initiator IP to “Default” and the Target Portal to the loopback address:

Repeat the procedure for the second path of the Quorum disk. Set the Initiator IP to the local IP of the server and the Target Portal IP to the IP of the other SQL Server:

Repeat the procedure for the second iSCSI target (the SQL-Data disk): the first path on the local server and the second path on the other server. At the end, the status of all targets must be “Connected”:

Configure MPIO

Select the first Target of the “Quorum disk” and click “Devices…”

Click “MPIO”:

By default the “Load balance policy” is configured to “Round Robin”.

Change it to “Fail Over Only” and check that the active path is the localhost: select the active path, click “Details” and check the Source/Target portals:

Repeat the same procedure for the second Target “SQLData”:
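
The same configuration can be scripted with the built-in iSCSI cmdlets. Here is a minimal sketch for the first node (swap the IP addresses for the second node); the target IQNs are placeholders, retrieve the real ones with Get-IscsiTarget:

# Discovery portals: the node itself (loopback) and the partner node
New-IscsiTargetPortal -TargetPortalAddress "127.0.0.1"
New-IscsiTargetPortal -TargetPortalAddress "172.16.0.6" -InitiatorPortalAddress "172.16.0.5"

# List the discovered targets (Quorum and SQL-Data) and their IQNs
Get-IscsiTarget | Format-Table NodeAddress, IsConnected -AutoSize

# Local path of a disk: through the loopback portal
Connect-IscsiTarget -NodeAddress "<IQN-of-local-target>" -TargetPortalAddress "127.0.0.1" -IsMultipathEnabled $true -IsPersistent $true

# Partner path of the same disk: through the other node, sourced from the local Prod IP
Connect-IscsiTarget -NodeAddress "<IQN-of-partner-target>" -TargetPortalAddress "172.16.0.6" -InitiatorPortalAddress "172.16.0.5" -IsMultipathEnabled $true -IsPersistent $true

# Optional: make "Fail Over Only" the default MPIO policy instead of setting it per device
Set-MSDSMGlobalDefaultLoadBalancePolicy -Policy FOO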

4.2 – Present Disks to SQL Node 2

Repeat the full procedure to configure the iSCSI targets on the second SQL Server.

Configure Discovery:

Connect all Targets (for each disk, one locally and the second to the other server):

Configure the MPIO “Load Balance Policy” to “Fail Over Only” on both targets:

Quorum Disk:

SQLData Disk:

5 – Prepare Disk

Now we can see that the two volumes are visible to both SQL nodes (with multipath):

On one node, initialize both disks:
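
A minimal sketch of this step in PowerShell, assuming the two StarWind LUNs are the only RAW disks visible on the node:

# Run on one node only: bring the new LUNs online and initialize them
Get-Disk | Where-Object { $_.PartitionStyle -eq 'RAW' } | ForEach-Object {
    Set-Disk -Number $_.Number -IsOffline $false
    Set-Disk -Number $_.Number -IsReadOnly $false
    Initialize-Disk -Number $_.Number -PartitionStyle GPT
}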

Now we are ready to mount the SQL Cluster!

6 – Create the WSFC Cluster

Go to the first node, start the WSFC console and select “Validate Cluster Configuration”:

Add the two SQL Nodes:

Run all tests, check “Create the cluster now…” and click Finish:

Enter the Cluster Name:

Note that the Cluster IP is configured in DHCP Mode.

Uncheck “Add all eligible storage to cluster”:

Click Finish:

At the end of the cluster configuration, there is an error:

This is “normal” on Azure. The issue is due to the DHCP mode of the Cluster IP: the cluster retrieves the same IP address as the node where the cluster is created, so there is an IP conflict.

Go to the second SQL node and start the WSFC Console:

Edit the cluster core “IP Address” resource:

Change the IP to a static IP (for the moment there is no way to reserve it in Azure):

Click “Yes”:
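
The same fix can be done from PowerShell. A minimal sketch, assuming the core resource is named "Cluster IP Address" and that 172.16.0.100 is a free IP in the PROD subnet (adjust both to your environment):

Get-ClusterResource "Cluster IP Address" | Set-ClusterParameter -Multiple @{Address="172.16.0.100";SubnetMask="255.255.255.0";EnableDhcp=0}
Stop-ClusterResource "Cluster IP Address"
Start-ClusterGroup "Cluster Group"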

6.1 – Configure Network

Go to Networks, rename them and check the configuration (Cluster Use):

6.2 – Add Disk to cluster

Go to Storage and select “Add Disk”:

Add the two Virtual Disks (managed by StarWind Virtual SAN):

Start the “Server Manager” console and create a new volume on each of these disks:

Create Quorum Volume (Q:):

Create “SQL Data” Volume (G:):
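
A minimal scripted equivalent of this step; the disk numbers are examples (check Get-Disk), and the volumes must be created on the node that currently owns the cluster disks:

# Add the two StarWind disks to the cluster
Get-ClusterAvailableDisk | Add-ClusterDisk

# Create the Q: (Quorum) and G: (SQL Data) volumes - you can also format the disks before adding them to the cluster
New-Partition -DiskNumber 2 -DriveLetter Q -UseMaximumSize
Format-Volume -DriveLetter Q -FileSystem NTFS -NewFileSystemLabel "Quorum" -Confirm:$false
New-Partition -DiskNumber 3 -DriveLetter G -UseMaximumSize
Format-Volume -DriveLetter G -FileSystem NTFS -NewFileSystemLabel "SQLData" -Confirm:$false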

6.3 – Configure Quorum

Edit “Cluster Quorum Settings”:

Select “Advanced”:

Keep “All Nodes” for “Voting Configuration” and select “Configure a disk witness”:

Select the Q: Volume and click “Finish”:

So now we have one disk used for the cluster quorum and one disk available for SQL Server:
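
A minimal PowerShell equivalent of the quorum configuration above, assuming Windows Server 2012 R2 and that the quorum disk was added to the cluster as "Cluster Disk 1" (check Get-ClusterResource for the real name):

Set-ClusterQuorum -DiskWitness "Cluster Disk 1"
# Verify the result
Get-ClusterQuorum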

7 – SQL Server – Install first Node

On the first node, mount the SQL Server 2014 (or 2012) Standard ISO and select “Installation\New SQL Server failover cluster installation”:

Select “Database Engine Services” and “Management Tools – Basic”:

Select “Default instance” and enter a SQL Server Network Name:

Keep or change the SQL Cluster group:

Select the Cluster Disk:

Configure an IP Address for the SQL cluster:

Configure your service accounts:

Set “Collation”:

Configure your Authentication mode and Administrators Group:

Configure the SQL paths on the cluster disk (except for the TempDB: select a local path). Normally you should place TempDB, logs, etc. on separate disks:

Click “Yes”:

Start Installation:

Wait for installation and click “Close”:

Now, the SQL Cluster Instance is ready with the Clustered disk:
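
As a side note, the same installation can be performed unattended with the SQL Server setup command line. A minimal sketch, where the domain, service accounts, passwords, cluster disk name and data path are examples to adapt to your environment (the network name and IP match this lab):

# Run from the mounted SQL Server ISO, in an elevated PowerShell console
.\Setup.exe /QS /ACTION=InstallFailoverCluster /IACCEPTSQLSERVERLICENSETERMS `
  /FEATURES=SQLENGINE /INSTANCENAME=MSSQLSERVER `
  /FAILOVERCLUSTERNETWORKNAME="sqlfci-1" `
  /FAILOVERCLUSTERIPADDRESSES="IPv4;172.16.0.115;Cluster Network - PROD;255.255.255.0" `
  /FAILOVERCLUSTERDISKS="Cluster Disk 2" `
  /INSTALLSQLDATADIR="G:\MSSQL" `
  /SQLSVCACCOUNT="LAB\svc-sql" /SQLSVCPASSWORD="<password>" `
  /AGTSVCACCOUNT="LAB\svc-sqlagent" /AGTSVCPASSWORD="<password>" `
  /SQLSYSADMINACCOUNTS="LAB\Domain Admins"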

8 – SQL Server – Install the second Node

Go to the second node, mount SQL ISO and select “Add node to a SQL Server failover cluster”:

Check the Cluster Node and Network Configuration:

Configure Service Accounts:

Click “Install” and wait for completion:
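
The unattended equivalent for the second node uses /ACTION=AddNode; again, the service accounts and passwords below are examples:

.\Setup.exe /QS /ACTION=AddNode /IACCEPTSQLSERVERLICENSETERMS `
  /INSTANCENAME=MSSQLSERVER `
  /SQLSVCACCOUNT="LAB\svc-sql" /SQLSVCPASSWORD="<password>" `
  /AGTSVCACCOUNT="LAB\svc-sqlagent" /AGTSVCPASSWORD="<password>" `
  /CONFIRMIPDEPENDENCYCHANGE=0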

9 – Connect to Instance

OK, so now if you try to connect to the instance directly from a node, it works:

But if you try to connect from a client, you get an error:

This is normal: in Azure you cannot connect directly to a cluster IP, you have to configure an ILB (Internal Load Balancer). To access the SQL cluster instance, clients must connect to the ILB instead of the cluster IP.

10 – Create an Internal Load Balancer (ILB)

Start “Azure PowerShell” and run:

1 – Create the ILB:

Change the variables to match your environment and choose a load balanced set name.

$SQLClusterName = "sqlfci-1"
$SQLClusterIP = "172.16.0.115"
$CloudServiceSQL = "tc-lab1-cs-sqlsrv"
$LBSetName = "LBS-SQLFCI"
# Create the ILB in the SQL Cloud Service; the subnet name must match the subnet hosting the SQL nodes (the PROD subnet in this lab)
Add-AzureInternalLoadBalancer -InternalLoadBalancerName $SQLClusterName -SubnetName "Prod" -ServiceName $CloudServiceSQL -StaticVNetIPAddress $SQLClusterIP

Note: Check ILB in a Cloud Service

Get-AzureInternalLoadBalancer -ServiceName "tc-lab1-cs-sqlsrv" -Verbose:$false

Note: Get VM Azure Endpoint

Get-AzureVM -Name "l1-sqlfci-1" -ServiceName "tc-lab1-cs-sqlsrv" -Verbose:$false | Get-AzureEndpoint | ft Name,protocol,port,localport,ProbeProtocol,ProbePort, ProbeIntervalInSeconds,InternalLoadBalancerName -AutoSize

By default, a PowerShell endpoint and an RDP endpoint are created on each VM:

2 – Add load balanced endpoint to the first cluster SQL node:

Note: choose your own probe port (here: 31433). The same probe port must be configured on the endpoint of both VMs and on the SQL IP Address cluster resource.

Get-AzureVM -ServiceName $CloudServiceSQL -Name "l1-sqlfci-1" | Add-AzureEndpoint -Name "SQL" -LBSetName $LBSetName -Protocol "TCP" -LocalPort 1433 -PublicPort 1433 -ProbePort 31433 -ProbeProtocol tcp -ProbeIntervalInSeconds 10 -DirectServerReturn $true -InternalLoadBalancerName $SQLClusterName | Update-AzureVM

Now if we check Endpoints:

3 – Add load balanced endpoint to the second SQL cluster node:

Get-AzureVM -ServiceName $CloudServiceSQL -Name "l1-sqlfci-2" | Add-AzureEndpoint -Name "SQL" -LBSetName $LBSetName -Protocol "TCP" -LocalPort 1433 -PublicPort 1433 -ProbePort 31433 -ProbeProtocol tcp -ProbeIntervalInSeconds 10 -DirectServerReturn $true -InternalLoadBalancerName $SQLClusterName | Update-AzureVM

MSDN Links:

  • LoadBalancerProbe Schema: https://msdn.microsoft.com/en-us/library/azure/jj151530.aspx
  • Add-AzureEndpoint: https://msdn.microsoft.com/en-us/library/azure/dn495300.aspx

NOTE – View Load Balanced Set configuration through the Azure Portal:

Edit VM Settings. You can see the new “Load-Balanced” endpoint:

Go to “Load balanced sets”; you can see the ILB:

In addition, you can edit the Load Balanced set:

View of the member VMs; you can also manage ACLs:

10.1 – Configure SQL IP address Cluster Resource

Now the last step is to configure the cluster.

As a reminder, during the SQL setup I set a static IP address (172.16.0.115) on the SQL Server instance role, and I configured the ILB with the same IP. The last step is to add the probe port defined in the ILB to the SQL IP Address cluster resource.

On a cluster node, start a PowerShell console with elevated privileges. Retrieve the name of the “IP Address” resource of the SQL Server cluster group:

Configure Probe Port (here 31433) on the SQL IP Address cluster resource:

Get-ClusterResource "SQL IP Address 1 (sqlfci-1)" | Set-ClusterParameter -Multiple @{Address="172.16.0.115";ProbePort="31433";SubnetMask="255.255.255.255";Network="Cluster Network - PROD";OverrideAddressMatch=1;EnableDhcp=0}

Check the SQL IP Address cluster resource configuration:

Get-ClusterResource "SQL IP Address 1 (sqlfci-1)" | Get-ClusterParameter

Note: the probe port's job is to find out which node currently hosts the IP address (SQL role) in the cluster. The load balancer sends probe pings over TCP port 31433 to every node in the cluster (by default every 10 seconds).

For more information about ILB and Probe Port configuration, read this excellent article: https://blogs.technet.com/b/askcore/archive/2015/06/24/building-windows-server-failover-cluster-on-azure-iaas-vm-part-2-network.aspx

Restart the SQL Server Cluster Role:
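
A minimal PowerShell equivalent, assuming a default instance (the cluster group is then named "SQL Server (MSSQLSERVER)"; check Get-ClusterGroup otherwise):

# Restart the SQL Server role so it picks up the new ProbePort setting
Stop-ClusterGroup -Name "SQL Server (MSSQLSERVER)"
Start-ClusterGroup -Name "SQL Server (MSSQLSERVER)"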

Now you can connect to the SQL cluster instance from a client.

