Understanding Azure Service Level Agreements and the Resource Requirements Part 1 – Virtual Machines

Achieving High Availability and Uptime with Virtual Machines

Introduction

In today’s mission-critical environments, uptime is non-negotiable. Azure’s Service Level Agreements (SLAs) for Virtual Machines commit to 99.95% (or higher) availability, but only for deployments that meet specific architectural requirements. In this installment, we explore those requirements for Azure Virtual Machines. We’ll look at deploying VMs within Availability Sets or across Availability Zones, setting up load-balancing configurations, and implementing redundancy and auto-scaling. Whether you prefer a hands-on approach using the Azure Portal or an Infrastructure-as-Code solution with Bicep, we’ve got you covered with step-by-step guidance.

Focus Areas

  • Understanding the Baseline SLA for Virtual Machines:
    For a single-instance VM, the storage type directly determines the baseline SLA. A VM using Standard HDD managed disks has an SLA of 95%. Using Standard SSD for all disks raises that to 99.5%, and using Premium SSD (or Ultra Disk) for all OS and data disks raises it to 99.9%. These figures apply to one VM on its own, with no redundancy. In practical terms, over a 30-day month 95% allows up to 36 hours of downtime, while 99.9% allows about 43 minutes.

  • Enhancing SLAs Through Deployment Strategies:
    Beyond storage selection, the overall architecture can further elevate uptime:

    • Availability Sets vs. Availability Zones:
      Deploying two or more VMs within an Availability Set spreads the instances across multiple fault and update domains and qualifies for an SLA of 99.95%. Taking it a step further, distributing two or more VMs across Availability Zones in the same region raises the SLA to 99.99%. Each zone consists of one or more physically separate datacenters with independent power, cooling, and networking, which minimizes the risk of a single point of failure.

    • Load-Balancing Configurations:
      A load balancer distributes traffic among multiple VM instances, so if one instance goes offline due to maintenance or unexpected failure, its health probes detect the problem and traffic is routed to the remaining instances. Combined with either Availability Sets or Zones, load balancing is what lets the application benefit from the redundant instances and keep operating through an instance failure.

    • Redundancy and Auto-Scaling:
      Building redundancy by deploying multiple VM instances and enabling auto-scaling ensures that your system dynamically adjusts to demand and recovers from outages. This additional layer of resiliency further solidifies the high availability of your services.
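As a rough illustration of the auto-scaling point above, the following Bicep sketch attaches a CPU-based autoscale setting to a Virtual Machine Scale Set. It assumes a scale set already exists and is passed in through the hypothetical vmssResourceId parameter; the setting name, instance limits, and CPU thresholds are illustrative values only.

```bicep
// Hypothetical parameter: resource ID of an existing Virtual Machine Scale Set
param vmssResourceId string

param location string = resourceGroup().location

resource cpuAutoscale 'Microsoft.Insights/autoscaleSettings@2022-10-01' = {
  name: 'cpu-autoscale' // illustrative name
  location: location
  properties: {
    enabled: true
    targetResourceUri: vmssResourceId
    profiles: [
      {
        name: 'default'
        capacity: {
          minimum: '2' // never drop below two instances, to keep redundancy
          maximum: '5'
          default: '2'
        }
        rules: [
          {
            // Scale out by one instance when average CPU exceeds 70% over 5 minutes
            metricTrigger: {
              metricName: 'Percentage CPU'
              metricResourceUri: vmssResourceId
              timeGrain: 'PT1M'
              statistic: 'Average'
              timeWindow: 'PT5M'
              timeAggregation: 'Average'
              operator: 'GreaterThan'
              threshold: 70
            }
            scaleAction: {
              direction: 'Increase'
              type: 'ChangeCount'
              value: '1'
              cooldown: 'PT5M'
            }
          }
          {
            // Scale in by one instance when average CPU falls below 30% over 5 minutes
            metricTrigger: {
              metricName: 'Percentage CPU'
              metricResourceUri: vmssResourceId
              timeGrain: 'PT1M'
              statistic: 'Average'
              timeWindow: 'PT5M'
              timeAggregation: 'Average'
              operator: 'LessThan'
              threshold: 30
            }
            scaleAction: {
              direction: 'Decrease'
              type: 'ChangeCount'
              value: '1'
              cooldown: 'PT5M'
            }
          }
        ]
      }
    ]
  }
}
```

Keeping the minimum at two instances means scale-in never removes the redundancy that the availability SLA depends on.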

For example: a single VM on Standard HDD carries only a 95% SLA. Upgrading to Premium SSD improves that baseline to 99.9%. Deploying two or more such VMs across Availability Zones and fronting them with a load balancer raises the SLA to 99.99%, providing a significantly more resilient solution.
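The Premium SSD plus Availability Zones combination in this example might be expressed in Bicep roughly as follows. This is a minimal sketch, assuming the target region supports zones 1 to 3 and that one network interface per VM has already been created; the nicIds parameter and the '-z-' naming are hypothetical.

```bicep
param vmNamePrefix string = 'myvm'
param location string = resourceGroup().location
param adminUsername string = 'azadmin'

@secure()
param adminPassword string

@description('Number of Virtual Machines to spread across zones')
param vmCount int = 3

@description('Hypothetical: resource IDs of pre-created network interfaces, one per VM')
param nicIds array

resource zonalVms 'Microsoft.Compute/virtualMachines@2024-11-01' = [for i in range(0, vmCount): {
  name: '${vmNamePrefix}-z-${i}'
  location: location
  // Round-robin placement across zones 1, 2, and 3.
  // A VM can be placed in a zone or in an Availability Set, not both.
  zones: [
    string((i % 3) + 1)
  ]
  properties: {
    hardwareProfile: {
      vmSize: 'Standard_DS1_v2'
    }
    osProfile: {
      computerName: '${vmNamePrefix}-z-${i}'
      adminUsername: adminUsername
      adminPassword: adminPassword
    }
    storageProfile: {
      imageReference: {
        publisher: 'MicrosoftWindowsServer'
        offer: 'WindowsServer'
        sku: '2025-Datacenter'
        version: 'latest'
      }
      osDisk: {
        createOption: 'FromImage'
        // Premium SSD managed disk for the OS disk
        managedDisk: {
          storageAccountType: 'Premium_LRS'
        }
      }
    }
    networkProfile: {
      networkInterfaces: [
        {
          id: nicIds[i]
        }
      ]
    }
  }
}]
```

With the default of three VMs, each instance lands in a different zone, so a failure confined to one zone leaves two instances running.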

Example Implementation

A. Using the Azure Portal

Deploy VMs in an Availability Set:

  • Step 1: Sign in to the Azure Portal.
  • Step 2: Click Create a resource and select Virtual Machine.
  • Step 3: In the Basics tab, fill in your VM details (name, region, image, size, etc.).
  • Step 4: Under Availability options, select Availability set and either choose an existing set or create a new one. (A typical configuration may include 2 fault domains and 5 update domains.)
  • Step 5: Complete your settings and deploy the VM.
  • Step 6: Repeat to deploy additional VM instances to achieve redundancy.

B. Using Bicep

For repeatable and automated deployments, consider using this sample Bicep template to build a highly available Virtual Machine setup:

// Parameters
@description('Prefix for the Virtual Machine names')
param vmNamePrefix string = 'myvm'

@description('Location for all resources.')
param location string = resourceGroup().location

@description('Admin username for the Virtual Machines')
param adminUsername string = 'azadmin'

@description('Admin password for the Virtual Machines')
@secure()
param adminPassword string

@description('Number of Virtual Machines to deploy')
param vmCount int = 2

@description('Name of the Virtual Network')
param virtualNetworkName string = 'myVNet'

@description('Name of the Subnet')
param subnetName string = 'mySubnet'

@description('Network Interface Prefix Name')
param networkInterfaceName string = 'nic'

// Create an Availability Set to support high availability
resource availabilitySet 'Microsoft.Compute/availabilitySets@2024-11-01' = {
  name: '${vmNamePrefix}-avset'
  location: location
  properties: {
    platformUpdateDomainCount: 5
    platformFaultDomainCount: 2
  }
  sku: {
    name: 'Aligned'
  }
}

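// Resource ID of the existing subnet that the network interfaces attach to
var subnetRef = resourceId('Microsoft.Network/virtualNetworks/subnets', virtualNetworkName, subnetName)

// Reference the existing Virtual Network (it must already exist in the resource group)
resource virtualNetwork 'Microsoft.Network/virtualNetworks@2023-09-01' existing = {
  name: virtualNetworkName
}

// One network interface per Virtual Machine
resource networkInterface 'Microsoft.Network/networkInterfaces@2023-09-01' = [for i in range(0, vmCount): {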
  name: '${networkInterfaceName}-${vmNamePrefix}-${i}'
  location: location
  properties: {
    ipConfigurations: [
      {
        name: 'ipconfig1'
        properties: {
          privateIPAllocationMethod: 'Dynamic'
          subnet: {
            id: subnetRef
          }
        }
      }
    ]
  }
  dependsOn: [
    virtualNetwork
  ]
}]

// Loop to deploy multiple Virtual Machines within the Availability Set
resource vms 'Microsoft.Compute/virtualMachines@2024-11-01' = [for i in range(0, vmCount): {
  name: '${vmNamePrefix}-${i}'
  location: location
  properties: {
    hardwareProfile: {
      vmSize: 'Standard_DS1_v2'
    }
    osProfile: {
      computerName: '${vmNamePrefix}-${i}'
      adminUsername: adminUsername
      adminPassword: adminPassword
    }
    storageProfile: {
      imageReference: {
        publisher: 'MicrosoftWindowsServer'
        offer: 'WindowsServer'
        sku: '2025-Datacenter'
        version: 'latest'
      }
      osDisk: {
        createOption: 'FromImage'
      }
    }
    networkProfile: {
      networkInterfaces: [
        {
          id: networkInterface[i].id
        }
      ]
    }
    availabilitySet: {
      id: availabilitySet.id
    }
  }
}]

Deployment Instructions (Bicep):

  1. Save the above code as vmHighAvailability.bicep.

  2. Open your terminal and log in with the Azure CLI:

    az login
    
  3. Deploy the template:

    az deployment group create \
      --resource-group MyResourceGroup \
      --template-file vmHighAvailability.bicep
    

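This template provisions an Availability Set and deploys the specified number of Virtual Machines in that set, providing a strong foundation for high availability in alignment with Azure’s SLAs. It assumes that the virtual network and subnet named by the parameters already exist in the target resource group. Because adminPassword has no default value, the Azure CLI prompts for it during deployment.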
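To put a load balancer in front of these VMs, a template along the following lines could be deployed alongside them. It is a sketch under stated assumptions rather than a complete design: it creates an internal Standard Load Balancer for TCP port 80, and the lbName and subnetId parameters, the rule and probe names, and the port are all hypothetical choices.

```bicep
@description('Hypothetical name for the load balancer')
param lbName string = 'myvm-lb'

param location string = resourceGroup().location

@description('Hypothetical: resource ID of the existing subnet for the internal frontend')
param subnetId string

resource loadBalancer 'Microsoft.Network/loadBalancers@2023-09-01' = {
  name: lbName
  location: location
  sku: {
    name: 'Standard'
  }
  properties: {
    // Private frontend address allocated from the existing subnet
    frontendIPConfigurations: [
      {
        name: 'frontend'
        properties: {
          privateIPAllocationMethod: 'Dynamic'
          subnet: {
            id: subnetId
          }
        }
      }
    ]
    // Pool that the VM network interfaces join
    backendAddressPools: [
      {
        name: 'backendPool'
      }
    ]
    // Health probe: instances that stop answering on port 80 are taken out of rotation
    probes: [
      {
        name: 'tcpProbe'
        properties: {
          protocol: 'Tcp'
          port: 80
          intervalInSeconds: 5
          numberOfProbes: 2
        }
      }
    ]
    loadBalancingRules: [
      {
        name: 'httpRule'
        properties: {
          protocol: 'Tcp'
          frontendPort: 80
          backendPort: 80
          frontendIPConfiguration: {
            id: resourceId('Microsoft.Network/loadBalancers/frontendIPConfigurations', lbName, 'frontend')
          }
          backendAddressPool: {
            id: resourceId('Microsoft.Network/loadBalancers/backendAddressPools', lbName, 'backendPool')
          }
          probe: {
            id: resourceId('Microsoft.Network/loadBalancers/probes', lbName, 'tcpProbe')
          }
        }
      }
    ]
  }
}

// Backend pool ID, to be referenced from each network interface
output backendPoolId string = resourceId('Microsoft.Network/loadBalancers/backendAddressPools', lbName, 'backendPool')
```

For traffic to reach the VMs, each network interface's IP configuration would also need a loadBalancerBackendAddressPools entry that references the backend pool ID.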

Conclusion

Understanding and meeting Azure’s SLA requirements for Virtual Machines is vital for building resilient, mission-critical applications. By deploying VMs within Availability Sets (or across Availability Zones), implementing load balancers, and incorporating redundancy and auto-scaling, you can achieve 99.95% uptime or better. Whether you choose the hands-on approach of the Azure Portal or the efficiency of Infrastructure-as-Code with Bicep, informed architectural decisions pave the way for reliable and high-performing systems.
