What to Know About Update Domains in Azure


In Azure, Update Domains (UDs) play a crucial role in maintaining high availability during planned maintenance and platform updates.

Understanding how update domains work and how to configure them effectively is essential for ensuring that your applications experience minimal downtime during updates, reboots, or patches.

Here’s a comprehensive review of what you need to know about Update Domains in Azure.

Definition of Update Domain (UD)

An Update Domain (UD) is a logical grouping of Azure virtual machines (VMs) that are updated (patched, rebooted, etc.) together during planned maintenance or platform updates.

Azure updates one Update Domain at a time, so the VMs in an Availability Set or Virtual Machine Scale Set (VMSS) are never all impacted by planned updates at once.

This limits downtime and helps keep your application available while maintenance is in progress.

Purpose of Update Domains

The primary purpose of update domains is to provide resilience during planned maintenance events.

Key Points

Isolation during updates

VMs in different update domains are not updated simultaneously.

This isolation ensures that if one update domain is being updated (e.g., patched or rebooted), VMs in other update domains continue to operate without interruption.

Preventing full downtime

During scheduled maintenance events (like operating system patches), Azure only updates one update domain at a time.

If you have multiple update domains, your application remains available while one update domain is being updated, provided the VMs in the remaining domains can carry the load.

How Update Domains Work

When you deploy VMs in an Availability Set or Virtual Machine Scale Set (VMSS), Azure automatically divides the VMs into multiple update domains.

During platform maintenance, only the VMs in the update domain currently being updated will be impacted (rebooted, patched, etc.).

VMs in other update domains will remain unaffected and continue to serve traffic.

Example

Availability Set with 3 Update Domains

If you have 6 VMs in an Availability Set with 3 update domains, each update domain holds 2 VMs. Azure decides which VM lands in which domain, so the numbering below is only illustrative:

  1. Update Domain 1: VMs 1, 2

  2. Update Domain 2: VMs 3, 4

  3. Update Domain 3: VMs 5, 6

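During an update, Azure takes one update domain at a time. For example, it might start with Update Domain 1 (rebooting or patching VMs 1 and 2) while the VMs in Update Domains 2 and 3 continue running.

Once that update domain is done, Azure proceeds to the next one, and so on. The order in which update domains are processed is not guaranteed to be sequential.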
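To make the distribution concrete, here is a minimal Python sketch of how six VMs could be spread across three update domains. It assumes round-robin assignment in list order and zero-based domain indexes; both are assumptions of this sketch, and the VM names are hypothetical. The real VM-to-domain mapping is decided by the platform, so the specific VMs per domain can differ from the listing above while the two-per-domain split stays the same.

```python
from collections import defaultdict


def assign_update_domains(vm_names, ud_count):
    """Map each VM to an update domain index, round-robin in list order."""
    domains = defaultdict(list)
    for position, name in enumerate(vm_names):
        domains[position % ud_count].append(name)
    return dict(domains)


vms = [f"vm{i}" for i in range(1, 7)]  # six hypothetical VM names
layout = assign_update_domains(vms, ud_count=3)

for ud in sorted(layout):
    print(f"Update domain {ud}: {', '.join(layout[ud])}")
```

Each domain ends up with two VMs, which is the property that matters for maintenance: one domain offline means two of six VMs offline.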

Azure's Maintenance Cycle

Working through update domains one after another means that the VMs are never all rebooted or patched simultaneously.

Only a portion of the VMs is unavailable at any given time, which keeps the application available throughout the maintenance cycle.
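The cycle can be walked through in a few lines of Python. This is only a simulation under stated assumptions: the layout dictionary mirrors the six-VM, three-domain example, the VM names are hypothetical, and domains are visited in sorted order purely for readability.

```python
def rolling_maintenance(layout):
    """Yield (update_domain, offline_vms, online_vms) for each maintenance step."""
    all_vms = [vm for members in layout.values() for vm in members]
    for ud in sorted(layout):
        offline = list(layout[ud])
        online = [vm for vm in all_vms if vm not in offline]
        yield ud, offline, online


# Hypothetical layout matching the six-VM, three-domain example.
example_layout = {1: ["vm1", "vm2"], 2: ["vm3", "vm4"], 3: ["vm5", "vm6"]}

steps = list(rolling_maintenance(example_layout))
for ud, offline, online in steps:
    print(f"UD {ud}: offline={offline}, online={len(online)} of 6")
```

At every step four of the six VMs stay online, so the application keeps two thirds of its capacity during the whole cycle.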

How to Configure Update Domains

Setting Update Domains in Availability Sets

When creating an Availability Set, you can specify the number of update domains to divide the VMs across.

By default, an Availability Set created through Azure Resource Manager gets 5 update domains, and you can configure up to 20.

Minimum Update Domains

Use at least 2 update domains so that your VMs are not all affected at once during planned maintenance. With a single update domain, every VM in the set can be updated together.

Best Practices

For most scenarios, aim for at least 3 update domains for better resilience during maintenance, though the optimal number depends on the size of your application and how critical uptime is.
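As a rough illustration of where these numbers go, the sketch below builds the ARM template resource for an Availability Set as a plain Python dictionary. The property names platformUpdateDomainCount and platformFaultDomainCount come from the Microsoft.Compute/availabilitySets resource type; the helper function, the resource name, the region, and the apiVersion value are assumptions of this example, so check them against the current template reference before using them.

```python
import json


def availability_set_resource(name, location, update_domains=5, fault_domains=2):
    """Build an ARM template resource for an availability set as a plain dict."""
    # The upper bound of 20 is the limit discussed in this article.
    if not 1 <= update_domains <= 20:
        raise ValueError("update_domains must be between 1 and 20")
    return {
        "type": "Microsoft.Compute/availabilitySets",
        "apiVersion": "2023-03-01",  # assumed API version, verify before use
        "name": name,
        "location": location,
        "sku": {"name": "Aligned"},  # "Aligned" is the SKU used with managed disks
        "properties": {
            "platformUpdateDomainCount": update_domains,
            "platformFaultDomainCount": fault_domains,
        },
    }


# Hypothetical name and region, for illustration only.
resource = availability_set_resource("web-avset", "eastus", update_domains=3)
print(json.dumps(resource, indent=2))
```

The dictionary can be dropped into the resources array of a template. Keeping the range check in one helper makes it harder to deploy a set with an unsupported domain count.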

Setting Update Domains in VM Scale Sets (VMSS)

When creating VM Scale Sets, Azure will automatically distribute the VMs across multiple update domains to provide resilience during updates.

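In a scale set the update domains are assigned by the platform rather than chosen per deployment, so check the current VMSS documentation for the limits that apply to your orchestration mode and instance count.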

Update Domain Limits

Maximum Number of Update Domains

Azure allows a maximum of 20 update domains for a single Availability Set or VMSS deployment.

The actual number of update domains may vary based on the region, VM size, and other factors.

For large-scale deployments (with hundreds or thousands of VMs), Azure VM Scale Sets (VMSS) might offer a more scalable solution for managing update domains.

Effect on VM Availability

The more update domains you configure, the smaller the share of VMs that is taken offline in each maintenance step. Updates are then spread across more, smaller batches, which leaves more capacity online at any moment.
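The effect is easy to quantify. The following sketch assumes the VMs are spread as evenly as possible across the update domains (an assumption of this example) and computes how many VMs go offline in the worst maintenance step and what fraction stays online.

```python
import math


def worst_case_offline(vm_count, ud_count):
    """Largest number of VMs that share one update domain under an even spread."""
    return math.ceil(vm_count / ud_count)


def min_available_fraction(vm_count, ud_count):
    """Fraction of VMs still running while the fullest update domain is offline."""
    return 1 - worst_case_offline(vm_count, ud_count) / vm_count


for uds in (2, 3, 5, 20):
    offline = worst_case_offline(20, uds)
    fraction = min_available_fraction(20, uds)
    print(f"{uds:>2} update domains: {offline} VMs offline, {fraction:.0%} available")
```

For 20 VMs, moving from 2 to 5 update domains raises the capacity that stays online during maintenance from 50% to 80%.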

Impact of Update Domains on SLAs

SLA Considerations

For the 99.95% uptime SLA, Azure requires two or more VM instances deployed in the same Availability Set, so that they are spread across separate update domains and fault domains.

With multiple update domains, your application has higher availability during platform updates because not all VMs will be affected by the updates at once.

SLA Example

VMs in an Availability Set with 2 Update Domains

In this case, if one update domain undergoes a planned maintenance event, the other update domain will still be available, maintaining the 99.95% SLA.

VMs in an Availability Set with 3 or more Update Domains

The more update domains you have, the smaller the share of VMs affected by each planned update, which improves availability in practice even though the SLA figure itself stays the same.
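To put the 99.95% figure into perspective, this small calculation converts an SLA percentage into the downtime it permits. The 30-day month is an assumption of the sketch; the way Azure measures downtime for SLA purposes is defined in the SLA terms, not here.

```python
def allowed_downtime_minutes(sla_percent, days=30):
    """Minutes of downtime permitted by an uptime SLA over the given period."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


budget = allowed_downtime_minutes(99.95)
print(f"99.95% over 30 days allows about {budget:.1f} minutes of downtime")
```

That works out to roughly 21.6 minutes per 30-day month, which is why losing every VM to a single maintenance reboot is worth designing against.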

Role of Load Balancers

Azure Load Balancer is essential when using multiple update domains to ensure that traffic is routed to healthy VMs during planned maintenance or updates.

If one update domain is being updated, the Load Balancer will ensure that traffic is only directed to the healthy VMs in other update domains.

Load Balancer Configuration

Health Probes

Configure health probes to ensure the Load Balancer only sends traffic to healthy VMs.

Health probes detect when a VM stops responding, for example while its update domain is undergoing maintenance, so that new traffic goes to the VMs that are still available.

Automatic Failover

If a VM in an update domain is down (due to maintenance or failure), the Load Balancer can automatically fail over to healthy instances in other update domains.
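The probe behaviour described above can be modelled with a simple failure counter. This is a toy simulation, not Azure Load Balancer's actual implementation: the Backend class, the threshold of two consecutive failures, and the VM names are all assumptions made for illustration.

```python
class Backend:
    """A backend VM with a counter of consecutive failed probes."""

    def __init__(self, name, update_domain):
        self.name = name
        self.update_domain = update_domain
        self.failed_probes = 0

    def record_probe(self, succeeded):
        # A successful probe resets the counter; a failure increments it.
        self.failed_probes = 0 if succeeded else self.failed_probes + 1


def healthy_backends(backends, unhealthy_threshold=2):
    """Return backends that have not yet reached the failure threshold."""
    return [b for b in backends if b.failed_probes < unhealthy_threshold]


# Six hypothetical VMs, two per update domain (domains 0, 1, 2).
pool = [Backend(f"vm{i}", update_domain=(i - 1) // 2) for i in range(1, 7)]

# Update domain 0 goes into maintenance: its VMs fail two probes in a row.
for _ in range(2):
    for backend in pool:
        backend.record_probe(succeeded=backend.update_domain != 0)

serving = [b.name for b in healthy_backends(pool)]
print("Receiving traffic:", serving)
```

Once the VMs in the updated domain answer a probe again, their counters reset and they rejoin the pool without manual intervention.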

Best Practices for Update Domains

Use Multiple Update Domains

Always deploy VMs across multiple update domains to avoid having all VMs in your Availability Set or VMSS go down during maintenance.

Using at least 3 update domains is a good practice to improve resilience and availability.

Spread Critical Workloads Across Update Domains

Ensure that critical workloads are spread across different update domains.

This way, if one update domain goes down during maintenance, the other update domains can still serve traffic.

Use Azure Load Balancer

Configure an Azure Load Balancer to distribute traffic across healthy VMs in multiple update domains.

This ensures that users experience minimal disruption during updates.

Monitor with Azure Monitor

Azure Monitor provides insights into the health of VMs and allows you to track when updates or maintenance occur, so you can respond quickly if any issues arise.

Balance Between Fault and Update Domains

A well-balanced distribution of VMs across fault domains and update domains will ensure that both planned and unplanned downtime have minimal impact on your application.
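One way to check the balance is to count how many VMs share each fault domain and each update domain. The sketch below assumes a hypothetical placement of six VMs across 3 fault domains and 5 update domains, assigned round-robin; the numbers and names are illustrative, not read from a real deployment.

```python
from collections import Counter


def worst_case_loss(placements):
    """placements: list of (vm_name, fault_domain, update_domain) tuples."""
    per_fd = Counter(fd for _, fd, _ in placements)
    per_ud = Counter(ud for _, _, ud in placements)
    return {
        "unplanned": max(per_fd.values()),  # VMs lost if one fault domain fails
        "planned": max(per_ud.values()),    # VMs offline while one update domain is patched
    }


# Hypothetical placement: VM i goes to fault domain i % 3 and update domain i % 5.
placements = [(f"vm{i}", i % 3, i % 5) for i in range(6)]
loss = worst_case_loss(placements)
print(loss)
```

Here both a hardware failure and a maintenance step take out at most two of the six VMs, so neither kind of event removes more than a third of the capacity.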

Test and Verify

Regularly test your application and update strategies to ensure that the planned maintenance processes work as expected without affecting the overall availability or performance.

Differences Between Update Domains and Fault Domains

| Aspect | Fault Domain (FD) | Update Domain (UD) |
| --- | --- | --- |
| Definition | A grouping of VMs across different physical hardware (racks) to protect against hardware failures | A grouping of VMs that are updated together during maintenance |
| Purpose | To protect against hardware failures (e.g., server crashes, power failures) | To protect against downtime during planned updates (e.g., patches) |
| Impact of Failure | If a fault domain fails (e.g., hardware failure), VMs in that domain are affected | If an update domain is being updated, VMs in that domain are affected (e.g., reboot, patch) |
| Number of Domains | Azure allows up to 3 fault domains in most regions | Azure allows up to 20 update domains in most regions |
| Example Use Case | Spread VMs across fault domains to protect against hardware issues | Spread VMs across update domains to ensure minimal downtime during updates |
| Management | Managed by Azure for hardware redundancy | Managed by Azure for software update redundancy |

Summary

Update Domains in Azure are crucial for maintaining the availability of your application during planned maintenance and platform updates.

They ensure that not all VMs in an Availability Set or VM Scale Set are updated at the same time, which protects your application from downtime caused by reboots or patching.

By configuring multiple update domains (typically between 2 and 20), you can ensure minimal impact on application availability during updates.

Always use a Load Balancer to distribute traffic across healthy VMs and monitor your VMs to manage updates effectively.

 

Rajnish, MCT
