Azure High Availability and Disaster Recovery Options

Problem

You may be familiar with the high availability (HA) and disaster recovery (DR) features available in SQL Server and have experience configuring and managing them. But have you ever heard of or tried the Azure-specific high availability and disaster recovery features? How can you learn more about what Azure brings in terms of HA and DR for Azure SQL offerings, including SQL VMs?

Solution

In Part 1, we reviewed the “classic” HA and DR features that are available in SQL Server (whether on-premises or in the cloud), such as Availability Groups, Replication, Link, and so on, some of which also work with Azure SQL offerings like SQL DB or SQL MI.

But there are also features that are specific to the Azure platform. In this article, we will review them one by one, with explanations, details, use cases, and pros and cons for each feature.

Azure High Availability and Disaster Recovery Options

Here is the list of Azure features that we are going to discuss:

High Availability in Azure SQL Database
High Availability in Azure SQL Managed Instance
High Availability in SQL Server on Azure Virtual Machine
Zone Redundancy
Availability Set
Geo-Backups and Geo-Restore
Active Geo-Replication
Failover Groups
Azure Site Recovery
Azure Backup

Don’t miss the feature summary table and recommendations at the end of the article.

High Availability in Azure SQL Database

Azure SQL Database and Azure SQL Managed Instance, as PaaS solutions, have embedded high availability.

Azure SQL Database, for example, has a 99.99% uptime SLA for all service tiers, and a 99.995% uptime SLA for tiers that support zone redundancy when it is enabled.

Purchasing Model	Azure SQL Database Service Tier	Storage Model	SLA (no zone-redundancy)	SLA (WITH zone-redundancy)
DTU-based	Basic	remote	99.99 %	N/A
DTU-based	Standard	remote	99.99 %	N/A
DTU-based	Premium	local	99.99 %	99.995 %
vCore-based	General-Purpose	remote	99.99 %	99.995 %
vCore-based	Business Critical	local	99.99 %	99.995 %
vCore-based	Hyperscale	remote	99.99 %	99.995 %
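
To put these percentages in perspective, it can help to convert an SLA into a downtime budget. The sketch below is plain arithmetic in Python. The function name and the 365-day year are our own assumptions for illustration, and the official SLA defines its own measurement period, so treat the results as intuition rather than contractual terms.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def allowed_downtime_minutes(sla_percent: float,
                             period_minutes: int = MINUTES_PER_YEAR) -> float:
    """Return the maximum downtime (in minutes) that still meets the SLA."""
    return period_minutes * (1 - sla_percent / 100)


# Compare the two SLA levels from the table above.
for sla in (99.99, 99.995):
    minutes = allowed_downtime_minutes(sla)
    print(f"{sla}% SLA allows about {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, 99.99% works out to roughly 52.6 minutes of allowed downtime per year, and 99.995% to roughly 26.3 minutes.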

HA is achieved by having one or more standby compute nodes (or secondary replicas) ready for failover, should anything happen to the active node or primary replica, such as a database engine crash, hardware failure, or maintenance event. This is called high availability through redundancy.

Standby nodes / replicas

The number of standby nodes / replicas depends on the service tier:

Purchasing Model	Azure SQL Database Service Tier	Nodes / Replicas	Readable Secondaries
DTU-based	Basic	1 active + 1 standby	N/A
DTU-based	Standard	1 active + 1 standby	N/A
DTU-based	Premium	1 primary + 2-3 secondary	1
vCore-based	General-Purpose	1 active + 1-2 standby	N/A
vCore-based	Business Critical	1 primary + 2-3 secondary	1
vCore-based	Hyperscale	1 primary + 0-4 secondary	multiple

There are two models: “remote storage” and “local storage”.

Basic, Standard, and General-Purpose tiers use a “remote storage” model, which is a single active compute node with one or two passive standby nodes ready for failover. Remote storage is attached to the active node, and if a failover happens, it is re-attached to a passive node, which then becomes active. This is similar to a Failover Cluster Instance (FCI) on SQL Server. These tiers do not support read scale-out, because passive nodes do not have resources such as CPU and memory “provisioned” until the failover occurs.

Premium and Business Critical tiers use a “local storage” model and a technology similar to “Always On Availability Groups”: 1 primary + 2-3 secondaries, each of which has its own local storage. One of the secondaries can be used to offload reporting, which is read scale-out. Data replication to secondary replicas uses synchronous commit mode, which is how no data loss is achieved.

The Hyperscale service tier uses a distributed system of highly available components such as compute nodes, page servers, log service, and persistent storage. However, the official SLA for Hyperscale is the same as for the other tiers: 99.99 %, or 99.995 % with zone redundancy.

High Availability in Azure SQL Managed Instance

Azure SQL Managed Instance has a 99.99% uptime SLA for the General Purpose and Business Critical tiers. Similar to Azure SQL Database, the SLA can be increased to 99.995% when zone redundancy is enabled. Since June 2025, the General Purpose tier also supports zone redundancy.

Azure SQL Managed Instance	Service Tier	Storage Model	SLA (no zone-redundancy)	SLA (WITH zone-redundancy)
Azure SQL Managed Instance	General-Purpose	remote	99.99 %	99.995 %
Azure SQL Managed Instance	Business Critical	local	99.99 %	99.995 %

The General-Purpose service tier uses the “remote storage” model, in which the active node accesses remote storage that has its own redundancy. Should the active compute node fail, Azure Service Fabric initializes the database engine and allocates resources (vCores, memory, etc.) on a new compute node, and the blob storage (remote storage) is attached to the new active node.

The Business Critical service tier uses technology similar to “Always On Availability Groups” and a “local storage” model, where secondary replicas are pre-allocated and each has its own local storage. Both compute layers and storage layers are replicating data from primary to secondary. One of the secondary replicas can be used for read-only queries.

Azure SQL Managed Instance	Service Tier	Nodes / Replicas	Readable Secondaries
Azure SQL Managed Instance	General-Purpose	1 active + 1-2 standby	N/A
Azure SQL Managed Instance	Business Critical	1 primary + 3 secondary	1

Additional information:

Availability Through Redundancy – Azure SQL Managed Instance

High Availability in SQL Server on Azure Virtual Machines

For SQL Server on Azure VM, the uptime SLA depends on the disk type of the virtual machine. If it uses HDD, there is no formal SLA. If it uses Premium SSD, then the VM has a 99.9% SLA. This is the highest SLA that can be achieved for a standalone VM without any additional HA configuration.

To increase the SLA for SQL VM, you need to use either availability sets or availability zones. The VM cannot participate in both at the same time.

Availability set – VMs added to a single availability set are placed in different fault and update domains.

Availability zone – Deploy VMs in different zones (data centers) of the same region, which gives higher resilience. For example, you can deploy 2 virtual machines to different zones and then configure an availability group across them.

Comparison Table

SQL Server on Azure VM	Disk Type	VMs	Additional HA Features	Uptime SLA
SQL Server on Azure VM	Standard HDD	1	no	N/A
SQL Server on Azure VM	Premium SSD	1	no	99.9 %
SQL Server on Azure VM	Premium SSD	2	Availability Set	99.95 %
SQL Server on Azure VM	Premium SSD	2	Availability Zones	99.99 %

Note that the 99.95 % and 99.99 % uptime figures apply to the whole solution, not to a single VM. The solution is, for example, an Availability Group or a Failover Cluster Instance whose virtual machines are placed either in an availability set or in separate availability zones. This means 2 or more virtual machines per solution.

So yes, this means that achieving the same HA SLA with SQL VMs as with Azure SQL Managed Instance, for example, takes a minimum of two virtual machines and an availability group, which obviously costs more than SQL MI. But when advocating for SQL VMs, it is worth noting that SQL Server has more features and capabilities than SQL MI. Don’t forget the ability to have many replicas in an availability group; SQL DB and SQL MI are somewhat limited in this regard. And disk performance can be much higher.

Zone Redundancy

The zone redundancy feature is aimed at improving uptime SLA and making your database solution more resilient to the possible impact of the following:

Zone-level datacenter outages, for example power failures, cooling issues, or hardware failures
Network disruptions affecting a single availability zone
Natural disasters localized to one zone, such as fires, floods, earthquakes, tornadoes, etc.
Maintenance or updates that temporarily affect one zone

Zone redundancy applies to Azure SQL Database (depending on service tier) and Azure SQL Managed Instance, allowing them to reach a 99.995 % SLA.

For SQL Server on Azure VM, you can pick a separate availability zone for each virtual machine participating in an availability group, allowing the solution to reach a 99.99 % SLA. This can be considered “manually built” zone redundancy.

Note that not all regions support zone redundancy. When a region does, it has multiple zones (physically separate data centers), and enabling zone redundancy lets the database solution place replicas or standby nodes in more than one zone.

During virtual machine creation, you can pick which zone of the region you want to host your VM. If you select more than 1 zone, then more than one VM will be created, each in a different zone.

Zone Redundancy	SQL Offering	Supported
Zone Redundancy	Azure SQL Database	Yes
Zone Redundancy	Azure SQL Managed Instance	Yes
Zone Redundancy	SQL Server on Azure VM	Kind of

Additional information:

Availability Zones

Availability Sets

Availability Sets are intended to improve uptime SLA. Only SQL VMs support this feature.

Virtual machines added to a single availability set are placed into different fault domains and update domains by Azure, so they are hosted on separate hardware. This means that two virtual machines should not go down at the same time due to a local failure or maintenance event.

An availability set helps achieve a 99.95 % SLA for a solution with 2 or more virtual machines. Each availability set can have up to 3 fault domains and 20 update domains.
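
The idea of spreading VMs across domains can be pictured with a small Python sketch. Azure performs the actual placement, and its logic is not described in this article. The round-robin assignment, the `Placement` type, and the `place_vms` function below are purely hypothetical and only illustrate why two VMs in the same availability set end up on different hardware and in different maintenance groups.

```python
from typing import List, NamedTuple, Sequence


class Placement(NamedTuple):
    vm: str
    fault_domain: int
    update_domain: int


def place_vms(vm_names: Sequence[str],
              fault_domains: int = 3,
              update_domains: int = 20) -> List[Placement]:
    """Spread VMs across domains round-robin (a simplifying assumption)."""
    if not 1 <= fault_domains <= 3:
        raise ValueError("an availability set has at most 3 fault domains")
    if not 1 <= update_domains <= 20:
        raise ValueError("an availability set has at most 20 update domains")
    return [
        Placement(name, index % fault_domains, index % update_domains)
        for index, name in enumerate(vm_names)
    ]


# Two hypothetical SQL Server VMs that will host an availability group.
for placement in place_vms(["sqlvm1", "sqlvm2"]):
    print(placement)
```

In this sketch the two VMs land in different fault domains and different update domains, so a single rack failure or a single maintenance wave affects only one of them.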

Availability Set	SQL Offering	Supported
Availability Set	Azure SQL Database	No
Availability Set	Azure SQL Managed Instance	No
Availability Set	SQL Server on Azure VM	Yes

Additional information:

Availability Sets Overview

Geo-Backup and Geo-Restore

Geo-Backup and Geo-Restore is a disaster recovery solution for Azure SQL Database and Azure SQL Managed Instance.

In the backup settings, you can configure automated backups to be geo-replicated to a secondary region. This means that if the primary region goes down (for example, a regional outage or disaster that affects all availability zones) and the database backups in the primary region become unavailable as a result, you can recover from the geo-replicated backups in the secondary region.

This configuration setting is available in Azure SQL DB and SQL MI, but you can achieve similar functionality with SQL Server on-premises or on an Azure virtual machine by backing up databases to a URL and configuring the target storage account for geo-replication.
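
As a rough illustration of the backup-to-URL approach for SQL Server, the Python helper below assembles a T-SQL `BACKUP DATABASE ... TO URL` statement. The helper name, the storage account, the container, and the file name are hypothetical. The sketch also assumes that a SQL Server credential for the container already exists and that geo-replication is configured separately on the storage account itself.

```python
def build_backup_to_url_statement(database: str,
                                  storage_account: str,
                                  container: str,
                                  backup_file: str) -> str:
    """Build a T-SQL statement that backs a database up to Azure Blob Storage."""
    url = (f"https://{storage_account}.blob.core.windows.net/"
           f"{container}/{backup_file}")
    # Escape characters that would otherwise end the T-SQL identifier or literal.
    safe_database = database.replace("]", "]]")
    safe_url = url.replace("'", "''")
    return (f"BACKUP DATABASE [{safe_database}] "
            f"TO URL = N'{safe_url}' "
            f"WITH COMPRESSION, CHECKSUM;")


# Hypothetical names, for illustration only.
print(build_backup_to_url_statement(
    "SalesDb", "mygeostorage", "sqlbackups", "SalesDb_full.bak"))
```

The generated statement can then be run on the SQL Server instance, for example from a scheduled job. Because the storage account handles the replication, the backup command itself stays the same regardless of the redundancy option chosen.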

Geo-Backup and Geo-Restore	SQL Offering	Supported
Geo-Backup / Geo-Restore	Azure SQL DB	Yes
Geo-Backup / Geo-Restore	Azure SQL MI	Yes
Geo-Backup / Geo-Restore	SQL Server	Backup to URL

Active Geo-Replication

Active Geo-Replication is available in Azure SQL Database only as of September 2025. It is a disaster recovery solution which uses asynchronous replication, and is configured on a per-database level.

Up to 4 geo-secondaries can be configured for a single Azure SQL Database. Geo-secondaries can be in different regions, as well as in the same region as the primary.

Failover is manual, and you will need to update application connection strings manually to point to a new primary server. Geo-secondaries are readable, so you can offload read-only workloads.

Geo-secondary databases can have a different size and service tier from the primary database. But keep in mind that a much smaller secondary can fall behind when the primary database is under heavy write activity.

Active Geo-Replication	SQL Offering	Supported
Active Geo-Replication	Azure SQL DB	Yes
Active Geo-Replication	Azure SQL MI	no
Active Geo-Replication	SQL Server	no

Additional information:

Active Geo-Replication

Failover Groups

Failover Groups are a disaster recovery solution for Azure SQL Database (at the logical SQL Server level) and Azure SQL Managed Instance. The difference from active geo-replication is that a failover group is configured for a “group of databases”, and that group fails over together.

Also, failover groups have a read-write listener and a read-only listener. You don’t have to update application connection strings after a failover if the application uses the listener’s address in its connection string.
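
To make the listener idea concrete, here is a minimal Python sketch that builds connection strings against the two listeners of an Azure SQL Database failover group. The failover group and database names are hypothetical, the ODBC driver name is an assumption about the client machine, and credentials are left out on purpose. For Azure SQL Managed Instance the listener host names have a different format, so this sketch covers Azure SQL Database only.

```python
from typing import Dict


def listener_endpoints(failover_group: str) -> Dict[str, str]:
    """Return the listener host names for an Azure SQL Database failover group."""
    return {
        "read_write": f"{failover_group}.database.windows.net",
        "read_only": f"{failover_group}.secondary.database.windows.net",
    }


def connection_string(failover_group: str, database: str,
                      read_only: bool = False) -> str:
    """Build an ODBC-style connection string that targets a listener."""
    endpoints = listener_endpoints(failover_group)
    host = endpoints["read_only" if read_only else "read_write"]
    parts = [
        "Driver={ODBC Driver 18 for SQL Server}",  # assumed to be installed
        f"Server=tcp:{host},1433",
        f"Database={database}",
        "Encrypt=yes",
    ]
    if read_only:
        # Declares the read-only intent of this connection.
        parts.append("ApplicationIntent=ReadOnly")
    return ";".join(parts) + ";"


# Hypothetical failover group and database names.
print(connection_string("my-fog", "SalesDb"))
print(connection_string("my-fog", "SalesDb", read_only=True))
```

Because the application only knows the listener name, the same strings keep working after a failover: the listener simply points at whichever server currently holds the primary or secondary role.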

There are some differences between failover groups in Azure SQL Database and Azure SQL Managed Instance. On a logical Azure SQL Server that hosts Azure SQL Databases, you can add as many databases to a failover group as you want. You don’t have to add them all if you don’t want to.

In Azure SQL Managed Instance, though, you can’t select individual databases. All databases of a SQL MI instance become part of the failover group. Important: after the failover group is created, you can no longer create read-write databases on the secondary SQL MI instance.

When we say “secondary” logical Azure SQL Server or “secondary” SQL Managed Instance, we don’t mean the internal replicas or standby nodes used internally for high availability. We mean a separate logical Azure SQL Server or Azure SQL Managed Instance that is hosted in a different region from the “primary” instance.

The Failover Group can only consist of two instances of SQL MI or two logical SQL Servers with Azure SQL Databases.

Data replication between servers in a Failover Group is asynchronous, so the primary doesn’t wait for the secondary instance to commit the data.

Important: in the event of a disaster, since replication is asynchronous, the secondary instance may be missing the last few seconds of transactions that were committed on the primary. That’s why there is a minimum 1-hour grace period, and automatic failover does not happen instantly but only after the grace period expires. Since possible data loss is involved, Microsoft tries to bring the primary region back first before failing over to the secondary region. It’s your decision whether to fail over to the secondary region manually rather than wait for Microsoft to bring the primary region back online.
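
The grace-period reasoning can be sketched as a simple decision rule in Python. This is a simplified model under our own assumptions, not the platform’s actual logic: the function name is hypothetical, and the only facts taken from the text are the 1-hour minimum and the rule that automatic failover waits for the grace period to pass.

```python
from datetime import timedelta

MIN_GRACE_PERIOD = timedelta(hours=1)  # the minimum grace period mentioned above


def automatic_failover_due(outage_duration: timedelta,
                           grace_period: timedelta = MIN_GRACE_PERIOD) -> bool:
    """Return True once the outage has lasted at least the grace period."""
    if grace_period < MIN_GRACE_PERIOD:
        raise ValueError("the grace period cannot be shorter than 1 hour")
    return outage_duration >= grace_period


# 20 minutes into an outage, automatic failover has not started yet, so a
# manual (forced) failover is the only way to switch regions sooner.
print(automatic_failover_due(timedelta(minutes=20)))
print(automatic_failover_due(timedelta(minutes=75)))
```

The trade-off is visible in the rule itself: waiting gives the primary region a chance to come back with no data loss, while a manual failover shortens the outage at the risk of losing the last few seconds of transactions.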

Failover Groups	SQL Offering	Supported
Failover Groups	Azure SQL DB	Yes
Failover Groups	Azure SQL MI	Yes
Failover Groups	SQL Server	no

Azure Site Recovery

Azure Site Recovery (ASR) is an infrastructure-level disaster recovery solution. It can replicate Azure VMs, on-premises VMs (Hyper-V, VMware), or AWS virtual machines to Azure. It can be used to implement DR for Azure virtual machines with SQL Server installed.

Azure Site Recovery	SQL Offering	Supported
Azure Site Recovery	Azure SQL DB	no
Azure Site Recovery	Azure SQL MI	no
Azure Site Recovery	SQL Server on VM	Yes – VM level

ASR replicates the entire VM to a secondary region, including the OS, SQL Server binaries, and data disks. SQL Server data will also be replicated, but it’s worth mentioning that ASR doesn’t understand SQL Server internals such as transaction consistency or log states, and it has no awareness of Always On availability groups. That’s where you need to understand the nuances.

ASR captures VM snapshots at the disk level. If SQL Server is actively writing during replication, you may not always get a transactionally consistent state at any point in time.

Considering the above, it is still possible to use ASR as a DR solution for a SQL Server VM. It is especially valuable that it can replicate everything on the VM. Microsoft recommends combining ASR with SQL-native disaster recovery technologies such as Always On availability groups, log shipping, backups to Azure Blob Storage, and so on.

Azure Backup

Azure Backup is a feature that can be used as part of a disaster recovery solution for Azure virtual machines with SQL Server installed.

When enabled at the VM level, the Azure Backup extension is installed on the VM. It performs application-consistent backups using VSS (Volume Shadow Copy Service). Backups are stored in a Recovery Services vault, which is geo-redundant by default.

A full VM can be restored to a previous recovery point (normally these are daily snapshots), or alternatively a new VM can be created from the backup.

In terms of SQL Server data, a VM-level Azure Backup lacks point-in-time restore, that is, the ability to recover to ANY point in time within a particular time frame. The data can only be recovered to the specific points in time when a VM backup \ snapshot was taken.
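
The difference between restoring to a recovery point and true point-in-time restore can be shown with a short Python sketch. The function name and the sample dates are hypothetical. The sketch simply picks the newest recovery point taken at or before the moment you would like to go back to, which is the best a snapshot-based restore can offer.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Optional, Sequence


def latest_recovery_point(recovery_points: Sequence[datetime],
                          target: datetime) -> Optional[datetime]:
    """Return the newest recovery point taken at or before the target time."""
    ordered = sorted(recovery_points)
    index = bisect_right(ordered, target)
    return ordered[index - 1] if index else None


# Hypothetical daily snapshots taken at 01:00.
snapshots = [datetime(2025, 9, day, 1, 0) for day in (1, 2, 3)]
wanted = datetime(2025, 9, 2, 15, 30)

restore_point = latest_recovery_point(snapshots, wanted)
print("Restore point:", restore_point)
print("Changes lost :", wanted - restore_point)
```

With these sample values, the closest usable recovery point is the snapshot from 01:00 on September 2, so the 14.5 hours of changes made after it cannot be recovered from the VM backup alone.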

In short, Azure Backup can be used for VM-level disaster recovery, including the SQL Server’s data.

Azure Backup	SQL Offering	Supported
Azure Backup	Azure SQL DB	no
Azure Backup	Azure SQL MI	no
Azure Backup	SQL Server on VM	Yes – VM level

Feature Summary

A summary of the features can be found in the table below:

Feature	HA	DR	Azure SQL Database	Azure SQL Managed Instance	SQL Server on Azure VM
HA Through Redundancy	Yes	no	Yes	Yes	needs configuration
Zone Redundancy	Yes	no	Yes	Yes	can be achieved
Availability Set	Yes	no	no	no	Yes
Geo-Backups and Geo-Restore	no	Yes	Yes	Yes	Backup to URL
Active Geo-Replication	no	Yes	Yes	no	no
Failover Groups	no	Yes	Yes	Yes	no
Azure Site Recovery	no	Yes	no	no	Yes – VM level
Azure Backup	no	Yes	no	no	Yes – VM level

Recommendations for Azure SQL Database and Azure SQL Managed Instance

The general recommendation for Azure SQL Database and Azure SQL Managed Instance is to enable “Zone-Redundancy” if your region supports it. This increases the uptime SLA (to 99.995%) and strengthens the high availability that is already present in those solutions through built-in redundancy.

Occasionally, Microsoft will reboot the instance due to updates. Usually, this causes a brief downtime of several seconds. You can’t choose the exact date and time when SQL DB or SQL MI instances will be updated, but you can pick a maintenance window.

For disaster recovery, configure either “Active Geo-Replication” or “Failover Groups”, which also can be used for offloading of read-only queries.

Recommendations for SQL Server on Azure VM

For SQL Server on Azure VM (with Premium SSDs), if 99.9% is not enough for your needs, the SLA can be increased by using two or more virtual machines, either placed in an “Availability Set” (99.95%) or spread across separate “Availability Zones” (99.99%), and then combined into an availability group.
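
This recommendation can be expressed as a small lookup in Python. The sketch encodes only the SLA figures from the comparison table earlier in the article. The type and function names are hypothetical, and picking the option with the fewest VMs is our own simplification of the cost side.

```python
from typing import NamedTuple, Optional


class VmConfig(NamedTuple):
    description: str
    vm_count: int
    sla_percent: float


# SLA figures taken from the SQL Server on Azure VM comparison table.
SQL_VM_CONFIGS = [
    VmConfig("1 VM with Premium SSD", 1, 99.9),
    VmConfig("2 VMs in an availability set", 2, 99.95),
    VmConfig("2 VMs in separate availability zones", 2, 99.99),
]


def minimal_config(target_sla_percent: float) -> Optional[VmConfig]:
    """Return the smallest configuration that meets the target SLA, if any."""
    candidates = [c for c in SQL_VM_CONFIGS
                  if c.sla_percent >= target_sla_percent]
    return min(candidates,
               key=lambda c: (c.vm_count, c.sla_percent),
               default=None)


for target in (99.9, 99.95, 99.99, 99.995):
    print(target, "->", minimal_config(target))
```

A target of 99.995% returns no match in this sketch, which reflects the tables in this article: that level appears only for zone-redundant Azure SQL Database and Azure SQL Managed Instance.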

I didn’t tell you this, but automated updates can be disabled / set to manual in all possible places, to increase the availability of the SQL VM even more. You can choose when and how to update and perform it manually – when it is most convenient for you. Availability groups also allow for rolling updates with manual planned failover, which can be especially useful.

For SQL VM disaster recovery, either use Availability Groups and add a SQL VM replica in a secondary region or, if you have a non-Enterprise edition, use any other SQL Server DR feature (like Log Shipping, FCI, etc.).

On top of that, Azure Site Recovery and Azure Backup can be used to complement your disaster recovery strategy by replicating or backing up data at the virtual machine level.

Conclusion

In this article, we reviewed and compared the available solutions for High Availability and Disaster Recovery for Azure SQL Database, SQL Managed Instance, and SQL Server on Azure VMs that are specific to the Azure platform, providing high-level details, use cases, and pros/cons for each solution.
