By: John Miner | Comments (1) | Related: > Azure SQL Database
Problem
Because Azure SQL Database is a Platform As A Service (PAAS), backups for business continuity planning (BCP) are automatically taken. A full backup is taken on a weekly basis with a differential backup is taken on a hourly-basis. To round out the backup chain, a transaction log backup is taken every 5 minutes. Please see the Azure documentation on BCP for full details by service tier. For a simple outage, we will lose at most 5 minutes of data. However, all these backups are stored on Local Redundant Storage (LRS) in the current data center (location). What can we do if a real disaster strikes our data center?
Both data page backups, full and differential, are copied to a paired data center in another geographical location or Geo Redundant Storage (GRS). Therefore, we can perform a Geo-Restore to another data center to bring our database on line. However, we might lose one hour of database transactions. If your application is a public website that takes internet sales orders, can your company really afford losing that amount of data?
How can we reduce the amount of data loss and have one or more hot secondary servers ready for failover?
Solution
Active Geo-Replication allows an Azure SQL database to have up to four secondary copies of a primary database. At most, each secondary database is less than 5 seconds behind the primary database. Thus, this feature meets our high availability requirements. Microsoft has provided the database developer with both Transact SQL and PowerShell commands to enable this feature.
Business Problem
Our boss has asked us to investigate the steps involved in leveraging Active Geo-Replication for our Azure SQL database. We are going to implement the following proof of concept (solution).
- Make 5 resource groups in different geographical regions.
- Create a logical SQL Server in each resource group.
- Craft a firewall rule for a laptop to access to each Server.
- Deploy the MATH database to the primary server.
- Calculate and store prime numbers between 1 and 250K.
- Use T-SQL to enable an active geo-replication link.
- Monitor the geo-replication link with T-SQL.
- Use T-SQL to fail over to secondary database.
- Use T-SQL to remove geo-replication link.
- Establish geo-replication using PowerShell.
- Use PowerShell to fail over to secondary database.
- Validate that secondary database is read-only.
- Seed all four secondary databases from primary database.
- Validate secondary databases < 5 seconds behind primary database.
- Use PowerShell to remove the geo-replication link.
The above steps will give us a good understanding of how Active Geo-Replication works. Both Transact-SQL and PowerShell will be used in the examples during our exploration. In the future, choose the language that you are most familiar with for daily use.
Azure Subscription
We must log in with a valid subscription owner before we can do any work. The Add-AzureRmAccount cmdlet will prompt you for credentials.
An account might have several subscriptions associated with it. Use the Get-AzureRmSubscription cmdlet to list the subscriptions associated with [email protected].
The output from this command shows two subscriptions. Since the Developer Program Benefit only has $25 associated with it, we want to make sure any new deployments are on the Visual Studio Enterprise subscription since it has a higher spending limit.
Last but not least, use the Set-AzureRmContext cmdlet to choose the correct subscription.
The PowerShell script below logs into the portal, lists the subscriptions and selects the correct subscription. Of course, manual intervention is needed for the sign on and coding the correct subscription id ahead of time.
# # Azure Subscriptions # # Prompts you for azure credentials Add-AzureRmAccount # List my subscriptions Get-AzureRmSubscription # Pick my Developer one $SubscriptionId = '26efaaba-5054-4f31-b922-84ab9eff218e' Set-AzureRmContext -SubscriptionId $SubscriptionId
Azure Data Centers
There are currently 30 different regions (data centers) with the plans for 8 more in the works. See this web page for full details. Not all regions support all services provide by Microsoft Azure. Use the Get-AzureRmResourceProvider cmdlet to get details for each region.
The PowerShell script below looks for all regions in the United Status that support Azure SQL database.
# # Data Centers in US with my service # # Clear the screen Clear-Host # Data centers with Azure SQL database $DataCenter = Get-AzureRmResourceProvider -ListAvailable $HasSql = $DataCenter | Where-Object {$_.ProviderNamespace -eq 'Microsoft.Sql'} ` | Select-Object -ExpandProperty Locations | Select-String -Pattern "US" -CaseSensitive $HasSql | Sort
The image below shows the eight regions in the United States.
There is an important concept called paired regions which comes into play when using Transact-SQL (T-SQL) for Active Geo-Replication. Basically, each data center is paired with another data center in the same geography. For instance, "East US" region is paired with "West US" region. Thus, all backups used in a geo-restore of a database in the "East US" region are stored in the "West US" region.
Azure Resource Groups
I am going to assume you are familiar with the Resource Manager PowerShell cmdlets for Azure SQL Database. If you are not, please see my article on this subject.
We will want to create a resource group, logical SQL Server, and firewall rule in each of the following regions.
- East US
- West US
- UK South
- Australia East
- Japan West
The enclosed script creates the required Azure objects listed above.
The PowerShell script below finds the new resource groups. Since I used a standard prefix during naming, it was easy to find these objects.
# # Show all 5 resource groups # # Clear the screen Clear-Host # List resource groups Get-AzureRmResourceGroup | Where-Object {$_.ResourceGroupName -like "rg4poc*"} ` | Select-Object ResourceGroupName, Location
The output below from the PowerShell Integrated Script Environment (ISE) shows the five new resource groups.
We can search and filter for the same objects using the Azure Portal.
I wanted to introduce a new concept that I have not used before in Azure. Tags can be added to most objects that you create. Each tag is a name value pair. These tags are very useful when the number of objects in your subscription becomes large.
The PowerShell script below adds a tag to the new resource group. In our case, this Project is for MsSqlTips.
# New resource group New-AzureRmResourceGroup -Name "rg4poc1east" -Location "East US" -Tag @{Project="MsSqlTips"}
The image below was taken from the Azure Portal and shows the tag information on the rg4poc1east resource group.
Azure SQL Database
Each resource group should contain a logical SQL Server and eventually a geo-replicated copy of our MATH database. I am assuming you already created the five servers.
The PowerShell script below finds these new servers. Again, it was easy to find these objects due to standard naming conventions.
# # Show all 5 sql servers # # Clear the screen Clear-Host # List sql servers $List = Get-AzureRmResourceGroup | Where-Object {$_.ResourceGroupName -like "rg4poc*"} $List | Get-AzureRmSqlServer | Select-Object ResourceGroupName, ServerName, Location | FL
The output below from the PowerShell Integrated Script Environment (ISE) shows the five new SQL Servers.
We can search and filter for the same objects using the Azure Portal.
To drive home the point, the PowerShell script below adds a tag to the new logical server. In our case, this Project is for MsSqlTips.
# New sql server New-AzureRmSqlServer -ResourceGroupName "rg4poc1east" -ServerName "sql4poc1east" ` -Location "East US" -ServerVersion "12.0" -SqlAdministratorCredentials $Credential ` -Tags @{Project="MsSqlTips"}
The image below was taken from the Azure Portal and shows the tag information on the rg4poc1east resource group.
Azure Firewall Rule
There is nothing new to mention here. You can use the Azure Portal, SQL Server Management Studio or PowerShell to add your IP to the white list. Just remember, if you use contained database users and local database firewall rules, these rules move when the database is replicated. Less management overhead. On the other hand, you can just make sure the server level settings are kept in synch on all logical servers.
The image below shows the server level firewall rule captured in the Azure Portal.
User defined database
The sql4poc1east server is considered our primary server located in the East US location. Our first step is to log into this server using SQL Server Management Studio.
The MATH database contains the prime numbers from 2 to N. The value of N is constantly increasing each time the scheduled job is executed. The TBL_CONTROL_CARD table contains the last number checked for being prime. The SP_IS_PRIME stored procedure uses a brute force division algorithm to determine if a number is prime. Any positive results are stored in the TBL_PRIMES table. Last but not least, the SP_STORE_PRIMES stored procedure checks numbers X to Y for primeness.
The image below shows the objects that make up the MATH database. Use the following T-SQL script to create this database on our primary server.
User defined job
The Azure SQL database service does not support user defined jobs like the on-premise database engine. There is no SQL Server Agent offering as a service. However, there are many work arounds for this issue. To name a few, windows scheduler that invokes a sqlcmd file, batch automation jobs with a PowerShell workflow, or elastic database jobs with a native Transact-SQL script. The last option was announced at Ignite 2016 and is still in private preview.
The purpose of this user defined job is to add data to the MATH database. The job calculates prime numbers in batches of 250,000 numbers at a time. The T-SQL code below is the only step in the job. For our testing, the code can be executed as a stand alone script. During this article, running the user defined job means you should execute this script. We can even throw in a TRUNCATE TABLE to test if minimally logged DELETE statement works correctly with geo-replication.
/* Sql Job -- Math Fun: Calculate Primer Numbers */ -- Calculate prime numbers DECLARE @THE_LOW_VALUE [BIGINT]; DECLARE @THE_HIGH_VALUE [BIGINT]; -- Low & High water marks SELECT @THE_LOW_VALUE = [MY_VALUE] FROM [DBO].[TBL_CONTROL_CARD]; SELECT @THE_HIGH_VALUE = @THE_LOW_VALUE + 250000 - 1; -- Start the process BEGIN TRANSACTION -- Run the calculation EXEC SP_STORE_PRIMES @THE_LOW_VALUE, @THE_HIGH_VALUE; UPDATE [DBO].[TBL_CONTROL_CARD] SET [MY_VALUE] = @THE_HIGH_VALUE + 1; -- End the process COMMIT TRANSACTION; GO
Let's execute the above script to add data to the MATH database.
Create Active Geo-Replication (T-SQL)
The ALTER DATABASE statement has been modified to support Geo-Replication for Azure SQL databases. The ADD SECONDARY ON SERVER clause identifies a named server located in a paired region as a geo-replication partner.
I told you that pair regions were important!
However, there is another option that is equally import. The ALLOW_CONNECTIONS option can be set to two values. The ALL value creates an Active Geo-Replication database that is read only. On the other hand, the NONE value creates a Standard Geo-Replication database that can't be read, but is ready for failover. This feature is depricated and all existing secondary databases will be updated to Active Geo-Replication in April 2017.
Execute the Transact-SQL script below from our primary server named sql4poc1east, which is located in East US, to create a MATH database on our secondary server named sql4poc2west, which is located in West US. Please note, I am numbering the servers since we will have one primary database (server) and four secondary databases (servers) in our final configuration.
-- -- Create secondary (west us) -- -- Execute from master ALTER DATABASE MATH ADD SECONDARY ON SERVER sql4poc2west WITH ( ALLOW_CONNECTIONS = ALL ); GO
Monitor Active Geo-Replication (T-SQL)
There are two important system views, located in the master database, that you should be aware of.
The sys.geo_replication_links view contains a row for each geo-replication link between the primary and secondary databases (servers). Check out the replication_state_desc field for the status of the secondary database. It can be in a PENDING state - waiting to create a secondary database, SEEDING state - performing a snapshot copy of database from primary to secondary server, and CATCH_UP state - both databases are transactionally consistent with a few seconds of delay between the two.
The sys.dm_operation_status view tracks the following operations at the server level.
- Create database
- Copy database
- Alter database
- Change service tier
- Create geo-replication link
- Drop geo-replication link
- Restore database
- Delete database
Again, check the state_desc field for the status of each operation. The status can be in a PENDING state - operation is waiting for resources, IN_PROGRESS state - operation is executing, COMPLETED state - operation completed successfully, FAILED state - operation failed and CANCELLED state - operation stopped by user.
-- -- Geo replication link -- select database_id, start_date, modify_date, partner_server, partner_database, replication_state, replication_state_desc, role_desc, secondary_allow_connections_desc from sys.geo_replication_links; go -- -- Server operation status -- select major_resource_id, operation, state, state_desc, percent_complete, start_time, last_modify_time from sys.dm_operation_status; go
Use SQL Server Management Studio to execute the above statements against the master database located on the sql4poc1east server. In the output below, we can see that we have one geo-replication link in which the current server is in the PRIMARY role. Two major server operations have occurred: the creation of the MATH database and the geo-replication of that database to the SECONDARY server.
Execute the above statements against the master database located on the sql4poc2west server. We can see in the below image that this server performs a SECONDARY role in a geo-replication link (partnership). The only server operation that occurred was the seeding of the MATH database from the PRIMARY server.
Now, let us execute the user defined job to find prime numbers between 250K to 500K. Afterwards, execute the Transact-SQL below to review the last five prime numbers that were found.
-- -- Last five rows -- SELECT TOP (5) [MY_VALUE], [MY_DIVISION], [MY_TIME] FROM [dbo].[TBL_PRIMES] ORDER BY [MY_VALUE] DESC
The image below shows that the newest prime numbers have been replicated to the MATH database on the sql4poc2west server.
Is there a way to get a status on how far replication is lagging behind on the secondary server?
The Transact SQL script below provides an answer to this question.
-- -- Geo replication link status -- select partner_server, partner_database, replication_state, replication_state_desc, role_desc, secondary_allow_connections_desc, last_replication, replication_lag_sec from sys.dm_geo_replication_link_status go
The sys.dm_geo_replication_link_status view tracks replication information at the database level. The last_replication and replication_lag_sec fields are of interested since they tell you how behind the SECONDARY database (server) is from the PRIMARY database (server).
The image below depicts the replication link status of the MATH database on the sql4poc2west server.
Active Geo-Replication Failover (T-SQL)
Again, we are going to use the ALTER DATABASE statement to failover the primary database (server) to our secondary database (server).
There are two options that we can use. The FAILOVER option gracefully flushes database transactions and promotes the SECONDARY to PRIMARY. In contrast, the FORCE_FAILOVER_ALLOW_DATA_LOSS promotes the SECONDARY to PRIMARY right away. Use this option when your PRIMARY database (server) is problematic or unreachable.
Just use SQL Server Management Studio to connect to the sql4poc2west server and execute the following query from the master database to cause a failover.
-- -- Promote to primary (west us) -- ALTER DATABASE MATH FAILOVER; GO
The image below was taken from the Azure Portal and shows the sql4poc2west server in the PRIMARY role.
The image below was taken from the Azure Portal and shows the sql4poc1east server in the SECONDARY role.
Remove Active Geo-Replication (T-SQL)
Finally, we are going to use the ALTER DATABASE statement to remove the geo-replication link. This action leaves the secondary database in-tack, but no longer in synch with the primary database.
Use SSMS to connect to the sql4poc2west server and execute the following query from the master database.
-- -- Drop secondary (east us) -- ALTER DATABASE MATH REMOVE SECONDARY ON SERVER sql4poc1east; GO
The image below was taken from the Azure Portal and shows the sql4poc2west server not participating in any geo-replication roles.
In a nutshell, Transact SQL can be used to create, monitor, alter (failover) and remove Active Geo-Replication for Azure SQL databases. In the next few paragraphs, we are going to discover how PowerShell can do these same actions with out any restrictions such as paired regions.
Create Active Geo-Replication (PowerShell)
Using PowerShell to create an Active Geo-Replication link is a two step process.
First, we can get the database context of the primary database by using the Get-AzureRmSqlDatabase cmdlet. This object is piped to the New-AzureRmSqlDatabaseSecondary cmdlet to start geo-replication of the primary database to the secondary server. Unlike Transact-SQL, any two locations specified by resources groups can be used as long as both locations support Azure SQL database. Thus, we can have a database in "East US" replicated to "UK South".
The PowerShell snippet below sets up geo-replication of the MATH database between sql4poc1east server in East US and sql4poc2west server in West US.
# # Create active geo replication (east -> west) # $pri_db = Get-AzureRmSqlDatabase –DatabaseName "MATH" -ResourceGroupName "rg4poc1east" ` -ServerName "sql4poc1east" $sec_db = $pri_db | ` New-AzureRmSqlDatabaseSecondary –PartnerResourceGroupName "rg4poc2west" ` –PartnerServerName "sql4poc2west" -AllowConnections "All"
The image below shows the output from our execution. An unexpected error has occurred.
Can you guess what is wrong?
When replication is started from the PRIMARY to SECONDARY server, a snapshot copy of the database is transferred. This is called SEEDING the secondary database. However, in our case the MATH database already exists on the server. Thus, the PowerShell script generates an error. Just delete the MATH database from the sql4poc2west server and try again.
You should see the following output if things go well.
First screen capture from the PowerShell ISE application shows a new MATH database on the SECONDARY server named sql4poc2west in the West US region.
Monitor Active Geo-Replication (PowerShell)
All monitoring should be done via system views available to Transact-SQL.
The only cmdlet available in PowerShell is Get-AzureRmSqlDatabaseReplicationLink which returns the same information you see when a Active Geo-Replication link is created. Please see above image for output for the below code.
# # What is the status of the link (east, west) # $pri_db = Get-AzureRmSqlDatabase –DatabaseName "MATH" -ResourceGroupName "rg4poc1east" ` -ServerName "sql4poc1east" $sec_db = $pri_db | Get-AzureRmSqlDatabaseReplicationLink ` –PartnerResourceGroup "rg4poc2west” –PartnerServerName "sql4poc2west” $sec_db
Active Geo-Replication Failover (PowerShell)
The command to fail over from the primary database (server) to a secondary database (server) is similar to previous calls.
First, we can get the database context of the secondary database by using the Get-AzureRmSqlDatabase cmdlet. This object is piped to the Set-AzureRmSqlDatabaseSecondary cmdlet to perform the failover. Use the -AllowDataLoss switch when the PRIMARY database (server) is problematic or unreachable.
# # Cause failover from east to west # $sec_db = Get-AzureRmSqlDatabase –DatabaseName "MATH" -ResourceGroupName "rg4poc2west" ` -ServerName "sql4poc2west" $pri_db = $sec_db | Set-AzureRmSqlDatabaseSecondary -PartnerResourceGroupName "rg4poc1east" ` -Failover -AllowDataLoss
Making database changes
A developer might ask "What type of database changes can I make when Active Geo-Replication is enabled?".
One attractive feature of Azure SQL Database (Platform As A Service) is the ability to scale the database processing power up or down depending upon need. This can still be done. However, caution needs to be taken on the database order. Since the PRIMARY database is handling the amount of transactions (data flow), it should be scaled down first and up last. The SECONDARY databases are just the opposite since they should have at least the same amount of processing power as the PRIMARY database. If these rules are not followed, a PRIMARY database can be processing more transactions that a SECONDARY databases can handle. This will result in increased lag times on replication. In turn, the amount of data loss during a disaster can increase.
The image below was taken from the Azure Portal and shows the MATH database located on the sql4poc2west server being manually resized from tier S2 Standard to S0 Standard.
If you like PowerShell, the following script resizes the MATH database located on the sql4poc2west server from tier S2 Standard to S0 Standard.
# # Change the database size (s2 -> s0) # Set-AzureRmSqlDatabase -ResourceGroupName "rg4poc2west" -ServerName "sql4poc2west" ` -DatabaseName "MATH" -Edition "Standard" -RequestedServiceObjectiveName "S0"
How about making changes specified by data manipulation language (DML) or data definition language (DDL) statements?
The PRIMARY database can process any valid DML or DDL statements. In contrast, the SECONDARY database is read-only and can only handle SELECT queries. The image below shows that both the DML and DDL statements fail when executed against the read-only MATH database located on the sql4poc2west server.
Maximum Number Of Secondaries
One Azure SQL Database can have up to four secondary databases when Active Geo-Replication is enabled. Right now, we only have one secondary database. We will modify and execute the create active geo-replication PowerShell script above. For each of resource group and location, we will have a MATH database. See the results of our labor below.
Second screen capture from the PowerShell ISE application shows a new MATH database on the SECONDARY server named sql4poc3uk in the UK South region.
Third screen capture from the PowerShell ISE application shows a new MATH database on the SECONDARY server named sql4poc4aus in the Australia East region.
Fourth screen capture from the PowerShell ISE application shows a new MATH database on the SECONDARY server named sql4poc5jap in the Japan West region.
Testing replication lag time
With SQL Server Management Studio, we can have an active connection to all five databases (servers). The query below is showing data from the Australian data center. At this point, the TBL_PRIMES table contains all prime numbers between 1 and 500K.
One might wonder if the replication lag time increases past 5 seconds as the number of replicas hits a maximum value?
For this test, I truncated the TBL_PRIMES table and ran the database job one time. These T-SQL actions were executed on the primary database located on sql4poc1east server. Thus, we have calculated all prime numbers between 1 and 250K. Around 22 thousand records were replicated to all secondary databases in four different data centers (regions). The image below shows the last transaction occurred at 02:22:42.743.
If we query the sys.dm_geo_replication_link_status system view, we can see the last replicated transaction occurred at 02:22:44.463 and was sent to the database in the Japanese data center. The lag time between all replication links is less than two seconds.
Remove Active Geo-Replication (PowerShell)
To be code complete, we need a routine that can remove the Active Geo-Replication link between the PRIMARY and SECONDARY databases (servers). The PowerShell script below breaks the link between the sql4poc1east and sql4poc2west servers for the MATH database.
# # Remove the active geo replication (west) # $pri_db = Get-AzureRmSqlDatabase –DatabaseName "MATH" -ResourceGroupName "rg4poc1east" ` -ServerName "sql4poc1east" $sec_db = $pri_db | Get-AzureRmSqlDatabaseReplicationLink ` –PartnerResourceGroupName "rg4poc2west" –PartnerServerName "sql4poc2west" $sec_db | Remove-AzureRmSqlDatabaseSecondary
The enclosed script exercises most the of Geo-Replication commands used in this article. I am including this information so that you can perform the same proof of concept that I did.
Summary
Today, we learned that Azure SQL Database has a Recovery Point Object of 5 minutes for simple data center outages and 1 hour for complete data center disasters. The first number is the frequency of transaction backups on LRS and the second number is the frequency of full or differential backups on GSR. If your application is public website that takes internet sales orders, can your company really afford losing that amount of data?
We discovered that Active Geo-Replication can be used to create a readable hot secondary database (server) ready for failover which is at most 5 seconds behind the primary in terms of transactions. While this doubles our cost by adding a second database, it is more than justified for tier 1 applications that need to be highly available. Microsoft has provided the database developer with both Transact-SQL and PowerShell cmdlets to enable and manage this feature. System views at the server and database level can be used to monitor the performance of the Active Geo-Replication.
In a nutshell, use Active Geo-Replication for your important Azure SQL databases. If you want to learn more, check out the Azure Documentation on this subject.
Next Steps
- How can we run Transact-SQL on a schedule against an Azure SQL database? Batch automation jobs are one way to solve this problem.
- Currently, failover to the secondary database (server) is a manual process. Can we create a PowerShell heart beat script that does this work for us?
- Check out these other Azure tips
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips