General Purpose Tier for Azure SQL Database
Five years ago, there were a limited number of ways to deploy and use the SQL Server engine in the Azure Cloud. Today, there are many different services available.
How do you determine which service to use?
Data Architects should focus on key attributes of your data to pick the correct service. Certain services have limits on size, performance and movability of database. Other factors, such as the budget of your project, may also come into the decision-making process.
Transferring a database to the cloud is usually the easiest part of the project. Scheduling batch processes, executing maintenance jobs, and providing alerts is usually part of the holistic solution. For companies that have a large amount of ETL packages, a lift and shift of this code might be a consideration. Other companies are more security conscious which means auditing, compliance, and threat detection are key factors in the decision.
How can we deploy a reserved capacity database for a budget-oriented customer?
Microsoft announced the general availability of reserved capacity databases in August of 2018. This virtual core offering included both the General Purpose and Business Critical tiers. Today, we are going to focus on the General Purpose tier for Azure SQL database which is ideal for most business workloads. This offering is budget-oriented with scalable compute and storage options. The reduction in service tier cost comes with lower transaction throughput and higher latency numbers.
Please note that Azure Database managed instance is a different deployment model that provides native virtual networking integration (V-NET) and higher Transact SQL compatibility. Also, this model supports both virtual core offerings and will be discussed in a future tip.
Our boss has asked us to look into the new virtual core offerings of Azure SQL database. We have several on premises databases with low to medium workloads. The code within the databases complies with the ANSI standard of the structure query language (SQL). Therefore, migrating the databases to the cloud does not require managed instance.
We need to choose a simple problem that can be solved on our local laptop and in the cloud. This will give a performance comparison of the new offering. This proof of concept will use the math database schema which calculates prime numbers from 1 to N. In our case, we want to know how many primes numbers exist between 1 and 5 million. We want to execute 20 asynchronous jobs to accomplish this task as fast as possible.
The rest of the article will show you how to deploy an Azure SQL database (General Purpose edition) using the math database schema. Comparing cloud to on premise execution times allows us to gauge how fast the Azure virtualized environment is.
The Basic (DTU), Standard (DTU) and General Purpose (v-Core) service tiers use the same architectural design seen in the image below. Please see Microsoft documentation for full details. The standard availability model includes two layers: stateless compute and stateful data.
The stateless compute layer runs the database (sqlservr.exe) process and contains only transient and cached data. Data resides either on the local SSD or in memory. This stateless node is operated by Azure Service Fabric that initializes database engine, controls health of the node, and performs failover to another node if necessary.
The stateful data layer stores the database files in Azure Blob storage. Azure blob storage has built-in data availability and redundancy features. It guarantees that every record in the log file or every page in the data file will be preserved even if SQL Server process crashes.
A temporary outage can be caused by a database engine upgrade, an operating system upgrade, or unexpected failure. The Azure Service Fabric will move the stateless SQL Server process to another stateless compute node with free capacity. Data in Azure Blob storage is not affected by the move, and the data/log files are attached to the newly initialized SQL Server process. This process guarantees four 9’s of availability.
Heavy workloads may experience some performance degradation during the transition since the new SQL Server instance starts with cold cache. Applications should be designed for retrying database connectivity when an outage occurs.
In a nutshell, this architecture design has served very well for many years. The only difference between the Basic/Standard (DTU) and General Purpose (v-Core) tiers is the dedication of computing resources. Just remember that any shared cloud computing resource may encounter the noisy neighbor. Most cloud vendors have eliminated this issue by using intelligent resource management algorithms.
There are predominately two hardware generations in use for deploying Azure SQL database with dedicated cores.
The fourth generation Intel processors are based on the Broadwell architecture. Each database deployed with a GEN4 v-core is not hyper-threaded, has a larger memory to core ratio, uses standard solid-state drives, and does not have network acceleration. However, the total number of cores is capped at 24. Microsoft considers this offering as deprecated since this is an older chip set.
The table below shows the characteristics of the GEN4 offering. Please see the v-core documentation for details. There is another Microsoft documentation link for v-Core resource limits that states slightly different numbers. The IOPS numbers are based on the premium tier disks characteristics. Thus, larger databases are given more IOPS. This fact will come into play when we use the Azure SQL Managed Instance service that gives the Data Architect control over the files associated with the database.
|4||RAM||7 – 168 GB|
|4||MAX DB SIZE||4 TB|
The fifth generation Intel processors are based on the Haswell architecture. Each database deployed with a GEN5 v-core is hyper-threaded, has a lower memory to core ratio, uses standard solid-state drives, and has network acceleration. The total number of cores is capped at 80. This is the default v-Core offering in the Azure Portal.
Since taking screenshots for this article in June 2019, Microsoft has released two new processor offerings in preview. The Fsv2 series is compute optimized and the M series is memory optimized. The screen shot below shows the new offerings on an Azure Portal web page.
The table below shows the characteristics of the GEN5 offering. Please see the v-core documentation for details.
|5||RAM||5 – 408 GB|
|5||MAX DB SIZE||4 TB|
Azure Portal Deployment
Every Azure SQL database regardless of model or tier is associated with a logical server. The snippet shown below is a deployment of a new server named svr4tips2019 in the East US region. The allow azure services check box enables other applications like Data Factory to connect to the database. To finish filling out the new server dialog box, please supply a known server admin login and a random password.
The Azure Portal allows the data platform designer to pin objects to a custom dashboard. I like using this technique since objects such as a virtual machine have many components. Usually, I am only interested in the topmost object. The image below shows the logical database server being added to the default dashboard.
Once a logical server is defined, there are many ways to deploy databases. For now, we will complete the deployment of the three databases using the portal.
The image below shows the database configuration for an Azure SQL database using a provisioned GEN4 computing tier and four virtual codes. Please complete the deployment for a database with the name of db4tips2019gpxgen4.
The image below shows the database configuration for an Azure SQL database using a provisioned GEN5 computing tier and four virtual codes. Please complete the deployment for a database with the name of db4tips2019gpxgen5.
The image below shows the database configuration for an Azure SQL database using a serverless GEN5 computing tier and four virtual codes. Please complete the deployment for a database with the name of db4tips2019svrless.
The image below shows the three databases pinned to the dashboard.
At this point one can create the test MATH database schema and execute the stored procedure. However, I want to explore different ways Azure databases and servers can be deployed. Please right click each database on the dashboard and choose the delete option. In the next section we will review how Microsoft Transact SQL can be used to deploy databases.
Every Azure SQL database is protected from intrusion with a firewall. Please make sure that the IP for your local laptop has been added to the logical server.
I am still using SQL Server Management Studio (SSMS) for DBA tasks. Please make sure you have the latest version of the software installed on your local laptop.
The image below shows a typical login using the account I created during deployment of the logical server.
At this point, the object explorer shows a logical server with no databases.
The CREATE DATABASE syntax has been modified with new edition and service objectives. The code below deploys a provisioned generation 4 computing database. Execute this code from the master database. Repeat this process for both a serverless and provisioned computing databases using generation 5 hardware.
-- -- Given a logical server, create three databases -- -- Drop GEN4 dbms DROP DATABASE IF EXISTS [db4tips2019gpxgen4] GO -- Create GEN4 dbms - 4 core CREATE DATABASE [db4tips2019gpxgen4] ( EDITION = 'GeneralPurpose', SERVICE_OBJECTIVE = 'GP_GEN4_4' ); GO -- Drop GEN5 dbms DROP DATABASE IF EXISTS [db4tips2019gpxgen5] GO -- Create GEN5 dbms - 4 core CREATE DATABASE [db4tips2019gpxgen5] ( EDITION = 'GeneralPurpose', SERVICE_OBJECTIVE = 'GP_GEN5_4' ); GO -- Drop SVRLESS dbms DROP DATABASE IF EXISTS [db4tips2019svrless] GO -- Create SVRLESS dbms - 4 core CREATE DATABASE [db4tips2019svrless] ( EDITION = 'GeneralPurpose', SERVICE_OBJECTIVE = 'GP_S_Gen5_4' ); GO
Make sure your refresh the database node in the object explorer. The image below shows all three database types being successfully deployed.
Again, right click each database and choose the drop database action. This same action can be performed by executing a DROP DATABASE T-SQL statement from the master database.
Azure Cloud Shell
The Azure Cloud Shell is a browser-accessible shell for managing Azure resources. It provides the flexibility of choosing the shell experience that best suits the way you work, either Bash or PowerShell. The last deployment method will use the Azure Command Line syntax to deploy objects. This object is very close in syntax to the PowerShell cmdlets. Please see Microsoft documentation for details.
To launch an Azure Cloud Shell session, choose the icon that looks like a PowerShell prompt from the Azure Portal main screen. The first execution of the shell for a particular user will ask which type of environment that you want installed.
Cloud Shell machines are temporary, but your files are persisted through a mounted file share. The first launch of the shell prompts the user to create a resource group, storage account, and file share on your behalf. This one-time step will be automatically attached for all future sessions.
The image below shows the successful creation of the cloud drive. At this point, you can start typing in commands to execute at the prompt.
A question that one might have is why use the Azure Cloud Shell?
The Microsoft on premises PowerShell cmdlets have been changing over the last 10 years. First, there was the classic deployment model. Second, the resource manager deployment model introduced the idea of resource groups. See this article for details. Today, there is the cross-platform model that allows execution of scripts on both Windows and Linux. The last model was released in December of 2018 and contains the abbreviation Az for shortness. This transition of models makes it difficult to keep machines up to date with the latest modules. The Azure Cloud shell has the latest modules and many of the new service offerings release deployment code for the Azure Command Line Interface (CLI) first.
The "az sql server list" command shows any logical servers deployed in a particular resource group. Our server named svr4tips2019 is shown in the image below. The next two images have been modified with MS paint to keep the original content, but to eliminate space. Therefore, wrapping of commands or text might occur.
The Azure Command Line Interface is extremely powerful. We can manage all aspects of an Azure SQL deployment. The "az sql db create" command can be used to deploy a provisioned v-core database. The image below shows the deployment of the db4tips19gpxgen5 database. I only include the first 25 lines of output. However, all the nitty gritty details of the database are listed for review.
The Azure CLI commands below can be used to deploy all three database types.
-- -- Given a logical server, create three databases -- # List servers in group az sql server list -g rg4tips2019 # List databases given server and group az sql db list --server svr4tips2019 --resource-group rg4tips2019 # Create gen5 database az sql db create -g rg4tips2019 -s svr4tips2019 -n db4tips19gpxgen5 -e GeneralPurpose -f Gen5 -c 4 --compute-model Provisioned # Create gen4 database az sql db create -g rg4tips2019 -s svr4tips2019 -n db4tips19gpxgen4 -e GeneralPurpose -f Gen4 -c 4 --compute-model Provisioned # Create serverless database az sql db create -g rg4tips2019 -s svr4tips2019 -n db4tips19svrless -e GeneralPurpose -f Gen5 -c 4 --compute-model Serverless --auto-pause-delay 120 # Exit the shell exit
I will not go over the details on how to calculate the prime numbers from 1 to 5 million using 20 different batch command shells running T-SQL in parallel. Please see my prior article that has the details for that task. This test (stored procedure) is a computationally heavy operation. We want to create a database schema and execute the tests for the various database configurations shown in the images below. All tests are using four virtual cores for computing power.
The first image shows the local execution of the test on a laptop running Windows 10 as an operating system and SQL Server 2016 as the database engine.
The second image shows the results of the test using a provisioned Azure SQL database using the GEN4 computing.
The third image shows the results of the test using a provisioned Azure SQL database using the GEN5 computing.
The fourth image shows the results of the test using a serverless Azure SQL database using the GEN5 computing.
The fifth image shows the results of the test using a provisioned Azure SQL managed instance using the GEN4 computing.
The sixth image shows the results of the test using a provisioned Azure SQL managed instance using the GEN5 computing.
The table below shows a summary of all the testing that was performed.
|Test No||Description||Configuration||Compute (4 cores)||Time|
|1||Local Laptop||WIN 10 / SQL DB 16||GEN6||235|
|2||Provisioned General Purpose||AZURE SQL DB||GEN4||363|
|3||Provisioned General Purpose||AZURE SQL DB||GEN5||426|
|4||Serverless General Purpose||AZURE SQL DB||GEN5||437|
|5||Provisioned General Purpose||AZURE SQL MI||GEN4||307|
|6||Provisioned General Purpose||AZURE SQL MI||GEN5||387|
Regardless of compute generation, the Azure Managed Instance offering is almost 1 minute faster than the Azure SQL Database offering. Of course, the on-premises hardware came in as number one on the list. Last but not least, the serverless database took an extra 10 seconds to execute since the database might have been in a paused or sleep mode.
Price vs Performance
Many projects sponsored by the business line are on a tight budget. Therefore, presenting different options to the business line allows for an informed decision. We will be curious about how price correlates to performance.
It has been decided that a 20 to 24 core machine is needed for our business work load. We want to have the ability to store up to 4 TB of data. The DS15 virtual machine has twenty core and 160 GB of memory. The Azure SQL database and Azure SQL Managed Instance offerings come with 24 cores and 122 GB of memory. We decided to use GEN5 computing since the network IO is accelerated.
The chart below shows some sample prices pulled from the Azure pricing calculator.
|1||Virtual Machine||DS15 - SQL DB 19||GEN6||7413|
|2||Provisioned General Purpose||AZURE SQL DB||GEN5||4889|
|3||Provisioned General Purpose||AZURE SQL MI||GEN5||4885|
There is a handful of dollars difference between the two Platform As A Service (PAAS) offerings. The cost quoted above is for pay-as-you-go without any discounts. There are pricing discounts for Azure Hybrid Benefit (AHB) and Reserved Instance (RI). Consult your local Microsoft account representative for details.
We did not perform any testing with Infrastructure As A Service (IAAS) represented by the virtual machine. Also, the price of the offering is around $2500 greater than the rest per month.
It would be wise to point out the fact that Azure Managed Instance performed better than the rest with our current workload. I would repeat the workload test a couple more times to make sure that a consistent result is achieved. If the numbers do not change, then the business client should choose Azure Managed Instance as the target platform.
The Azure Cloud is a complex ecosystem that is continuously changing. The Azure SQL service might have only three tiers – (basic, standard and premium) – in the recent past. Today, there are a variety of offerings that a data architect can choose. The use of researching the characteristics of the offerings and comparing performance baselines is key in selecting the correct service and tier to deploy.
The reserved capacity databases have two intel hardware generations that one can choose from. Unless there is a great performance difference between the two, it is wise to choose the GEN5 offering. Microsoft will probably be eliminating the GEN4 systems over time due to age.
The General Purpose tier is ideal for most business workloads. This offering is budget-oriented with scalable compute and storage options. Even though I have not yet written about the Managed Instance offering, it was included in the performance testing.
One limiting factor of this General Purpose offering is the number of data IOPS is capped at 7,500. This number relates to the premium storage disk used in Azure Blob Storage. With all testing parameters being equal, the managed instance offering was quicker than the Azure SQL database offering.
Why is this happening?
One potential problem is the log I/O rate increasing slowly while the data I/O rate rapidly. What I do not like about this documentation is that log rate is in MBPS and data rate is in IOPS. The calculation from MBPS to IOPS requires a known KB per I/O. However, an estimate of 8 to 64 is shown in the documentation. It is hard to create log rate in IOPS given a random variable. This non proportional growth trend occurs in both the General Purpose and Business Critical offering.
In summary, there are only a few choices that one can make when choosing dedicated computing capacity. This includes General Purpose, Business Critical, and Hyper Scale. Today’s investigation of the General Purpose tier reviewed three ways to deploy a logical server and database; compared a given work load between different deployment configurations; and used the pricing calculator to figure out a monthly cost. In short, the General Purpose tier has a reduced total cost of ownership compared to a virtual machine. As an added bonus, high availability and backups are part of the service.
Next time, I will be talking about how to compute faster with the Business Critical tier. Of course, this offering is a lot more expensive.
- Business critical version of Azure SQL Database
- Going hyperscale with Azure SQL Database
- Why choose Managed Instance over other SQL offerings?
- Deploying Azure SQL Databases using the new Az Module
- Executing SSIS packages with Azure SQL Database
About the author
View all my tips