SQL Server VM Best Practices

By:   |   Comments (4)   |   Related: > Virtualization


Problem

I am building a virtual machine to run SQL Server and I need to know what the best practices are for such a machine.

Solution

The recorded presentation, Running SQL Server on Virtual Machines Best Practices, talks about many different settings related to building SQL Server VMs. This tip will cover many of those same settings in summary and try to answer some of the FAQ that came out of that session. Both the presentation and this tip are skewed heavily toward the Windows operating system.

Operating Systems

While it used to be common to have a difficult decision between editions of Windows Server, with newer versions of the operating system there isn't a significant reason to select one edition over another when building a SQL Server virtual machine. Both Windows 2019 Standard and Datacenter support 64 sockets and 24TB of RAM.

The big decision to be made with regards to the operating system is whether or not to include the full UI or to use the Windows "Core" installation. SQL Servers VMs should focus on serving clients of the SQL Server and not on running RDP sessions with SSMS or any other resource-hogging applications. As such, installing the full UI should be the exception rather than the rule.

If the full UI needs to be installed it would still be a best practice to limit the RDP sessions on the server. Do this by installing client tools like SSMS on workstations or jump boxes that are dedicated to such activities.

Does Microsoft still recommend rebooting SQL server monthly?

  • I reboot my SQL Servers only when they need to be patched. The often means monthly anyway, but if there were no patches I would not reboot.

Editions of SQL Server

The following information is accurate for SQL Server 2019 as of the publication date.

There are very significant differences between the 3 main editions of SQL Server. SQL Server Express is completely free, but is limited to referencing just under 1.4GB of RAM, 1 socket/4 cores of CPU, and a 10GB database. SQL Server Express also does not come with the SQL Server Agent Job scheduler.

SQL Server Standard Edition is the most common edition and can reference 128GB of RAM and 4 sockets/24 cores of CPU. It includes the SQL Server Agent and Basic Availability Groups. Basic Availability Groups are limited to exactly 2 nodes. Starting with SQL Server 2019 Transparent Data Encryption (TDE) became a Standard Edition feature after being Enterprise only since its inception in SQL Server 2008.

SQL Server Enterprise Edition is the most expensive with a list price over 3x higher than Standard Edition. Enterprise Edition removes the CPU and RAM constraints of Express and Standard. It is limited only by operating system limits. It includes a more robust set of HADR options (like Availability Groups with more than 2 nodes) along with several features designed to improve uptime such as online index rebuilds.

Both Standard and Enterprise Editions of SQL Server are most commonly licensed by the CPU core with a minimum of 4 core licenses required per server. A less common method to license Standard Edition is to license each server (regardless of cores) and purchase a client access license (CAL) for every client.

Question regarding Server-based licensing: What if there is only a web service account that is accessing the SQL Server instance and users are connecting via the application itself? Does each user still need to be licensed in SQL?

  • Yes. Even an indirect hit like this requires a CAL for server-based licensing. This is why server-based licensing is almost never the right choice for SQL Server supporting web sites and you should use core based licensing.

vCPUs

Pay attention to the sockets vs cores. SQL Server licensing has separate limits for cores and sockets so this is an easy misstep.

Since the most common method to license SQL Server is by the core, there is real value in spending the money to make sure the VM hosts have high-powered CPUs rather than CPUs that might have a lower cost, lower power consumption, or lower performance statistics. The money spent on the physical hardware might be able to be recouped with savings on the SQL Server licensing cost when the VM runs with less vCPUs.

If possible, get physical CPUs that have a large cache. Modern CPUs often have a power-saving mode which can be great in some scenarios, but not for SQL Server. Make sure all cores are running at full power rather than any sort of power-saver mode.

What if the vendor wants a bunch of CPUs but the specs seem like they are based on physical vs. vCPU?  Do you have a thought on assigning "too many" vCPUs to SQL?

  • If the vendor advice is old enough to be based on physical cores, then they are probably thinking about you having less powerful CPU cores than are commonly available on the market today. Since you generally pay for your SQL Server licenses by the core I would see if I could get by with less cores. In the virtual world you can always add more later if you need to. It is very hard to give vCPU cores back, especially if you already paid for the SQL Server licensing.

Memory

The biggest issue to consider regarding memory on virtual machines running SQL Server is that SQL Server does not like it when the amount of memory assigned to the operating system changes. Do not use Dynamic Memory on a SQL Server. Instead, get a full reservation of all the memory required.

Any opinion on the Lock Pages in Memory setting?

  • Personally, I don't use it for SQL Servers running on VMs. I used to enable the setting when I had physical boxes that had smaller amounts of memory where I felt a stronger need to protect what memory I had. Also, since I no longer let users RDP into my servers, I have less processes running that are likely to attempt to steal pages from the SQL Server buffer pool.

Disk

For this section the FAQ cover it pretty well. Jumping right in…

What is your opinion or recommendation of creating different LUNs for Data/TLOG/Temp?

  • Gone are the days where we need separate volumes for each type of SQL Server file. That was common when disks were internal to servers and different RAID choices were needed to support different workloads. It is still useful to have OS and other binaries on a separate logical volume from SQL Server data and log files.
  • The goal here is to get enough IOPS to get the performance we desire. Whether that is completed via many LUNs or one big LUN doesn't matter.

Should system database files live on the same logical volume as the SQL binaries or with the user databases?

  • I generally keep my system database files right next to my user database files. I would not recommend storing them on the OS volume – despite their small size.

If we can set up separate logical volumes to be optimized for different workloads, would it make sense to separate data/log/tempdb files?

  • If you are able to create different logical volumes that are optimized for different workloads then it makes sense to separate file by type onto these tuned disks.

Do you recommend formatting a Windows logical volume that is going to be used for SQL Server data/log files with a 64k block size?

  • Yes! I probably should have mentioned that in the presentation. Great catch!

What about RAID type for SQL Server for performance?

  • In most modern architectures the disk is supplied to the VMs from very large aggregates on SAN or NAS storage devices. These storage devices often use their own proprietary array technologies rather than traditional RAID numbers. If you are building an environment where you are using internal storage or otherwise allowed to select a RAID type then RAID 1+0 would be the best in most situations.

If you have multiple databases on an instance and one database has a large IO load on the server and some other databases combine to have a much smaller IO footprint - would it help to move this high IO database to its own disk?

  • Would the new volume come from the same disk aggregate or different disks? If it comes from the same aggregate that the first disk uses then not much has been accomplished as no new IOPS are likely to be made available to SQL Server. If the new volume would come from different disks and introduce new IOPS then it would surely help.

Backups

Backing up a SQL Server VM is an important step, but backing up the SQL Server data is of paramount importance. These 2 operations are not the same thing! Backing up the SQL Server data must be accomplished with database backups.

Virtual Machine snapshots or backups are NOT database backups.

SAN snapshots are NOT database backups.

These database backups can be accomplished with native SQL backups (full, differential, log) or by many 3rd party applications that are database aware.

Remember, if something goes wrong with a VM running SQL Server, a new, blank SQL Server VM could be rebuilt from a template in very short order. In that same situation restoring the SQL Server data without effective backups isn't going to happen.

Is my current backup utility database aware?

  • I got this question a lot and can't answer them all here. If you aren't sure, you can put it in the comments below and I'll try to answer everyone there.

Miscellaneous Questions

When I used VM for SQL Server 2008, neither hot add CPU nor hot add memory were possible without a restart. What might be the cause?

  • I'm no expert in such things, but it is my understanding that this is a function of the guest OS version and the options chosen during the initial build of the VM. Being back in the age of SQL Server 2008 it seems like it may well have been a limitation of the OS.

What would be causing Page Life Expectancy (PLE) to fluctuate so much throughout the day?

  • PLE is showing us the age of the data page in the buffer pool that has been there the longest. If some large activity, such as an index maintenance operation, needs a whole lot of data that isn't already in the buffer pool it will push out a whole bunch of data pages from the buffer pool to make room for its own operation. This can cause the PLE value to drop very quickly. After the large operation is over the buffer pool will start to reload with more commonly used data. Since the metric shows us the oldest page in the buffer pool it can only go up by 1 second per second. It will do this until the next time the data is purged from the pool and the value goes down again.

My buffer cache hit ratio is 5173. Why is it not just a 2-digit value like you were showing?

  • It would seem that I just got lucky to get a 2-digit number on my screen during that demo. I referred to Microsoft's books online which told me I could get that value from the dm_os_performance_counters DMV using the "Buffer cache hit ratio" row. Looking around online after getting your question suggested that in order to get the true ratio, I needed to divide that counter by the "Buffer cache hit ratio base" value which books online declares is "for internal use only". See below for the code that will determine the actual cache hit ratio.
DECLARE @base DECIMAL(10,1)ELECT @base = cntr_value
FROM sys.dm_os_performance_counters  
WHERE counter_name = 'Buffer cache hit ratio base'
  AND [OBJECT_NAME] = 'SQLServer:Buffer Manager';
 
SELECT cntr_value / @base * 100
FROM sys.dm_os_performance_counters 
WHERE counter_name = 'Buffer cache hit ratio'
  AND [OBJECT_NAME] = 'SQLServer:Buffer Manager';
Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Eric Blinn Eric Blinn is the Sr. Data Architect for Squire Patton Boggs. He is also a SQL author and PASS Local Group leader.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips



Comments For This Article




Monday, August 15, 2022 - 12:15:35 PM - Patrick Murphy Back To Top (90382)
Based on recent and protracted experience in working with Microsoft engineers, there is one metric that trumps all others when building a SQL Server environment on Azure VM host. That is is Host IO. The points made in this article are correct if all of the data is contained inside the database or data is slowly changing. When large data movement is involved whether with SSIS, replication, backups or bulk operations the HOST IO will ultimately become the bottleneck. Overcoming this issue when compared to an on-prem datacenter server to get the same Host IO MS directed us to scale the VM size by a minimum of 4x...and recommended 8x for growth. Additionally, the standard 'Golden' images in Azure are 'Utility' images and are NOT tuned for SQL Server. This includes networking and disk optimizations for SQL Server. ...It was a painful journey to get a well-performing environment even with MS help and the use of the 'Iperf' tool. https://iperf.fr/

Friday, October 15, 2021 - 10:10:25 AM - Dirk Back To Top (89334)
Veeam is well able to backup a VM running SQL Server and is able to later to restore one or all databases from this VM image backup to another SQL Server. This part works perfectly well, we have been using (and testing) it for a quite long time now.

Veeam is also able to backup transaction logs of SQL Server databases (same is true for Oracle databases). I would suggest NOT to use this feature and backup transaction logs using other methods (e.g. by SQL Server agent, Ola Hallengren, ...) until Veeam manages to implement this feature in a usable way. Restore of transaction logs is currently limited to to a point in time between the previous VM backup that you are about to restore the database from and the next VM backup. e.g. being on a Sunday with VM backups each day, you will not be able to restore the VM backup of Tuesday and then restore all transaction logs from Tuesday to Sunday. You are restricted to restore the transaction log backups just until point in time of the VM backup of Wednesday.
This restriction makes the backup of the transaction logs using Veeam unusable as a means of desaster recovery in case something goes wrong with your data files.

One very important thing: when using Veeam to backup SQL Servers, make double sure that you have a working integrity check job, also do not forget to check its outcome. As Veeam does not read the data and transaction log files when backing up the VM but reads the underlying VM files, there is no way for it to find out about possible data file corruption!!
I can't stress this enough, it may save your butt!!

Tuesday, September 28, 2021 - 11:08:11 AM - Eric Back To Top (89288)
@Marsha

I looked at the Veeam documentation and it does appear to be able to do true application aware backups when set up properly. It probably makes sense to prove your configuration by restoring a SQL Server VM to a point in time in a low-stress test environment.

Monday, September 27, 2021 - 2:43:29 PM - Marsha Foss-Krumrey Back To Top (89285)
I am using Veeam for my backups for my SQL server. We do have the flag set for application aware.

Do I still need to do extra SQL backups in this case.














get free sql tips
agree to terms