Array Based Snapshots for the SQL Server DBA
Using SQL Server backup files is a tried-and-true approach for restoring and recovering databases in planned or unplanned scenarios, refreshing data in non-production environments such as development or test, and seeding data for reporting. However, there is another method that could enable those scenarios: snapshots.
The concept of a snapshot exists in both hardware and software. No matter where one is generated, a snapshot is a static image captured at a point in time of data that is created nearly instantaneously. In this article, you will learn about the different types of snapshots and why every DBA should consider them for their tool arsenal.
Types of Snapshots
There are different types of snapshots: hardware-based and software-based snapshots.
Array-based snapshots are generated at the hardware storage layer through native or third-party utilities. This type of snapshot represents a volume at a single point in time and points back to the original volume; it does not store the data multiple times.
In the early days of hardware-based snapshots, there were times when taking a snapshot could cause issues with SQL Server. Much like virtualization in its infancy, it was not usually the best fit for SQL Server. Things have come a long way, and the problems that may have plagued snapshots are no longer here, but the fear, uncertainty, and doubt remain.
There are two main hardware-based snapshot variants: application consistent and crash consistent.
Application Consistent Snapshots
An application consistent snapshot quiesces what is running in a physical or virtual server using some sort of integration. For Windows Server, that is the Volume Shadow Copy service (VSS). An application consistent snapshot temporarily freezes the I/O, quiesces memory, flushes all pending I/O to ensure consistency, takes a block-level snapshot, and resumes the I/O. Application consistent snapshots are historically recommended for database systems since they ensure consistency. There is a risk when data is inconsistent due to the possibility of corruption.
Crash Consistent Snapshots
A crash consistent snapshot does not quiesce memory or flush pending I/O. It is a point-in-time capture of what is on the disk when the snapshot is generated. For non-database workloads such as a file server, a crash consistent snapshot works well. These snapshots can be used with SQL Server. To ensure consistency, SQL Server's crash recovery uses the transaction log to roll forward or back transactions as needed when reverting to a crash consistent snapshot.
Crash consistent snapshots, when taken properly, do not introduce database corruption.
Two types of software-based snapshots will be discussed: SQL Server and virtual machine (VM) snapshots.
SQL Server has two snapshot features: database snapshots and Transact-SQL snapshots.
Database Snapshots - Database snapshots were introduced in SQL Server 2005. It is a read-only, static, point-in-time view of a source database in a SQL Server instance. They are only used for reporting or other read access. A database snapshot always exists on the same instance as the source database.
Transact-SQL Snapshot Backups - Transact-SQL snapshot backups were introduced in SQL Server 2022. Unlike database snapshots, this functionality is designed for backups - not read-only use - and combines both hardware- and software-based snapshots. Transact-SQL snapshot backups are meant to be used in conjunction with array-based snapshots to generate a SQL Server backup. Using this functionality will generate the equivalent of an application consistent snapshot without the need for VSS and SQL Writer.
The process to generate a Transact-SQL snapshot backup is as follows:
- Suspend the I/O for the database(s)
- Generate the array-based snapshot(s)
- Issue the Transact-SQL statement to back up the database(s) using the METADATA_ONLY clause.
- Resume the I/O for the database(s)
The process for restoring a database from a Transact-SQL snapshot backup is as follows:
- Mount the array-based snapshot(s) to the target server
- Issue the Transact-SQL statement to restore the database using the METADATA_ONLY clause. Add the NORECOVERY clause if using transaction log backups to roll forward in time.
To see an example of the syntax for a Transact-SQL snapshot backup, see the "Create a Transact-SQL snapshot backup" link listed in the "Next Steps" section.
This method is an extremely fast way to create a database backup and restore a database compared to using a traditional file or filegroup-based method.
A snapshot of a VM captures the state and data at a specific point in time. This functionality allows administrators to revert a VM back to the snapshot. These snapshots are issued via the hypervisor hosting the VM. A single VM can have multiple snapshots to allow for various points in time. A VM snapshot is recommended more in non-production environments.
Array-based Snapshot Benefits for SQL Server Deployments
There are a few reasons to consider implementing snapshots with SQL Server deployments.
Reduce Backup and Recovery Times
Traditional SQL Server backups can potentially take some time to generate as well as restore. Many factors can affect backup and restore times. For some scenarios, creating or reverting to a snapshot will be a faster and better option since those operations are often measured in seconds. As snapshots are pointers to data at a point in time rather than an actual copy of the data, they are extremely fast to create and do not consume much space compared to traditional backups. In most cases, the snapshot can be created almost instantly.
Note: An array-based snapshot is different from a traditional SQL Server backup. A snapshot supplements and can enhance a SQL Server environment (see later in this article for benefits).
Lower Storage Consumption
Storing multiple copies of data and backups consumes quite a bit of space. With the combination of deduplication in FlashArray and the fact that snapshots point back to the original pages, leveraging hardware reduces the overall capacity needed for databases and multiple copies.
Protect Against Ransomware, Malware, and Human Error
Ransomware is something every company, small or large, needs to safeguard against. For example, according to Forbes, attacks on United States-based healthcare systems have cost the economy nearly $8 billion. Malware is another security consideration.
Did someone issue an UPDATE without a WHERE clause? DROP a table accidentally? Update the operating system, and now things are not functioning?
Servers and data must be protected to ensure recovery if things like this occur. An array-based snapshot is one method to protect against these conditions and quickly revert to a previous state.
FlashArray has immutable snapshots that cannot be impacted by ransomware or malware even if systems are affected, nor can they be deleted. Immutable snapshots help the data recovery process since companies can revert to the latest snapshot.
Refresh Data in Non-Production Environments and Reporting Data
Array-based snapshots are ideal for refreshing data in non-production environments like development, testing, staging, and reporting. Why spend hours restoring a traditional SQL Server backup? Use snapshots to revert development, test, and QA environments. Developers mess up their environment? Revert back to a snapshot quickly.
Need a newer copy of data elsewhere, especially for reporting? Stale data that is many hours, days, or weeks old may not be optimal. While a snapshot is a single point in time because it can be generated and then presented to another host quickly, refreshing allows more current data to be used without it taking forever to refresh.
Seed Always On Availability Group Secondary Replicas
They are also helpful in auditing between points in time. The key is that snapshots are point-in-time views of the data. You can create multiple snapshots and compare the data between snapshots or even the source database.
Array-Based Snapshot Considerations
This section will discuss storage-based snapshot considerations for SQL Server deployments.
Some nuances must be configured to ensure SQL Server systems can use snapshots properly, such as where data and transaction log files are placed, how volumes with data and transaction logs are grouped so a snapshot can be generated simultaneously, and more. For virtualization, depending on the configuration, a snapshot may affect all VMs on a volume or only one, depending on the storage configuration.
Work with Other Administrators to Determine a Snapshot Strategy
For storage-based snapshots to be successful in an environment, DBAs must work with other administration teams, such as storage and virtualization. Since "snapshot" is a generic term that means different things to different people, clarifying where a snapshot may occur is crucial along with the impact on SQL Server since snapshots generated on hardware may not be done by DBAs. Because hardware-based snapshots in the past have impacted database systems negatively, removing the fear, uncertainty, and doubt around them is important. None of this happens without communication.
For example, as part of a recovery plan, will array-based snapshots for VMs or databases be used in certain scenarios? SQL Server backups? Know when each will be used and how the teams will work together to ensure the company is back up and running. When transaction log backups will be needed, that may involve more than one team.
Some general caveats:
- If the source of the snapshot has any corruption, that will also exist in the snapshot.
- Snapshots are not long-term, archival storage. That should be done via a traditional SQL Server backup.
- When configuring storage, understand that because a snapshot is a picture of an entire volume, the total volume's contents will be reverted if used.
- Transaction log backups cannot be used to roll a database forward if the database is in the Simple recovery model.
- Have a process for cleaning up older snapshots that are no longer needed.
Call to Action - Use Array-Based Snapshots for SQL Server
Array-based snapshots, whether application or crash consistent, are not a new technology. They are proven and work well. In the early days of hardware-based snapshots, some may have encountered challenges with SQL Server, but modern storage solutions are different. Array-based snapshots make SQL Server deployments more agile by enabling scenarios, especially ones that involve recovery or moving data, that do not exist natively in the operating system or SQL Server. There is no better time to start conversations in your organization to see how array-based snapshots can be leveraged.
To learn more about the information presented in this article, consult these links:
- Snapshots vs. Backups vs. Replication: What's the Difference?
- The Advantages of SQL Server Snapshots
- Take and Store Nearly Half a Million Snapshots with Purity 6.4.5
- Microsoft SQL Server - How do snapshots make your life easier?
- Backups, Snapshots, Microsoft SQL Server, and FlashArray
- Overview of Virtual Machine Snapshots in vSphere
- Create a Transact-SQL snapshot backup
- Database snapshots (SQL Server)
- PureStorage.FlashArray.Backup PowerShell Module
- Seeding an Availability Group Replica from Snapshot
- Snapshots for Dummies
About the author
View all my tips
Article Last Updated: 2023-11-14