Getting started with SQL Server clustering

Problem

We have a suite of databases that support a medical care system. These products have progressed in importance over the years to a point where our health care professionals rely on them for critical patient care. These databases must be available 24×7 with minimal downtime for maintenance or patching and low risk of downtime related to any hardware failures. Some of these databases are quite large and restoring from a backup set would take longer than is allowable to the user base. Our Server Engineering Team has created a Windows 2003 cluster for me to use as a solution, but I am not familiar with the technology. Can you provide me with the basics of what clustering is and how it is going to help me keep these databases consistently accessible to the users that rely on them and the patients whose lives may depend upon them?

Solution

Your environment is typical of many SQL Server deployments today. Many companies originally used SQL Server for non-mission-critical databases and relied on Oracle (for the most part) to host their tier-1 databases. Over the years Microsoft SQL Server has matured into a solid competitor and replacement for Oracle, and we Database Administrators find ourselves hosting critical databases on hardware that was not designed for high availability or was architected without redundancy.

In your case you’ve been given the hardware you need to produce a highly-available database environment. So let’s take a step back from the step-by-step instructions we typically provide at MSSQLTips.com and get a little theoretical for a moment.

What is Clustering?

A Microsoft SQL Server Cluster is nothing more than a collection of two or more physical servers with identical access to shared storage that provides the disk resources required to store the database files.

These servers are referred to as “nodes”. The nodes talk to one another via a private network, sending a heartbeat signal between them. Should one node stop communicating its heartbeat to the other node in the cluster, the secondary node will take ownership of any dependent services being run by the node that lost communication. This process is referred to as “failover”.
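The heartbeat-and-failover idea described above can be illustrated with a small simulation. This is only a conceptual sketch, not how the Windows cluster service is implemented: the `Node` class, the `check_failover` function, the node names, and the five-second timeout are all assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    last_heartbeat: float  # time (in seconds) the node last sent a heartbeat
    resources: list = field(default_factory=list)  # services the node owns


def check_failover(nodes, now, timeout=5.0):
    """Move services off any node whose heartbeat is older than `timeout`.

    The timeout value is hypothetical; it only illustrates the concept.
    Returns a list of (resource, failed_node, new_owner) tuples.
    """
    alive = [n for n in nodes if now - n.last_heartbeat <= timeout]
    failed = [n for n in nodes if now - n.last_heartbeat > timeout]
    moved = []
    for node in failed:
        if not node.resources:
            continue
        if not alive:
            raise RuntimeError("no surviving node can take ownership")
        target = alive[0]  # the first healthy node takes ownership
        moved.extend((r, node.name, target.name) for r in node.resources)
        target.resources.extend(node.resources)
        node.resources = []
    return moved


# NODE-A last sent a heartbeat at t=0 and NODE-B at t=9; it is now t=10.
node_a = Node("NODE-A", last_heartbeat=0.0,
              resources=["SQL Server", "SQL Server Agent"])
node_b = Node("NODE-B", last_heartbeat=9.0)

events = check_failover([node_a, node_b], now=10.0)
for resource, source, target in events:
    print(f"{resource}: {source} -> {target}")
```

In this sketch NODE-A has been silent for longer than the timeout, so both of its services move to NODE-B, which is the automatic failover the paragraph describes.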

A failover can occur either automatically (a server’s heartbeat stops communicating) or manually. A manual failover is useful when patching or some other form of maintenance is required at the physical server level. You would typically implement clustering to ensure that if you ever encounter hardware failure on the physical server hosting your SQL instance, your databases continue to be available to dependent applications and their users.

Unlike other clustering technologies that are implemented for better performance or for increased processing power via load-balancing, SQL clusters are designed to provide highly available databases by eliminating downtime associated with hardware failure. This architectural concept is referred to as “High Availability Clustering” or “HA Clustering” for short. The services and groups of services that are hosted on a clustered node are referred to as resources and resource groups, respectively. Since these resources must be available to all nodes in a cluster, they must reside on a shared disk array in the form of SAN-NAS disk. Each resource group will be mapped to a logical drive that is physically hosted on the shared disk array and will also have its own associated IP address and network name.
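One way to picture a resource group is as a record that bundles the services with the drive, IP address, and network name they travel with. The sketch below is purely illustrative: the `ResourceGroup` class and every value in it (group name, network name, IP address, drive letter, node names) are hypothetical, not output from any cluster tool.

```python
from dataclasses import dataclass


@dataclass
class ResourceGroup:
    name: str
    network_name: str   # name clients use to reach the group
    ip_address: str     # IP address associated with the group
    logical_drive: str  # drive hosted on the shared disk array
    resources: tuple    # services that make up the group
    owner: str          # node currently hosting the group

    def fail_over(self, new_owner):
        """Change the owning node; nothing else about the group changes."""
        self.owner = new_owner


group = ResourceGroup(
    name="SQL Group 1",
    network_name="SQLVIRT01",
    ip_address="10.0.0.50",
    logical_drive="S:",
    resources=("SQL Server", "SQL Server Agent", "Full-Text Indexing"),
    owner="NODE-A",
)

before = (group.network_name, group.ip_address, group.logical_drive)
group.fail_over("NODE-B")
after = (group.network_name, group.ip_address, group.logical_drive)

print(group.owner, before == after)
```

The point of the model is that only `owner` changes during a failover; the drive, IP address, and network name stay with the group.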

The SQL Server Installation Process on a Cluster

The SQL Server installation process detects when an installation is being attempted on a clustered node and will prompt you as to whether you wish to configure the SQL instance as clustered or not. If you proceed with creating a clustered instance of SQL Server, the instance will be hosted on a “Virtual” Server. Resources such as data and log files will be created on the shared SAN-NAS disk for SQL Server, SQL Server Agent, and Full-Text Indexing.

If selected in the installation process, Notification Services and Analysis Services are also cluster-aware in SQL Server 2005. Unlike the data and log files, the program files for the instance are installed on the local drives of each clustered node in an identical fashion, and registry values are set identically across all clustered nodes. Since the “Virtual” server resides solely on the SAN, it can be “owned” by any of the nodes you allow. Each node can run these resources in identical fashion because each physical server/node has the program files and identical registry settings necessary to run the SQL instance.

Furthermore, the users are oblivious to the underlying fluidity of the server. They connect to it as they would any other physical server: by server name (virtual server name in this case) if the default instance or by virtual server name\instance name if a named instance. This is key for application connectivity. Since the SQL instance simply changes ownership during a failover, connection strings the applications rely on to connect to their databases need not be recoded; the physical server may become unavailable, but the virtual server persists after the failover.
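As a rough illustration of the naming rule above, the helper below builds the server name a client would use for a default or a named instance. The function names and the sample names `SQLVIRT01`, `INST1`, and `PatientCare` are hypothetical, and the `Server=...;Database=...;Trusted_Connection=yes;` layout is just one common connection-string style; the exact keywords depend on the driver your application uses.

```python
def server_target(virtual_server, instance_name=None):
    """Return the server name a client connects to.

    Default instance: the virtual server name alone.
    Named instance:   virtual server name\\instance name.
    """
    if instance_name is None:
        return virtual_server
    return f"{virtual_server}\\{instance_name}"


def connection_string(virtual_server, database, instance_name=None):
    """Build an illustrative connection string (keywords vary by driver)."""
    target = server_target(virtual_server, instance_name)
    return f"Server={target};Database={database};Trusted_Connection=yes;"


# The physical node names never appear, so a failover from one node to
# another leaves these strings unchanged.
print(connection_string("SQLVIRT01", "PatientCare"))
print(connection_string("SQLVIRT01", "PatientCare", instance_name="INST1"))
```

Because the string is built only from the virtual server name, nothing in it needs to change when ownership of the instance moves between nodes.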

Active/Active or Active/Passive Clustering

Clusters are often referred to as either Active/Active or Active/Passive. Just as you would expect from the name, in an Active/Active cluster there will be two or more nodes, each one owning an instance of Microsoft SQL Server. If one node fails, the instance it owns fails over to the other node, running alongside (and contending for resources with) the other instance. An Active/Passive architecture requires that no matter how many nodes make up the cluster, at least one node is not the owner of an instance of SQL Server. It is “passive” and exists only to take ownership of a SQL instance when the node hosting that instance fails over.
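The distinction above comes down to whether any node owns no SQL Server instance. A minimal sketch of that rule follows; the `cluster_model` function and the node and instance names are invented for illustration.

```python
def cluster_model(instances_by_node):
    """Classify a cluster from a mapping of node name -> owned SQL instances.

    Active/Passive: at least one node owns no instance.
    Active/Active:  every node owns at least one instance.
    """
    if len(instances_by_node) < 2:
        raise ValueError("a cluster needs at least two nodes")
    idle_nodes = [node for node, owned in instances_by_node.items() if not owned]
    return "Active/Passive" if idle_nodes else "Active/Active"


two_active = {"NODE-A": ["INST1"], "NODE-B": ["INST2"]}
one_spare = {"NODE-A": ["INST1"], "NODE-B": ["INST2"], "NODE-C": []}

print(cluster_model(two_active))
print(cluster_model(one_spare))
```

Note that the three-node example is still Active/Passive even though two nodes are active, because one node is held in reserve.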

Current Microsoft licensing policies require you to license only the active nodes running Microsoft SQL Server. The passive node need not be licensed.

How Many Nodes?

Today’s clustering technology under Windows 2003 and Microsoft SQL Server 2005 Enterprise Edition allows for up to eight nodes to be combined into a single cluster. The release of Windows 2008 and Microsoft SQL Server 2008 Enterprise Edition will bring with it the ability to double that to sixteen nodes. (You are limited to two nodes if you utilize SQL Server Standard Edition.) Do you want to cluster multiple databases of various SLAs within many nodes on a single cluster? Is it beneficial to dedicate a two-node cluster to a single database? The answer is: “It Depends.” We will look into this in detail in a future tip.

Pros and Cons

While clustering protects you from hardware failure relating to the server hosting the SQL Server instance, it does not protect you from media failure. Unlike replication, database mirroring, or log shipping, there is only a single copy of your database. If the SAN-NAS encounters a failure, you could incur not only downtime but possibly data loss. It is recommended that you combine your clustering configuration with SAN-NAS redundancy or database mirroring to protect against media failures.

Hardware and licensing costs may be high. In an Active/Passive clustering model you’ll purchase hardware you hope never to use. The cluster build is also more complex than a standalone server setup; the physical build of the cluster, however, is outside the scope of this discussion.

Additional benefits of clustering include a simple SQL Server installation and ease of administration and maintenance.

Summary

There is plenty to consider when planning on clustering SQL Server. Hopefully you were able to get a better understanding of what clustering is and an idea of the terminology associated with clustering SQL Server 2005. In the remaining tips in this series we will review these items in greater detail, walking you through the decisions you’ll make on architecture, the installation process, the tools available to monitor the state of the cluster and its resources, as well as comparing clustering to the other high-availability options associated with Microsoft SQL Server.

Next Steps

Stay tuned for detailed tips on each of the following clustering topics: SQL Server Installation Process; Active/Active v. Active/Passive Architecture; Multiple Instances – Multiple Nodes: Why, When, and How?; Pros and Cons of High-Availability Options.
Learn more about hardware options for the SQL Server DBA here.
MSSQLtips.com provides a variety of information on Database Mirroring.

Tim Ford

Tim Ford is a Senior Database Administrator with MindBody in San Luis Obispo, California and is in the process of relocating west to the Pacific Northwest from Michigan. Since 2010 he’s produced Microsoft Data Platform training events branded as SQL Cruise from Alaska to the Caribbean and the Mediterranean at Tech Outbound, an events company specializing in technical training in unconventional locations. Tim is also the Executive VP of Marketing for PASS, the global association for Microsoft data professionals, and a contributing author for itprotoday. Tim loves helping people find their true potential through education and building networks between Thought Leaders in various fields and those who are just starting on their careers or struggling to find their footing in established careers. If you’re looking for this sort of experience then check out the next SQL Cruise event taking place this August in Seattle.

MSSQLTips Awards: Achiever (75+ tips) – 2010