By: Carla Abanes | Comments (4) | Related: > Clustering
Problem
You have a new project that requires a primary database server in one building and secondary database server in another building to be configured as a Failover Cluster. With this setup the participating nodes will have different subnets. You propose to use the SQL Server 2012 AlwaysOn feature and you are well aware that multiple subnets are supported by the AlwaysOn feature. You may have already setup AlwaysOn in SQL Server 2012, but what needs to be done across multiple subnets? What are the important points to consider for this scenario?
Solution
One of the enhancements to AlwaysOn FCI is the support for multi subnet cluster setup. The ability to implement a multi-subnet FCI has always been included in older versions of SQL Server, but it required a virtual local area network (VLAN). SQL Server 2012 removes that requirement, enabling multi-subnet FCI to be more commonly adopted as a high availability and disaster recovery solution.
In this tip, I share several points to consider when implementing this solution.
SQL Edition
You will need SQL Server 2012 Enterprise Edition for a multi subnet cluster setup.
Storage
Multi subnet FCI will require storage level replication to maintain a copy of the database at the disaster recovery site. Though you have separate copies of the database file at each site, to SQL Server this looks like a shared storage solution.
When choosing storage replication technology, you must ensure that it meets the requirements for SQL Server storage and that the database and logs are synchronously and consistently replicated. The key concern here is that it must adhere to the write ordering requirements of SQL Server and to make sure that the data is consistent even in the event of network failure.
File Share Witness
The purpose of the FSW is to have something else that can count as a vote in situations where the number of configured nodes isn't quite enough for determining a quorum. A FSW is more likely to be used in multi-site clusters or where there is no common storage. A FSW does not store cluster configuration data like a disk. It does, however, contain information about which version of the cluster configuration database is most recent. Other than that, the FSW is just a share.
When choosing that FSW location, it should be outside of the cluster, not on either node participating in the cluster. The location must also not be found in the SAN. This actually happened with me during my setup. I realized that the path created by our SAN team could not resolve when I continued with my AlwaysOn configuration, so I asked for a shared path folder instead.
IP Addresses
SQL Server 2012 now supports failover cluster nodes residing on different subnets. The introduction of "OR" logic allows the use of 2 IP addresses. This means that IP addresses can reside in different subnets, practically eliminating the need for a VLAN.
From the diagram below, on each side of your subnet, you will need to request the following from your Domain or Network Administrator:
- Cluster IP for each node
- Cluster name for each node
- Availability Group IP for each node
- Availability Group name
Now, in order for SQL Server 2012 to deliver very high availability, these IP addresses will be registered with DNS by default.
Although you are using multiple subnets, the nodes participating in your cluster must still reside in the same domain.
Network
If you are doing the setup of AlwaysOn in a multi site cluster, make sure to tune the heartbeat network traffic.
The heartbeat frequency, also called subnet delay, is once every second regardless of the subnet configuration. By default, if a node misses a series of 5 heartbeats, another node initiates a failover. The range for this value, called subnet threshold, is 3 to 10.
Determine the value for the Time-to-Live (TTL) setting in the DNS. By default this value is set to 20 minutes (1200 seconds). For SQL Server 2012, we recommend that this value be changed to 60 seconds. You may run the cluster command below to do this:
If you are not familiar with your Network Name Resource, like me, you can use this command:
It should display all the network resource available to you. Look for the SQL Network Name value and that should be it.
Use this command to check the new value of your new TTL setting:
Another setting to update is the DNS replication delay. This value is set to 180 minutes by default. You will also need to set the replication frequency to at least 15 minutes.
In the event of fail over to another subnet, the online IP address changes accordingly. The Windows Server failover cluster issues a DNS update as soon as the network name resource comes online. But these changes do not take effect immediately. This is because of the local DNS cache on the client machines and also because of the DNS replication latency if the cluster node and the client machines are not using the same DNS server. Therefore, the client machines cannot resolve the network name, and until the DNS replication is complete and the local DNS cache times out causing connection failures.
Next Steps
- You might want to build your own test environment where you can test the setup of clusters and SQL Server 2012 with AlwaysOn. It would be a great opportunity to learn more about clustering and networking aside from the database skills that we already have.
- Check out these related resources:
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips