This is a standard outline of the process that should be followed for configuring and installing a new cluster for SQL Server use and one or more SQL Server instances (SQL 2000 and 2005 information included). The outline starts with reference materials at the top, followed by a general outline of the process that will be followed and a listing of some pre-work that should be completed prior to starting with the forming of the cluster and/or SQL installation. This same process can be followed for a non-clustered SQL installation, simply ignore the cluster-specific portions.
I strongly recommend you turn on/show the Document Map when viewing within Word, it shows a great outline of the document that you can use to quickly and easily navigate within the document.
KB's and TechNet Articles
Cluster Overview, BP's, and Architecture
1. Technical Overview of Windows Server 2003 Cluster Services http://www.microsoft.com/windowsserver2003/techinfo/overview/clustering.mspx
2. Server Clusters: Cluster Configuration Best Practices for Windows Server 2003 http://www.microsoft.com/downloads/details.aspx?FamilyID=98bc4016-31a1-42fb-9730-4fab59cf3bf5&displaylang=en
3. Backup and Recovery Best Practices for Windows Server 2003 http://www.microsoft.com/technet/prodtechnol/windowsserver2003/technologies/clustering/sercbrbp.mspx
4. Best Practices for Securing server clusters http://www.microsoft.com/technet/prodtechnol/windowsserver2003/library/ServerHelp/f64e46ba-2d09-4f1a-ba9c-f2b1f71821eb.mspx
5. Windows Cluster Service Troubleshooting and Maintenance http://www.microsoft.com/downloads/details.aspx?FamilyID=a2439406-4723-4a94-bc8c-2ff07721ca96&DisplayLang=en
6. Windows Server 2003 Clustering Services Technical Library http://www.microsoft.com/technet/prodtechnol/windowsserver2003/technologies/genclust.mspx
7. What's new in Clustering Technologies (2003) http://www.microsoft.com/windowsserver2003/evaluation/overview/technologies/clustering.mspx
8. Interpreting the Cluster Log http://www.microsoft.com/resources/documentation/Windows/2000/server/reskit/en-us/Default.asp?url=/resources/documentation/windows/2000/server/reskit/en-us/distrib/dsdg_icl_nnti.asp
9. Server Clusters: Frequently Asked Questions for Windows 2000 and 2003 http://www.microsoft.com/technet/prodtechnol/windowsserver2003/technologies/clustering/sercsfaq.mspx
10. Windows Server 2003 Server Cluster Architecture http://www.microsoft.com/windowsserver2003/techinfo/overview/servercluster.mspx
1. Recommended private "Heartbeat" configuration on a cluster server http://support.microsoft.com/?id=258750
2. Network Configuration Best Practices for Windows 2000 and Windows Server 2003 http://www.microsoft.com/technet/prodtechnol/windowsserver2003/technologies/clustering/clstntbp.mspx
3. Knowledge Base article 175767, "Expected Behavior of Multiple Adapters on Same Network." http://support.microsoft.com/?id=175767
4. Windows Hardware Developer Central Paper: Windows Network Task Offload: http://www.microsoft.com/whdc/device/network/taskoffload.mspx
1. Guide to Creating and Configuring a Server Cluster Under Windows Server 2003 http://www.microsoft.com/downloads/details.aspx?FamilyID=96f76ed7-9634-4300-9159-89638f4b4ef7&DisplayLang=en
2. Using Microsoft Virtual Server 2005 to Create and Configure a Two-Node Microsoft Windows Server 2003 Cluster http://www.microsoft.com/technet/prodtechnol/virtualserver/deploy/cvs2005.mspx
3. Installation order for SQL Server 2000 Enterprise edition on Microsoft Cluster Server http://support.microsoft.com/kb/q243218/
4. Server Clusters: Rolling Upgrades. Upgrading to Windows Server 2003 http://www.microsoft.com/technet/prodtechnol/windowsserver2003/technologies/clustering/rllupnet.mspx
5. Installing and Upgrading on Cluster Nodes http://www.microsoft.com/technet/prodtechnol/windowsserver2003/library/ServerHelp/ca612aad-93fa-49df-a33f-d0a94ac9886b.mspx
6. Geographically Dispersed Clusters in Windows Server 2003 http://www.microsoft.com/windowsserver2003/techinfo/overview/clustergeo.mspx
7. How to determine the appropriate page file size for 64-bit versions of Windows Server 2003 or Windows XP http://support.microsoft.com/kb/889654
8. Server Clusters: Security Best Practices: http://technet2.microsoft.com/windowsserver/en/library/a2a735cd-e10b-45c8-b288-186af4e1095e1033.mspx?pf=true
9. Recommended hotfixes for Windows Server 2003-based server clusters http://support.microsoft.com/kb/895092/en-us
10. Recommended hotfixes for Windows 2000 Service Pack 4-based server clusters http://support.microsoft.com/kb/895090/en-us
1. iSCSI Cluster Support: Frequently Asked Questions http://www.microsoft.com/windowsserver2003/technologies/storage/iscsi/iscsicluster.mspx
2. Microsoft Windows Clustering: Storage Area Networks http://www.microsoft.com/windowsserver2003/techinfo/overview/san.mspx
3. Server Clusters: Storage Best Practices for Windows 2000 and 2003 http://www.microsoft.com/technet/prodtechnol/windowsserver2003/technologies/clustering/storbp.mspx
4. Server Clusters : Storage Area Networks - For Windows 2000 and 2003 http://www.microsoft.com/technet/prodtechnol/windowsserver2003/technologies/clustering/starenet.mspx
5. SQL Server 2000 I/O Basics http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/sqlIObasics.mspx
6. Knowledge Base article 317162 "Supported Fibre Channel Configurations" http://support.microsoft.com/?id=317162
1. Quorums in Microsoft Windows Server 2003 Clusters http://www.microsoft.com/windowsserver2003/techinfo/overview/clusterquorums.mspx
2. Server Clusters: Majority Node Set Quorum http://www.microsoft.com/technet/prodtechnol/windowsserver2003/technologies/clustering/majnsfl.mspx
3. Knowledge Base article 280353, "How to Change the Quorum Disk Designation" http://support.microsoft.com/?id=280353
4. Knowledge Base article 280345 "Quorum Drive Configuration Information" http://support.microsoft.com/?id=280345
5. Server Clusters: Quorum Options - Windows Server 2003 http://www.microsoft.com/downloads/details.aspx?FamilyID=4dfb478c-83b8-48a4-8e7c-847e49271e8e&displaylang=en
1. Knowledge Base article 301600, "How to configure MSDTC on a Windows Server 2003 cluster" http://support.microsoft.com/?id=301600
2. Knowledge Base article 301600, "How to enable network DTC access in Windows Server 2003" http://support.microsoft.com/kb/817064
Tools1. Clusrest.exe: Cluster Quorum Restore Utility http://www.microsoft.com/windows2000/techinfo/reskit/tools/existing/clusrest-o.asp
2. Cluster Diagnostics and Verification Tool (ClusDiag.exe) http://www.microsoft.com/downloads/details.aspx?FamilyID=b898f587-88c3-4602-84de-b9bc63f02825&DisplayLang=en
3. Cluster Server Recovery Utility (ClusterRecovery.exe) http://www.microsoft.com/downloads/details.aspx?displaylang=en&familyid=2BE7EBF0-A408-4232-9353-64AAFD65306D
5. ClusDiag.exe: Cluster Diagnostics and Verification Tool http://www.microsoft.com/downloads/details.aspx?FamilyID=b898f587-88c3-4602-84de-b9bc63f02825&DisplayLang=en
6. SqlIoSim http://support.microsoft.com/kb/231619
1. SQL Server 2000 Failover Clustering http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/failclus.mspx
2. Microsoft SQL Server 2000 High Availability Series http://www.microsoft.com/technet/prodtechnol/sql/2000/deploy/sqlhalp.mspx
3. SQL Server 2005 Failover Clustering White Paper http://www.microsoft.com/downloads/details.aspx?FamilyID=818234DC-A17B-4F09-B282-C6830FEAD499&displaylang=en
4. The Microsoft SQL Server support policy for Microsoft Clustering http://support.microsoft.com/kb/327518/en-us
5. How to cluster Microsoft SQL Server 2005 by using Microsoft Virtual Server http://support.microsoft.com/default.aspx/kb/891798
6. Using Microsoft Virtual Server 2005 to Create and Configure a Two-Node Microsoft Windows Server 2003 Cluster https://www.microsoft.com/technet/prodtechnol/virtualserver/deploy/cvs2005.mspx
1. KB 814607, "Microsoft support for server clusters with third-party system components" http://support.microsoft.com/default.aspx/kb/814607/en-us
2. Cluster log definition, sizing, etc. http://support.microsoft.com/kb/168801/EN-US/
3. Failover Cluster Troubleshooting http://msdn2.microsoft.com/en-us/library/ms189117.aspx
4. Knowledge Base article 258469: "Cluster Service May Not Start After Restricting IP Ports for RPC" http://support.microsoft.com/?id=258469
5. Knowledge Base article 250355, "Antivirus Software May Cause Problems with Cluster Services" http://support.microsoft.com/?id=250355
6. Knowledge Base article 273673, "INF: SQL Virtual Server Client Connections Must Be Controlled by Clients" http://support.microsoft.com/?id=273673
7. Knowledge Base article 298723, "BUG: SQL Mail Not Fully Supported in Conjunction with Cluster Virtual SQL Servers" http://support.microsoft.com/?id=298723
8. Knowledge Base article 263556, "INF: How to Configure SQL Mail" http://support.microsoft.com/?id=263556
9. Knowledge Base article 289828, "PRB: SQL Server Setup in a Cluster Environment Encounters an Access Violation If You Have a Long Computer Name" http://support.microsoft.com/?id=289828
10. Knowledge Base article 197047 "Failover/Failback Policies on Microsoft Cluster Server" http://support.microsoft.com/?id=197047
11. Knowledge Base article 325343 "How To Perform an Emergency Shutdown in Windows Server 2003" http://support.microsoft.com/?id=325343
12. Knowledge Base article 286342 "Network failure detection and recovery in Windows Server 2003 Clusters" http://support.microsoft.com/?id=286342
13. Knowledge Base article 306677, "IPSec Is Not Designed for Failover" http://support.microsoft.com/?id=306677
14. Knowledge Base article 235529, "Kerberos Support on Windows 2000 Based Server Clusters" http://support.microsoft.com/?id=235529
15. Knowledge Base article 283794, "Problems Using Certificate with Virtual Name in Clustered SQL Servers" http://support.microsoft.com/?id=283794
16. Knowledge Base article 276553, "HOW TO: Enable SSL Encryption for SQL Server 2000 with Certificate Server" http://support.microsoft.com/?id=276553
17. Knowledge Base article 316898, "HOW TO: Enable SSL Encryption for SQL Server 2000 with MMC" http://support.microsoft.com/?id=316898
18. Knowledge Base article 301520, "SQL Server 2000 cluster does not install on Windows Server 2003-based computers where Terminal Services is installed" http://support.microsoft.com/?id=301520
19. Knowledge Base article 815431, "PRB: Installation of a named instance of SQL Server 2000 virtual server on a Windows 2003-based cluster fails" http://support.microsoft.com/?id=815431
20. Knowledge Base article 321063, "HOW TO: Troubleshoot the "Setup Failed to Perform Required Operations on the Cluster Nodes" Error" http://support.microsoft.com/kb/321063
21. Knowledge Base article 327518, "The Microsoft SQL Server support policy for Microsoft Clustering" http://support.microsoft.com/kb/327518/en-us
22. Many domain trusts in PDC, http://support.microsoft.com/kb/910070/en-us
Before beginning, ensure all appropriate documentation and checklists have been gathered, filled out, reviewed, and are available. This includes the following:
This is the basic installation order when installing a clustered service on any version of Windows Server 2003:
1. Gather Pre-Work documentation from above
2. Install Windows Server 2003 Enterprise Edition
3. Install any necessary Windows Server 2003 service packs or hot fixes.
a. Recommended hotfixes for Windows Server 2003-based server clusters http://support.microsoft.com/kb/895092/en-us
b. Recommended hotfixes for Windows 2000 Service Pack 4-based server clusters http://support.microsoft.com/kb/895090/en-us
4. Complete any server cluster pre-installation tasks required.
5. Finalize "checklist_ServerCluster_preInstall.doc" checklist
6. Create the server cluster.
7. Create the clustered MS DTC.
8. Complete any server cluster post-installation tasks required.
9. Finalize "checklist_ServerCluster_postInstall.doc" checklist
10. Finalize "checklist_SqlServer_preInstal.doc" checklist
11. Install SQL Server 2000. SQL Server 2000 SP4 is recommended.
a. Or Sql 2005.
12. Install appropriate Sql Server service pack
a. NOTE: SQL Server 2000 Service Pack 2 and earlier versions are not supported under Windows Server 2003.
13. Install any necessary SQL Server hotfixes.
14. Perform Post-SQL Install Tasks
15. Verify the SQL services
16. Finalize "checklist_SqlServer_postInstall.doc" checklist
Ensure to update the base OS with all latest SP's and appropriate hotfixes here. Run windows update to capture critical updates, and ensure to review the following KB's:
- Recommended hotfixes for Windows Server 2003-based server clusters http://support.microsoft.com/kb/895092/en-us
- Recommended hotfixes for Windows 2000 Service Pack 4-based server clusters http://support.microsoft.com/kb/895090/en-usCan also run the MBSA on the system prior to installing the cluster or continuing - here is the website: http://www.microsoft.com/technet/security/tools/mbsahome.mspx. This will pick up any necessary security vulnerabilities.
See the ‘checklist_ServerCluster_preInstall.doc' document for additional info.
Run ClusDiag.exe to verify node configuration.
Run syscompare.exe to verify node synchronization (if available).
Ensure ALL 3rd party and proprietary software is installed, configured, and signed off on by all other appropriate teams prior to cluster configuration (with exceptions for any software that requires cluster interaction and/or inclusion). This includes:
- Anti-virus software
- Backup software and solutions
- Custom email, communications, etc. software
- Security, audit, encryption, etc. software
- Monitoring systems and solutions, including agents
- HBA/Disk management software, drivers, muti-pathing drivers, etc.
- Network card configuration software (i.e. teaming software/drivers)
Network Card Configuration
Public Network1. From your desktop, right-click My Network Places, and select Properties. Under Windows Server 2003, you might need to enable the Classic Start menu view from the Properties menu on the taskbar to see this.2. In the Network And Dial-up Connections (Windows 2000 Server) or Network Connections (Windows Server 2003) window, select the network card. Rename this to something recognizable and usable, such as Public Network, by selecting it, right-clicking, and selecting Rename. This is the same value on all nodes3. Select the Public network card, right-click, and select Properties.4. Select Internet Protocol (TCP/IP) and click Properties. Set the static IP address of the card to a valid IP address on the externally facing network. This address is different for each node of the server cluster. These addresses will all be on the same subnet, but a different subnet from the private network. Click OK.5. Make sure the Subnet mask is correct.6. Enter your default gateway.7. Enter your primary and secondary DNS servers.8. Click OK to return to the Public Network Properties dialog box. Click Configure.9. In the Properties dialog box for the network card, select the Advanced tab.10. For the External PHY property, set the value for the correct network speed. Set this to be the same on each node. Click OK.11. Click OK to close the Public Network properties dialog box.
1. From your desktop, right-click My Network Places, and select Properties. Under Windows Server 2003, you might need to enable the Classic Start menu view from the Properties menu of the taskbar to see this.2. In the Network And Dial-up Connections (Windows 2000 Server) or Network Connections (Windows Server 2003) window, select the network card. This network card is located only on the private network on the approved subnets. Rename this to something recognizable and usable, such as Private Network, by selecting it, right-clicking, and selecting Rename. This is the same value on all nodes.3. Select the Private network card, right-click, and select Properties.4. Make sure that Client For Microsoft Networks, Network Load Balancing, File And Printer Sharing For Microsoft Networks, and any other options are not selected5. Select Internet Protocol (TCP/IP) and click Properties. Set the static IP address of the card to a valid IP address on the externally facing network. This address is different for each node of the server cluster, and must be in the proper class. These addresses will all be on the same subnet, but a different subnet from the public network. Click OK.6. Make sure the subnet mask is correct.7. Do not enter a default gateway.8. Do not enter any DNS servers.9. Click Advanced.10. Select the WINS tab of the Advanced TCP/IP Settings dialog box, shown in Figure 5-14, and select Disable NetBIOS Over TCP/IP if you are not on an MNS cluster. Click OK.11. Click OK to return to the Private Network Properties dialog box. Click Configure.12. Click Advanced. In the Properties dialog box for the network card, select the Advanced tab.13. For the External PHY property, as shown in Figure 5-12, set the value for the correct network speed. Set this to be the same on each node. Click OK.14. Click OK to close the Private Network Properties dialog box.15. If you are on a Windows 2000 server, add the following registry key and its associated values only if you are using a crossover cable:HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Tcpip\ParametersValue Name: DisableDHCPMediaSenseData Type: REG_DWORDData: 1 WARNING: Do not perform this step if you are using Windows Server 2003.
Network Priority/Binding Order (server level)
Configure the networks so that they have the right priority and one will not impede the other. At the server level, the public networks should have the priority. You can configure the network by following these steps:1. On your desktop, right-click My Network Places and select Properties. If you are using Windows Server 2003, you might need to enable the Classic Start menu view from the Properties menu on the taskbar to see this.
2. From the Advanced menu, select Advanced to open the Advanced Settings dialog box3. All public, or externally faced, networks should have priority over the private ones. If this is not the case, set the proper order and click OK. If you have multiple networks, set them in the proper order. This order is the same on all nodes.4. Close the Network Connections (Windows Server 2003) or Network And Dial-Up Connections (Windows Server 2000) window.5. Repeat this procedure for each node of the cluster.
Network Data Throughput
Configure Data Throughput to ‘Maximize Data Throughput for Network Applications' for each appropriate Network Connection (note that you will only configure this for any non-private networks in a cluster configuration).
1. Network Connections, right click on appropriate connection, properties
2. Select ‘File and Print sharing for MS Networks', properties
3. Server Optimizations tab, select ‘Maximize Data Throughput for Network Apps', OK
4. Note that you'll need to reboot for effect
Network Task Offloads
If the system load isn't going to be network IO bound, consider disabling any offloading capabilities on the NICs. If offloading is required, or will produce potential performance boosts that are desired, consider disabling only "large send offload" (TCP Segmentation Offload) properties on t he NICs.
See the following KB for possible issues and fixes that should be implemented where appropriate: http://support.microsoft.com/kb/KB909665/en-us
For an overview on NDIS offload capabilities available with NDIS 5 (and later) miniport drivers, go here: http://www.microsoft.com/whdc/device/network/taskoffload.mspx
OS Performance Options
Configure the OS performance options appropriately for the server.
1. Right-click on My Computer and select Properties
2. Go to the Advanced Tab, then click the Settings button under the Performance group
3. On the Visual Effects tab, choose the option ‘Adjust for Best Performance'
4. On the Advanced tab, set the Processor Scheduling option to Background Services and the Memory Usage option to Programs
8.3 Naming Structures Enabled
Ensure the 8.3 naming structure is enabled. Verify by opening a command prompt and typing "DIR /X" - this should display both long and 8.3 formatted names where appropriate. If no 8.3 names are returned, edit the following registry key:
The NtfsDisable8dot3NameCreation value must be 0; if the value needs to be changed, a reboot will be needed to make the change take effect. Any directories or components installed prior to making this change will need to be removed and/or uninstalled and recreated/reinstalled before continuing.
UPPERCASE Node/Machine Names
Be sure to configure all nodes/machines in the to be cluster with ALL UPPERCASE machine names. See the following KB:
Page File Configuration
Configure the page file on the system/boot partition appropriately (2048MB min, 2048 max) - recommend a single page file per physical volume for maximum performance, although this is not generally a consideration for dedicated SQL Servers, as the database engine will not page at all most of the time (especially with AWE enabled, as all SQL Server managed memory resides in the non-paged pool at the OS level). There is practically no need for pagefile size to be greater than 2GB (to allow for a memory dump in case of a bug check). Start with a relatively small page file size, but allow to grow if necessary to minimize startup time.
Turn off and disable the automatic update service. Generally speaking, any updates applied to a SQL Server box will be handled during maintenance windows and specific schedules.
Creating a Boot Delay
In a situation where all the cluster nodes boot up and attempt to attach to the quorum resource at the same time, the Cluster service may fail to start. This may occur for example when power is restored to all nodes at the exact same time after a power failure. To avoid such a situation, increase or decrease the ‘Time to Display list of operating systems' setting on each node to a different value, with the node to be the primary owner of the cluster having the lowest setting. To find this setting, right-click My Computer, and then click Properties. Click the Advanced tab, and then click Settings under Startup And Recovery.
Turn off the screen saver on each node in the cluster - configure the screen to turn off instead of using a screen saver (you'd be surprised how many CPU cycles a screen saver can consume on a busy server)
Update the allowed size of the application, system, and security logs to an appropriate size based on the requirements of you environment. Additionally configure the appropriate overwrite settings.
Temp File Locations
Change the Environment Variables for TEMP and TEP from local setting paths to other location (c:\temp or something) for the SQL SERVICE ACCOUNT. This speeds access to temporary data files and locations, rather than having to traverse a buried temp file location (NOTE: This does not have anything at all to do with the temp database or files associated with that database)
Stop and disable any services that are not needed on the server in question. Generally, this includes services such as IIS, Print Spoolers, Auto Updates, etc., but could include any of the following (BE SURE TO VALIDATE WHICH SERVICES ARE REQUIRED IN YOUR ENVIRONMENT PRIOR TO STOPPING/DISABLING):
1. Simple Network Management Protocol (SNMP)
2. World Wide Web Publishing service (IIS)
3. Vendor-specific programs
5. Application Management
7. Distributed Link Tracking Server
8. Fax Service
9. File Replication
10. FTP Service
11. Indexing Service
12. Internet Connection Sharing
13. Intersite Messaging
14. Kerberos Key Distribution Center
15. License Logging Service
16. Logical Disk Manager Administrative Service
18. NetMeeting Remote Desktop Sharing
19. Network DDE
20. Network DDE DSDM
21. Print Spooler Service (if you won't be printing from this server)
22. QoS RSVP
23. Remote Access Auto Connection Manager
24. Remote Procedure Call (RPC) Locator
25. Routing and Remote Access
27. Smart Card
28. Smart Card Helper
29. SMTP Service
31. Utility Manager
32. Windows Installer
33. DHCP client service
34. Task Scheduler
If there is no reason to put antivirus software on the cluster nodes dedicated to a secure SQL Server, do not install the software. If corporate policy dictates that antivirus software must be configured, set the filtering of the antivirus software to exclude the scanning of all SQL Server drives that contain data and log files (or simply exclude extensions MDF, LDF, and NDF. Be sure to also exclude SS extensions if a SQL 2005 box (snapshot files), and possibly .trc files if you plan to trace frequently). Also exclude the \MSCS directory on the quorum if using a disk-based quorum, ideally exclude the quorum disk altogether. If possible, also consider excluding any logging, profiler capture points, perfmon capture points, SQL Server binary file locations, and any other write-intensive paths (i.e. SSIS, DTS, Sql Trace locations, BCP output directories for batch file movements, etc.).
Ensure Terminal Services Server is NOT installed on the nodes in the cluster. SQL Server 2000 cannot run in a cluster when TS Server is also installed. See this KB for more details: http://support.microsoft.com/?id=301520
Verify Network Connectivity
To verify that the private and public networks are communicating properly prior to installing your server cluster, perform the following steps. It is imperative to know the IP address for each network adapter in the cluster, as well as for the IP cluster itself.
Verifying Connectivity and Name Resolution from a Server Node
This method shows how to check both IP connectivity and name resolution at the server level.
1. Open a command window
2. Ping the IP address of all nodes in the cluster (including the node you are on). This must be done for both the public and private networks.
3. Ping the network name of all nodes in the cluster (including the node you're on)
4. Repeat steps 1-3 for every node
Verifying Connectivity and Name Resolution from a Client
This method ensures clients and other servers can access the cluster.
1. Open a command prompt from a client computer
2. Ping each of the public IP addresses of each node in the cluster
3. Ping each of the network names of each node in the cluster
Shared Disk Configuration
Prior to installing your server cluster, you should also configure all disks that will be used in the cluster up front to minimize downtime later should you need to add a disk.
When configuring your server clusters, always configure the first node and the shared disks before you power on the other nodes and allow the operating system to start. You do not want more than one node to access the shared disk array prior to the first node being configured.If your disk controller is not external to your clustered computer, you must turn off write caching within the controller to prevent data loss during a failover. Write-back caching cannot be used on host controllers in a cluster without hindering performance. However, if you use external controllers, you continue to provide performance benefits. External disk arrays are not affected by failover clustering and can synchronize the cache correctly, even across a SCSI bus.
I'm not going to go into detail on creating/formatting/etc. basic disks, but here are a few points to ensure:
1. Partitions do nothing with a SQL cluster - disk resources are clustered at the physical level, so if a single disk is partitioned into multiple partitions, each of those partitions will be presented as a single disk to the cluster.
2. All disks should be Basic disks (dynamic disks are not yet supported in MSCS), NTFS formatted, and have an Allocation Unit Size of:
NOTE: This is recommendation from EMC on their hardware...typical general recommendation would probably be 64k for data/log/tempdb, and possibly 4k up to 1mb for backup disks...consult your storage hardware vendor for appropriate recommended configuration in a Sql Server environment.
a. 64k for disks that will contain SQL data files
b. 8k for disks that will contain SQL log files
c. 16k for disks that will contain SQL tempdb files
d. 4k for disks that will have SQL Backups pushed to them
3. All shared disks, including the quorum disk, must be physically attached to a shared bus. The only exception to this rule is in the case of an MNS cluster.
4. Shared disks must be on a different controller then the one used by the system drive.
5. A dedicated disk with a minimum size of 50 megabytes (MB) to use as the quorum device. A partition of at least 500 MB is recommended for optimal NTFS file system performance. A quorum disk failure could cause the entire cluster to fail; therefore, it is strongly recommended that you use a volume on a hardware RAID array; this is also why you SHOULD NOT USE the quorum disk for anything other than cluster management.
6. Verify that disks attached to the shared bus can be seen from all nodes.
7. Log disks should be RAID level 1 or 01. Data file disks should be RAID 1, 01, or 5 (though not ideal). Quorum disk should be RAID 1 or 01. MSDTC should be RAID 1, 01, or 5 (though not ideal).
8. Log and Data disks should be separate physical disks. Quorum must be a separate physical disk from the Log/Data disks, but the MSDTC can share the Quorum disk, though not ideal.
Before continuing, at this point the "checklist_ServerCluster_preInstall.doc" checklist should be FINISHED!!!!
Under Windows Server 2003, you have three options for configuring your server cluster: through a wizard, via a command line, or through an unattended installation. I only cover the wizard steps here for now.
Wizard Based Install
1. Ensure all nodes but Node 1 are powered down and completely off.
When you use MSCS, you must make certain that one node is in control of the shared SCSI bus prior to the other nodes coming online. Failing to make certain that one node is in control of the shared SCSI bus can cause an application failover to go into an online pending state. Consequently, the cluster either fails on the other node or fails totally. However, if your hardware manufacturer has a proprietary installation process, follow the hardware manufacturer instructions.
2. Start Cluster Administrator. If not yet installed, may need to add ‘Cluster service' from Add/Remove Windows Components.
3. Create new cluster
4. Run through the wizard. First node is the node you are installing from (currently on). Configuration check should pass on all counts, no exceptions.
5. Once cluster is configured, review event, application, security, and cluster logs for errors. Cluster logs are located:
a. %systemroot%\log (this is the same log you probably viewed from the wizard)
1. Boot up the node to be joined, and no others. Nodes already joined/configured should remain powered on.
2. Run Cluster Administrator (don't have to be on the actual node joining)
3. File->New->Node. Select Add Nodes to Cluster, select appropriate cluster to join, OK
4. Run wizard. Configuration check should pass on all counts, no exceptions.
5. Review logs again for errors.
- MSDTC IS required for Sql 2005 cluster configurations - ensure it is configured appropriately prior to Sql installation
Microsoft Distributed Transaction Coordinator (MS DTC) is used by SQL Server and other applications. For a SQL virtual server to use MSDTC, it must also be clustered. You cannot use a remote MS DTC; you must configure an MS DTC resource for your server cluster.
MS DTC is shared for all resources in the cluster. If you are implementing multiple SQL Server instances, they will all use the same MS DTC resource.
MS DTC does require disk space; in general, the size used should be roughly 500 MB. However, some applications might have specific requirements. For example, Microsoft Operations Manager recommends a minimum MS DTC size of 512 MB.
Configuring MSDTC for network access1. Click Start, point to Control Panel, and then click Add or Remove Programs2. Click Add/Remove Windows Components
3. Select Application Server, and then click Details.
4. Select Enable network DTC access, and then click OK.5. Click Next, Click Finish.
6. Stop and then restart the DTC service
7. Stop and then restart Microsoft SQL Server and the other resource manager services that participate in the distributed transaction, such as Microsoft Message Queuing.
Windows 2003 Server MSDTC Configuration
With Windows 2003, the MSDTC setup is much like the old comclust setup with Windows 2000, only you can no longer run comclust.exe. You must now manually create an IP address, network name, and Distributed Transaction Coordinator resource.
When configuring MS DTC, do not use the group containing a disk with the quorum or any of the ones planned for use with SQL Server.1. Configure MSDTC for network access on each node, as detailed in this section above. (see this KB for details if needed: http://support.microsoft.com/kb/817064)2. Start Cluster Administrator. Select the group that has the dedicated disk you will use for MS DTC and rename the group appropriately. NOTE: If configuring the MSDTC within the quorum group, or any existing group with a physical disk, which is by the way not recommended, skip to step #15.3. From the File menu, select New, and then Resource (or right-click the group and select the same options). In the New Resource dialog box, in the Name text box, type an appropriate name, such as MSDTC IP Address. In the Resource Type drop-down list, select IP Address. In the Group drop-down list, make sure the right group is selected. Click Next.4. In the Possible Owner dialog box, all nodes of the cluster should appear as possible owners. If they do not, add the nodes, and click Next.5. In the Dependencies dialog box, select the disk resource in the group you selected from the Available Resources, and then click Add. The disk resource appears in the Resource Dependencies list. Click Next.6. In the TCP/IP Address Parameters dialog box, enter the TCP/IP information. In the Address text box, enter the static IP address that will be used with MS DTC. In the Subnet Mask text box, enter the IP subnet if it is not automatically chosen for you. In the Network To Use list box, select the public cluster network you want to use. Click Finish.7. You will see a message confirming that the IP address is successfully configured.8. In the Cluster Administrator window, the newly created resource appears in the right pane.9. From the File menu, select New, and then Resource (or right-click the group and select the same options). In the New Resource dialog box, in the Name text box, type an appropriate name such as MSDTC Network Name. In the Resource Type drop-down list, select Network Name. In the Group drop-down list, make sure the proper group is selected. Click Next.
10. In the Possible Owner dialog box, all nodes of the cluster should appear as possible owners. If they do not, add the nodes, and click Next.11. In the Dependencies dialog box, the MS DTC IP address resource you configured previously appears in the Available Resources list. Select the resource, and then click Add. The resource appears in the Resource Dependencies list. Click Next.12. In the Network Name Parameters dialog box, type MSDTC, and then click Finish.13. You will see a message confirming that the Network Name resource is successfully configured.14. In the Cluster Administrator window, the newly created resource appears in the right pane.15. From the File menu, select New, and then Resource (or right-click the group and select the same options). In the New Resource dialog box, in the Name text box, type an appropriate name such as MSDTC. In the Resource Type drop-down list, select Distributed Transaction Coordinator. In the Group drop-down list, make sure the proper group is selected. Click Next.16. In the Possible Owner dialog box, all nodes of the cluster should appear as possible owners. If they do not, add the nodes, and click Next.17. In the Dependencies dialog box, select the MS DTC IP address, the MSDTC network name, and the physical disk resources you configured previously from the Available Resources list, then click Add. The resources appear in the Resource Dependencies list. Click Next.18. In the Network Name Parameters dialog box, type an appropriate name such as MS DTC, and then click Finish.19. You will see a message confirming that the Distributed Transaction Coordinator resource is successfully configured.20. In the Cluster Administrator window, the newly created resource appears in the right pane.21. To start the new resources, which are all offline, right-click each one, and then click Bring Online.
After configuring the cluster service, but prior to configuring the SQL service, there are a few tasks that need to be addressed.
Configuring Network Priorities
Besides setting the network priorities at the server level, you need to do it in Cluster Administrator for the cluster. In the cluster, the private network is the ruler. Follow these steps to configure network priorities:1. Start Cluster Administrator. Select the name of the cluster, right-click, and select Properties, or you can select Properties from the File menu.2. Select the Network Priority tab. Make sure that the private heartbeat network has priority over any public network. If you have multiple private and public networks, set the appropriate order.3. Click OK.
Time to configure the network roles to define their functionality within the cluster - here is a list of the network configuration options in Cluster Administrator:
- Enable for cluster use: If this check box is selected, the cluster service uses this network. This check box is selected by default for all networks.
- Client access only (public network): Select this option if you want the cluster service to use this network adapter only for external communication with other clients. No node-to-node communication will take place on this network adapter.
- Internal cluster communications only (private network): Select this option if you want the cluster service to use this network only for node-to-node communication.
- All communications (mixed network): Select this option if you want the cluster service to use the network adapter for node-to-node communication and for communication with external clients. This option is selected by default for all networks.
I'm going to assume that only two networks are in use - which will explain how to configure these networks as one mixed network and one private network. This is the most common configuration. If resources are available, two dedicated redundant networks for internal-only cluster communication are recommended.
1. Start Cluster Administrator
2. In the left pane, click Cluster Configuration, click Networks, right-click Private, and then click Properties.
3. Click Internal cluster communications only (private network), click OK.
4. Right-click Public, and then click Properties
5. Click to select the Enable this network for cluster use check box
6. Click the All communications (mixed network) option, and then click OK.
After configuring the role of how the cluster service will use the network adapters, next prioritize the order in which they will be used for intra-cluster communication. This is applicable only if two or more networks were configured for node-to-node communication. The cluster service always attempts to use the first network adapter listed for remote procedure call (RPC) communication between the nodes. Cluster service uses the next network adapter in the list only if it cannot communicate by using the first network adapter, and so on down the list.
1. Start Cluster Administrator
2. In the left pane, right-click the cluster name (in the upper left corner), and then click Properties.
3. Click the Network Priority tab
4. Verify that the Private network is listed at the top. Move all private networks to the top of the list, and publics to the bottom.
5. Click OK
Quorum Disk Configuration
The Cluster Configuration Wizard automatically selects the drive that is to be used as the quorum device. It will use the smallest partition that is larger then 50 MB. You may want to change the automatically selected disk to a dedicated disk that you have designated for use as the quorum.
1. Start Cluster Administrator
2. Right-click the cluster name in the upper-left corner, and then click Properties.
3. Click the Quorum tab.
4. In the Quorum resource list box, select a different disk resource.
5. If the disk has more than one partition, click the partition where you want the cluster-specific data to be kept, and then click OK.
See this KB article for more details: http://support.microsoft.com/?id=280353
If you are going to be using Kerberos with your cluster, ensure that the Cluster Service account has the appropriate permissions. Then perform the following steps:
1. Start Cluster Administrator.2. Select the Groups tab, and select the Cluster Group resource group. In the right pane, select Cluster Name, right-click, and select Take Offline.3. Once Cluster Name is offline, right-click it and select Properties.4. On the Parameters tab, select the Enable Kerberos Authentication check box, and click OK.5. Bring online.
Changing Size of Cluster Log
The cluster log size defaults to 8 MB, which is not very big. If the cluster log becomes full, the Cluster Service overwrites the first half of the log with the second half. In that case, you can only guarantee that half of your cluster log is valid. To prevent this situation, you must create a system environment variable named ClusterlogSize, following these steps. Make this value great enough to ensure validity of the cluster log.
See KB article 168801 for more info. http://support.microsoft.com/kb/168801/EN-US/.
Verifying Server Cluster Install
Verifying Connectivity and Name Resolution
To verify that the private and public networks are communicating properly, perform the following steps. It is imperative to know the IP address for each network adapter in the cluster, as well as for the IP cluster itself.1. From any node, open a command window2. Ping the server cluster IP Address3. Ping the server cluster name4. Repeat 1-3 for each node5. Repeat 1-3 for client machines
You need to ensure that all nodes can own the cluster resource groups that were created.
1. Verify that all nodes configured for the cluster appear and are online
2. For each cluster group, make sure it can be failed over and back from all nodes in the server by using the Move Group command. Ensure it can fail over to and back from EVERY server in the configuration.
3. For each cluster group, test the Initiate Failure command for given resources...depending on the specified number of restart attempts, the resource should go into a failed state and then come back online on the same node. After all restart attempts have been exhausted, the group should fail over to the next node in line.
4. Stop the Cluster service on a node from the Services manager. The group owned by that node should failover to another node
5. Shutdown an active node - all groups owned by that node should failover to another node
6. Pull the public network cable from an active node. Resources owned by that node should failover to another node
7. Pull the private network cable from one or more nodes. Cluster communication should move to the public network and NO FAILOVERS should occur.
Ensure all nodes can run Cluster Administrator, connect to the cluster, and see online resources for each node
Registry EntriesVerify that the cluster service installation process wrote the correct entries to the registry. You can find many of the registry settings under HKEY_LOCAL_MACHINE\Cluster.
Use the Event Viewer to check for ClusSvc entries in the system log. You should see entries confirming that the cluster service successfully formed or joined a cluster.
NOTE: If performing a named instance install on a Windows 2003 server without the SQL 2000 SP3 or higher direct installation, ensure to configure a named-pipe alias for the virtual server to be installed using the client config utility PRIOR to beginning setup. The pipe should be named something like the following (default) pipe:
See the following KB for more information: http://support.microsoft.com/?id=815431
Also note that while performing the cluster portion of SQL Server 2000 SP3 or SP3a, the setup will reset all cluster resources to the default properties for all resources in the group, whether they are related to SQL Server or not. Make sure to note all settings before you install SQL Server 2000 SP3 or SP3a.
1. Start installation, select components, then database server
2. Click next to bypass welcome
3. Select ‘Virtual Server' in the computer name dialog, type the network name of the virtual machine, click next
4. Enter user info, click next. Agree to the license info, click yes.
5. Enter the IP for the virtual server, select the public network to use from the dropdown, click Add. Repeat if using additional IP's and names. Click Next
6. Select the main SQL Data disk for the instance from the dialog. Additional disks can't be added here, must be added post-installation. Click next
7. Add/Remove the appropriate nodes to/from the virtual server definition. Click Next
8. In the remote info dialog, enter the account info used to configure and administer the server cluster (this is NOT the SQL service account). Click Next
9. Select the default checkbox if this is a default install, or enter the name of the instance in the instance name dialog box. Click Next.
10. In the Destination Folder area of the Setup Type dialog box, verify that the Program Files location is set to a valid local drive on each node (for example, C:\Program Files\Microsoft SQL Server) and that the Data Files location is set to the drive selected on the Cluster Disk Selection dialog box. If you want to configure a specific directory on the drive for the data, click the Browse button. Click Next
11. In the Services Accounts dialog box, select either Use The Same Account For Each Service or Customize The Settings For Each Service. In the Password box type password. Confirm that the Username, Password, and Domain are set to the proper values. If you selected Customize The Settings For Each Service, you will need to enter a Username, Password, and Domain for both the SQL Server and SQL Server Agent services. Click Next.
12. In the Authentication Mode dialog box, select either Windows Authentication Mode or Mixed Mode. If Mixed Mode is selected, enter a password for the sa account. Click Next.
13. Click Next to start copying files
14. In the Choose Licensing Mode dialog box, choose the right licensing scheme, enter the proper value, and click Continue.
15. The SQL Server virtual server will now install. You should see some messages like ‘Setup is performing...', ‘Setup is creating...', etc.
16. When all done, click Finish in the setup complete dialog.
17. You may get prompted to reboot. If so, be sure to reboot ALL nodes. I'd recommend rebooting all of them at this point if time permits anyhow.
The SQL 2005 installer is based on the Microsoft Installer, it does not use 3rd party components (i.e. Wise) to perform the installation.
SQL 2005 requires the use of domain level groups when installing on a cluster. Standalone installations will automatically create local groups for service accounts, however in cluster configurations, domain level groups are required, as local user/group accounts can't be configured for ACL's on shared storage.
For each clustered service in the instance of SQL Server that you are installing, enter the domain and group name in the format <DomainName>\<GroupName>, subject to the following guidelines:
1. The domain and group names must already exist.
2. The account under which SQL Server Setup is running must have privileges to add accounts to the domain groups, or alternatively the service accounts must already exist within the group(s) specified.
3. Each service should use a different domain group.
4. The SQL Server domain groups should not be shared with any other application.
Ensure the following services are configured correctly and running on all nodes:
- Remote Registry
- Cryptographic Services
- Task Scheduler
- MSDTC (including fully enabled and clustered)
- Sql Server VSS Writer (if prior instance is already installed)
Sql Server VSS Writer
If this is not the 1st Sql installation on any of the nodes in the cluster, and you are trying to install this instance to one of said nodes, ensure the Sql Server VSS Writer service is NOT disabled on any of the nodes. The installation uses this service to perform backups of system databases during installation and to remove them post-installation. If you've installed instances prior to this on any of the nodes and disabled the Sql Server VSS Writer service, set it to Manual for the duration of the installation and then disable it again if desired post installation.
IIS and Certificates
If using Reporting Services, you'll need to configure IIS and install a server certificate if you plan to require the use of SSL for reporting services interaction.
Remove prior SQL Native Client installations
SQL remote installations will fail if a prior SNAC pre-requisite information exists on the remote node...be sure to remove any of the pre-existing Native Client and Support files
Terminal Services Installation
If installing via Terminal Services Client/RDP, ensure only a single connection is open to the node that the installation is being executed on. All TS/RDP connections to any other node(s) in the cluster should be logged off (not just disconnected), as well as any additional connections to the installation node. If additional connections are open to any node, setup will fail.
Lock Pages in Memory
Give the SQL Service account/group the lock pages in memory local security permission on all nodes in the cluster if you want the instance to lock buffer pages to preclude the OS from paging it's working set from memory to disk.
Perform Volume Maintenance Tasks
Give the SQL Service account/group the Perform Volume Maintenance Tasks local security permission on all nodes in the cluster if you want the instance to be able to take advantage of instant file initialization (i.e. grow a data file to gigs of size in seconds as opposed to having to zero out entire gigs of space like in Sql 2000)
For the most accurate steps/screen shots for navigating through the installation wizard, see the appropriate appendix in the SQL Server 2005 Failover Clustering White Paper here: http://www.microsoft.com/downloads/details.aspx?FamilyID=818234DC-A17B-4F09-B282-C6830FEAD499&displaylang=en
1. Start installation, select Install -> Server components...
2. Agree to the EULA...
3. Click Install from the ‘Installing Prerequisites' screen to install required files (.NET 2.0, SNAC, Support files)
4. Click next when that is finished
5. Next you will go through the new SCC (system consistency checker)...note any warnings, errors, etc. If you are missing a prerequisite that is required for a component you are planning to install (i.e. IIS for reporting services), quit now and configure the prerequisites.
6. Provide your name, company, and code if required, click next
7. On the next screen you'll pick the components you want to install on the server (simple checklist) including if you plan to configure a failover cluster (for the DB engine and Analysis Services). Additionally, you can click the ‘Advanced' button to get a more familiar Windows Installer tree type view to select components and sub-components, as well as define where to install components to.
NOTE: Only the database engine and analysis service components are cluster aware. SSIS, Notification Services, Reporting Services (web components), and Workstation components are NOT cluster aware.
8. Choose a default instance or named instance on the next screen
9. Enter the name for the virtual SQL Server on the next screen (this is not the same as the cluster name)
10. Define the public IP information for the SQL Server on the next screen
11. Choose the appropriate cluster group that this particular SQL Server instance should reside within. Also choose the proper location for shared SQL Server data files on this screen at the bottom
12. Cluster node configuration - choose the appropriate nodes that this particular SQL Server instance should be installed to and available to run on
13. Enter the account information for a valid administrator on all nodes of the cluster (probably best to use the cluster admin service account here)
14. Next you'll choose the service accounts for use with all or each given service. I always recommend configuring a separate service account for each service that will be run. In addition, pick the service that will start automatically on this screen...note that in a cluster installation, these choices will be disabled as the cluster service is responsible for starting/stopping/operating the services.
15. The domain groups screen in where you'll enter the domain level groups created as a pre-req to the installation.
16. Choose your authentication model (Windows only vs. Mixed)...if you choose mixed, you MUST configure an SA password, and I'd recommend that it be very strong and of the format that nobody would ever want to use it even if they had to.
17. Choose your collation settings. I'd recommend using a Collation designator in lieu of a SQL collation.
18. Choose error and usage report settings
19. Verify settings and begin the installation
20. After the installation, screens will be displayed that will allow you to review logs and other information via hyperlinks if desired.
Client Tools Installation
Once the virtual service installation is complete, client tools and workstation components will only be available on the primary node in the cluster. If these tools need to reside on all nodes, manually install them on each node in the cluster.
After the virtual service installation, if Integration Services was chosen as a component to be installed, it too will only be installed on the primary node in the cluster, as it is not a cluster-aware service. If you want the service to be controlled by the cluster service, you'll have to manually configure it as such at this point. For up to date information on how this is performed, see the topic "How to: Configure Integration Service on a Cluster" in books online or online at msdn (http://msdn2.microsoft.com/en-us/library/ms345193.aspx). The current information is as follows:
SQL Server 2005 Integration Services does not provide special installation options for a failover cluster. However, after installing Integration Services separately on each node of a cluster, you can manually configure it to operate as a clustered service.
To prepare to install Integration Services on a cluster
1. Install and configure a cluster with one or more nodes.
2. Optionally, install clustered services such as the SQL Server 2005 SQL Server Database Engine.
3. Install Integration Services on each node of the cluster individually.
4. Decide whether you will configure Integration Services as a clustered service:
a. In the same group as the SQL Server 2005 SQL Server Database Engine.
b. In a different group than the SQL Server Database Engine.
5. Set the SSIS service to start manually on each node, not automatically (so the cluster service is responsible for starting/stopping)
6. Update the permission to the SSIS Lun appropriately
Configuring Integration Services in the Same Group as SQL Server
In this configuration:
- The loading and saving of packages to the MSDB database is faster.
- Sql Server database engine components and availability will be compromised if the SSIS engine fails and causes and entire group failure at the cluster level.
To configure Integration Services as a clustered service in the same group as SQL Server:
1. Open the Cluster Administrator
2. In the console tree, select the Groups folder
3. In the details pane, select the group to which you plan to add Integration Services
4. On the File menu, point to New and then click Resource
5. On the New Resource page of the Resource Wizard, type a Name and choose "Generic Service" as the Service Type. Do not change the value of Group. Click Next
6. On the Possible Owners page, add or remove the nodes of the cluster as the possible owners of the resource. Click Next
7. To add dependencies on the Dependencies page, select a resource under Available resources, and then click Add. In case of a failover, both SQL Server and the shared disk that stores Integration Services packages should come back online before Integration Services is brought online. After you have selected the dependencies, click Next.
8. On the Generic Service Parameters page, enter MsDtsServer as the name of the service. Click Next.
9. On the Registry Replication page, click Add to add the Registry key that identifies the configuration file for the Integration Services service. This file needs to be located on a shared disk that is in the same group for the Integration Services service, and fails over to the next node along with for the Integration Services service. In the Registry Key dialog box, type SOFTWARE\Microsoft\MSDTS\ServiceConfigFile. Click OK, then click Finish. The Integration Services service has now been added as a clustered service.
10. Now you must modify the location and the content of the service configuration file so that both the configuration file and the package store are available to all nodes in the case of a failover. Locate the configuration file at %ProgramFiles%\Microsoft SQL Server\90\DTS\Binn\MsDtsSrvr.ini.xml. Copy it to the shared disk.
11. Create a new folder named Packages on the shared disk. Grant List Folders and Write permissions on the new folder to the built-in Users group.
12. Open the configuration file from the shared disk in a text or XML editor. Change the value of the ServerName element to the name of the virtual SQL Server which is in the same group.
13. Change the value of the StorePath element to the fully-qualified path of the Packages folder created on the shared disk in a previous step.
14. Update the value of HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSDTS\ServiceConfigFile in the Registry to the fully-qualified path and filename of the service configuration file on the shared disk.
15. Now bring the clustered Integration Services service online. In the Cluster Administrator, select the Integration Services service, right-click, and select Bring Online from the popup menu. The Integration Services service is now online as a clustered service.
Configuring Integration Services in a Different Group than SQL Server
In this configuration:
- The Integration Services Service may be online while the SQL Server 2005 SQL Server Database Engine is offline and the packages stored in its msdb database are unavailable.
- The Integration Services Service can be moved more quickly to another node if necessary.
- The Integration Services Service does not compete with other SQL Server services for CPU resources (when other SQL Server services are not installed on the same computer).
To configure Integration Services as a clustered service in a different group than SQL Server:
1. In the Cluster Administrator, select a group other than the group to which SQL Server belongs.
2. Add and configure IP Address, Network Name, and Generic Services resources for the Integration Services service. For the Generic Service, follow steps 1 through 9 of the previous procedure.
3. Place the service configuration file and the Packages folder on the shared disk for this group. Follow steps 10 through 14 of the previous procedure to update the location and content of the configuration file.
4. Bring the group online.
If you have trouble brining SSIS online on the 2nd node in the cluster, be sure it is up to the latest SP level:
Service pack install should be initiated on the node in the cluster that owns the SQL Server resource. ALWAYS REVIEW THE README for the given service pack prior to installing.
Also, be SURE TO RUN THE UPDATE ON ALL NODES IN THE CLUSTER FOR NON-CLUSTER-AWARE SERVICES SUCH AS client tools, SSIS, etc.
These step by steps are for SQL 2000 SP3 / SP3a...may differ some from pack to pack.
1. Remove connected users and make backups if applicable
2. Run the setup.bat file (or appropriate install for the service pack)
3. Click next in the welcome dialog
4. On the Computer Name dialog, enter the name of the SQL Server virtual server to which you are applying the service pack. You can only apply the service pack to one instance at a time. Click Next.
5. On the Connect To Server dialog, select the proper authentication method. Click Next.
6. If you do not have a password set for the SA user, the SA Password Warning dialog will display. You will be prompted to change it to a non-blank password. While you can leave it blank, for security reasons, you should not select this option.
7. On the SQL Server 2000 Service Pack 3/3a Setup dialog, select the options you need to upgrade or enable, and click Continue.
8. On the Error Reporting dialog, you can elect to send fatal error reports automatically to Microsoft. If you want to enable this, check the Automatically Send Fatal Error Reports To Microsoft check box, and click OK...I wouldn't enable it.
9. Setup will now gather information. If you have a database or databases that are read-only (or in NORECOVERY or STANDBY), you will see a message box stating they can't be updated. Click OK.
10. On the Remote Installation dialog, enter the username, password, and domain for the server cluster service account (NOT the SQL service account). Click Next.
11. The SQL Server service pack will now install. You'll see some messages, then a final message saying backup your master/msdb db's (which you should do).
12. You may get prompted to reboot. If so, be sure to reboot ALL nodes. I'd recommend rebooting all of them at this point if time permits anyhow.
13. After the reboot, backup your system db's and verify all is working properly.
1. "The checkpoint process can delay SQL Server database activity and does not yield Scheduler correctly causing Error: 17883 to occur" http://support.microsoft.com/kb/815056
2. "Not All Memory is Available When AWE is Enabled on a Computer Running 32-bit Version of SQL Server 2000 SP4"
Miscellaneous Tasks1. Configure tempdb concurrency enhancements for the system in question http://support.microsoft.com/default.aspx?scid=kb;en-us;328551. This applies to SQL 2005 as well as SQL 2000.
2. Increase tempdb to appropriate size and ensure to configure the size to be set statically so when the server restarts the tempdb doesn't start back at 1MB in size. Be sure to perform for both tempdb data and log files.
3. Configure PAE and AWE and mem options (Max, Min, Work set) in the OS and for SQL. See this kb for details: http://support.microsoft.com/default.aspx?scid=kb;en-us;274750&Product=sql2k. See this blog post for overview of /3gb, PAE, and AWE: http://blogs.msdn.com/chadboyd/archive/2007/03/24/pae-and-3gb-and-awe-oh-my.aspx
4. Reset default data/log paths
5. Enable advanced configuration options
6. Configure server options appropriately (use "ServerOptions.sql" script if available). This includes configuring:
a. SQL 2000
i. awe enabled
ii. cost threshold for parallelism
iii. Cross DB Ownership Chaining (should be 0)
iv. max degree of parallelism (should be # of physical procs at most)
v. max server memory
vi. min server memory
vii. remote access (recommend 0 for security)
viii. user options (I'd recommend 5496 if possible, which includes the default option settings required for use of indexed views)
b. SQL 2005
i. All from SQL 2000 list
ii. Affinity masks (if necessary...usually not)
iii. blocked process threshold (if you want to capture a blocking trace continuously)
iv. clr enabled (should be 0 by default)
v. Database Mail XPs (should be 0 by default)
vi. default trace enabled (on by default)
vii. Ole Automation Procedures (off, 0, by default)
viii. remote admin connections (off, 0, by default)
ix. Replication XPs (off, 0, by default)
x. SMO and DMO XPs (off, 0, by default)
xi. SQL Mail XPs (off, 0, by default)
xii. Web Assistant Procedures (off, 0, by default)
xiii. xp_cmdshell (off, 0, by default)
7. Configure default database options appropriately for system db's and model (use "DatabaseOptions.sql" script if available):
a. SQL 2000
i. ANSI_NULLS - ON
ii. ANSI_PADDING - ON
iii. ANSI_WARNINGS - ON
iv. ARITHABORT - ON
v. NUMERIC_ROUNDABORT - OFF
vi. CONCAT_NULL_YIELDS_NULL - ON
vii. QUOTED_IDENTIFIER - ON
viii. AUTO_CREATE_STATISTICS - ON
ix. AUTO_UPDATE_STATISTICS - ON
b. SQL 2005
i. All the SQL 2000 options
ii. AUTO_UPDATE_STATISTICS_ASYNC - ON
iii. TORN_PAGE_DETECTION - OFF
iv. PAGE_VERIFY - CHECKSUM
8. Configure security on the server (use "lockdown.sql" script if available).
a. SQL 2000
i. Windows Only Authentication (i.e. no mixed-mode)
ii. Set a strong SA password
iii. Set Audit Level (recommend at least 2 - failures)
iv. Sets Agent, MSDTC, MSSearch services off for auto-start
v. Disable Ad-hoc access to the server
vi. Drop sample databases (AdventureWorks, Northwind, Pubs)
vii. Revoke guest user access in the MSDB
viii. Set the number of error logs to keep around during recycle events
ix. Delete setup files from the system
x. Disable Cross Database Ownership Chaining
xi. Drop the BUILTIN\Administrators group from server
b. SQL 2005
i. All SQL 2000 checks
ii. Disable Sql Server VSS Writer service (if not needed)
iii. Disable Sql Server Browser service (if not needed)
iv. Disable CLR
v. Disable xp_cmdshell
vi. Disable Ole Automation procs
vii. Disable SQL Mail
viii. Disable DB Mail XPs
ix. Disable replication XPs
x. Disable Web Assistant XPs
xi. Disable remote admin connections
9. Configure SQL Agent properties:
a. Configure auto-restart options (should be disabled if clustered)
b. Define an Idle CPU condition
c. Configure logging history sizes and limits
10. For Sql 2000, configure appropriate error messages to the logged (so as to be able to capture via Agent Alerts) - run "messages.sql" script on the system (if available)
11. Add standard jobs/alerts/procedures (i.e. maintenance jobs, backup jobs, security traces, performance traces, debugging procedures, etc.)
12. Configure Reporting Services security and configurations if necessary
13. Reconfigure the model database to have appropriate configuration for new databases by default
14. Set the recovery model for the database(s) created
15. Create maintenance database on the server (if appropriate)
16. Consider -E startup switch (increases extent allocations to 4 consecutive per file on request instead of 1)
Mark "Affect the Group" option for Sql Server and/or Analysis Services resource
By default with Sql Server 2005, the "Affect the Group" option for the Sql Server and Analysis Services resources are not checked, which is different behavior from Sql 2000. Typically you'll want this option marked unless you have both the Sql Server database engine and Analysis Services engine running within the same group and don't want one to affect the other.
Adding Drives for SQL Server Use
During SQL Server setup, you can choose only one drive letter out of your available cluster drives. To be able to use additional drives, you must add them as dependencies to the SQL Server resource.
1. Start Cluster Administrator
2. Fail all disk resources that will be added to the SQL Server virtual server to the same node that currently owns the SQL Server resources.
3. Drag and drop the additional disk resources to the folder containing the SQL Server resources to move them. A message will be displayed confirming the move.
4. In the Move Resources dialog box, click Yes
5. Take the SQL Server resource offline by right-clicking it and selecting Bring Offline. Or, you can use SQL Server Service Manager and stop SQL Server. Both methods are supported.
6. Right-click the SQL Server resource and then click Properties.
7. In the Properties dialog box, click the Dependencies tab, and then click Modify.
8. In the Modify Dependencies dialog box, the available resources for the cluster appear in the Available Resources list. Select the drives to add, click the arrow to move the resource to the Dependencies list, and then click OK.
9. To verify that the resource is now a dependency, in the Properties dialog box click the Dependencies tab and ensure the disk(s) are listed.
10. Bring the SQL Server resource online when complete. Also restart SQL Server Agent and SQL Server Full-text if necessary.
Verifying the Drive Configuration
To ensure all of your drives are added to SQL Server properly, do the following:
1. Execute the following Transact-SQL query. The output should reflect all of the drives SQL Server can use
select * from ::fn_servershareddrives()
select * from sys.dm_io_cluster_shared_drives
2. Open SQL Server Enterprise Manager, and try to create a new database. All drives should be available for use.
Assigning a Static Port Number
If you added an alias to the nodes prior to installing the virtual SQL Server to use Named Pipes instead of TCP/IP and a port, change the aliases back to using TCP/IP and the appropriate static port now.
For each IP address, you should use a static port number to ensure that the port is always the same in a failover. To change the port number associated with each IP address of your instance, open the SQL Server Server Network Utility and assign the appropriate static port for use with TCP/IP for each appropriate virtual server instance. When finished, restart the SQL Server virtual service for the settings to take effect.
Enabling Advanced Security
If you need to use IPSec, Kerberos, or Secure Sockets Layer (SSL) with the failover cluster, it is definitely possible.
IPSec, although it technically can work in a clustered environment, is not really designed for that, so you might want to evaluate the effectiveness of IPSec in your cluster scenario. The underlying issue is that in a failover, the Internet Key Exchange Security Associations are not moved from one node to another. Also, by default, the Security Association Idle Timer times out in 5 minutes. This means that after a failover, applications or users accessing the cluster cannot reconnect until at least 5 minutes after all cluster resources are up after the failover.
For more information on IPSec with clustering, see Knowledge Base article 306677, "IPSec Is Not Designed for Failover".
Kerberos is supported on a server cluster and subsequently failover clustering under both Windows 2000 (with Windows 2000 Service Pack 3 or later) and Windows Server 2003. To see the steps to configure Kerberos on a Windows 2000 post Service Pack 3 server, read Knowledge Base article 235529, "Kerberos Support on Windows 2000 Based Server Clusters".
SSL certificates are fully supported on all versions of Windows 2000 and Windows Server 2003 clustering. To use SSL encryption on a SQL Server 2000 cluster, a certificate must be issued to the Virtual SQL Server Name.
Review Knowledge Base article 283794, "Problems Using Certificate with Virtual Name in Clustered SQL Servers".
If the process of configuring your certificates somehow goes awry, you might have to reinstall your failover cluster. To prevent this, back up all databases and nodes (Windows-level backup) prior to configuring SSL certificates for SQL Server.
For full instructions on how to enable SSL certificates for use with SQL Server, see Knowledge Base articles 276553, "HOW TO: Enable SSL Encryption for SQL Server 2000 with Certificate Server" and 316898, "HOW TO: Enable SSL Encryption for SQL Server 2000 with Microsoft Management Console".
There is also a Microsoft Support Webcast asp that might prove useful in configuring SSL certificates for your SQL Server instances at the following address:
Configuring SQL Server Resources
Once the failover cluster is installed, you might need to modify some of the parameters associated with the SQL Server resources. These include setting a preferred owner if you have more than two nodes, as well as your failover and failback policies.
Setting Preferred Owners
When you use more than two nodes in a failover cluster, it is important to consider which node should own the SQL Server processes in the event of a failover. The potential owners are configured with SQL Server Setup. With up to four nodes available under 32-bit and eight under 64-bit, there should be an order that makes logical sense for the production environment. You should set the failover preferences for the group containing all the resources for the instance of SQL Server (not only on the virtual server) to ensure that all resources properly fail over to the same node. For example, in an N + 1 configuration, each group would have the idle node second in the list of preferred owners. This means that if any of the nodes failed, the resources on that node would move to the idle node. To set preferred owners, follow these steps:
5. Start Cluster Administrator. Right-click the group containing the SQL Server 2000 virtual server, and then click Properties.
6. On the General tab, the Preferred Owners list box displays all cluster nodes that can potentially own the processes in that group, and the current order in which they would fail over. To change the order, click Modify.
7. In the Modify Preferred Owners dialog box, make any changes to the preferred failover order. All nodes currently configured as potential owners appear in the right pane in the order of failover preference.
Resource Group Failback
All resources fail over to another node at the group level. In the event of a failover, the cluster group containing the SQL Server resources can be configured to fail back to the primary node when and if it becomes available again. By default, the Prevent Failback option is not selected because usually there is no problem with continuing on the secondary node when you have properly planned all of your resources. This setting provides an opportunity to analyze and repair the problem on the failed node. If you need to move the resources, you can do it manually later. This setting should not be changed, but if you need to, here is how:
1. Start Cluster Administrator. Right-click the group containing the SQL Server 2000 virtual server, and then click Properties.
2. In the Properties dialog box, click the Failback tab
3. To prevent an automatic failback, select Prevent Failback. To allow automatic failback, select Allow Failback, and then one of the following options:
a. Immediately - this means that the moment Windows Clustering detects that the preferred cluster node is online, it fails back any resources. This is not advisable because it could disrupt clients and applications, especially at peak times in the business day.
b. Failback Between N And N1 Hours - this option allows a controlled failback to a preferred node (if it is online) during a certain period. The hours are set using numbers from 0 through 23.
Configuring Resource Parameters
Failovers can also be controlled in terms of a threshold, meaning that after a certain point, a resource is not able to fail over to another node. There are two levels of thresholds: resource and cluster. Depending on how the resource is configured, it can affect the group failing over to another node. To configure resource parameters, follow these steps:
If you are not using a resource, such as the clustered full-text resource for each SQL Server virtual server, clear the Affect The Group check box. Do not change the Affect The Group status of data or log disks, the SQL Server IP address or network name, or SQL Server Agent and SQL Server itself.
1. Start Cluster Administrator. Select the proper group containing the SQL Server 2000 virtual server, then right-click the resource to alter, and click Properties.
2. In the Properties dialog box, click the Advanced tab
3. Select Do Not Restart if the Cluster Service should not attempt to restart or allow the resource to fail. By default, Restart is selected. If Restart is selected, configure the restart policy:
a. Affect The Group - to prevent the failure of the selected resource from causing the SQL Server group to fail over after the specified number of retries (Threshold) has occurred, you should clear the Affect The Group check box.
b. Threshold - this is the number of times the Cluster Service will try to restart the resource, and Period is the amount of time (in seconds) between retries. For example, if Threshold is set to 0, and the Affect The Group check box is selected, on detection of a failure the entire group with the resource is failed over to another node. Do not modify Threshold.
4. Do not ever modify the Looks Alive Poll Interval and Is Alive Poll Interval settings. These settings are configured to be optimal for the specific application, which in this case, is SQL Server. They are used to poll nodes in the cluster for health and rejoining on failure.
5. Do not modify Pending Timeout. The value, represented in seconds, is the amount of time the resource in either the Offline Pending or Online Pending states has to resolve its status before the Cluster Service puts the resource in either Offline or Failed status.
6. Click OK.
Cluster Group Thresholds
Not unlike a resource's thresholds, you can configure thresholds at a group level to tell the server cluster how many times to try to restart the group on one node before attempting to fail the group over to another node in the server cluster.
1. Start Cluster Administrator. Right-click the group containing the SQL Server 2000 virtual server, and then click Properties.
2. In the Properties dialog box, click the Failover tab
3. To configure the failover policy, in the Threshold box, enter the number of times the group is allowed to fail over within a set span of hours. In the Period box, enter the set span of hours. For example, if Threshold is set to 10 and Period is set to 6, the Cluster Service fails the group over a maximum of 10 times in a 6-hour period. At the 11th failover in that 6-hour period, the server cluster leaves the group offline. This affects only resources that were failed over; therefore, if the SQL Server resource failed 11 times, it would be left offline, but the IP could be left online.
Capture MPS reports and store with cluster documentation for each of the nodes in the cluster. Capture each of the following MPS reporting data:
- Cluster MPS Report
- Sql MPS Report
- Setup/Perf MPS Report
Verifying Connectivity and Name Resolution
To verify that the private and public networks are communicating properly, perform the following procedures:
1. From each node in the cluster, ping each of the IP addresses of each of the SQL Virtual Servers
2. From each node in the cluster, ping each of the network names of each of the SQL Virtual Servers
3. Repeat steps 1 and 2 for client computers that need access to the service
Need to perform a failover of all SQL Server virtual servers to all nodes of the server cluster to ensure that all resources fail over and restart on each node without problems and without affecting any other groups.
1. Start Cluster Administrator
2. Right-click a group containing the SQL Server resources you are testing, and then click Move. If you have more than two nodes, select the node to move the group to. The group selected and its resources should then be moved to its preferred failover node
3. Ensure the drives, event logs, cluster logs, etc. can be accessed and/or contain no errors
4. Ensure clients can connect to the service
Verifying the SQL Server Service Account and Node Participation
For SQL Server to be able to manage its resources and perform correctly, the service account must be part of the cluster access control list (ACL). To ensure that this is configured properly, execute the following in a SQL Query Analyzer:
select * from ::fn_virtualservernodes()
The output should reflect the nodes configured. If there is no output, ensure the account SQL Server is running under is part of the cluster ACL. This query can also be used to verify which nodes are part of the SQL Server virtual server definition.
FINALLY, at this point the "checklist_SqlServer_postInstall.doc" checklist should be FINISHED!!!
Last Update: 6/28/2008
About the author
View all my tips