Keeping Track of your SQL Servers with a DBA Configuration Repository
SQL Server DBAs are often responsible for managing large number of database instances. Busy 24/7 operational DBA teams may look after hundreds of instances with widely different characteristics. As systems get added, modified or decommissioned from the fleet, database administrators need to keep track of everything they manage. This article talks about the concept of a Configuration Repository where DBAs can manage all the information related to their SQL Servers.
In larger organizations (often with dedicated DBA teams), the database server landscape is often quite colorful. To start with, the number of platforms in use can be a mixture from all major and little-known vendors. If SQL Server is a large part of that landscape, there could be a variety of scenarios too:
- Top tier servers (like heavy-traffic, large OLTP systems) against smaller, less-known systems (perhaps hosting one or two legacy reports)
- A mixture of the organization's own internal systems (HR/Payroll) and those belonging to external customers (joint ventures, hosting service etc.)
- Systems located locally and systems spread across the globe.
Despite the variance, the task of a DBA remains the same: run each system with peak efficiency. And for that, the first thing the DBAs (or the DBA team) need to do is to understand exactly what systems are under their care. They need to know what's running in the network and with what configuration.
The concept of keeping track of servers isn't something new. Your IT team can already be using a CMDB (Configuration Management Database) to keep track of every piece of equipment in the environment. However, traditional CMDBs often don't have the level of granularity needed by the system administrator. It could very well say what rack and chassis a particular server belongs to and who's the owner, but it won't say if an out-of-support service pack is running there.
What the DBA then really needs is a special kind of repository. Something that would hold only SQL Server specific information and offer a quick navigation mechanism. As new servers are rolled out, records will be added to that list. As SQL configurations are changed, relevant records would be modified. When systems are decommissioned, corresponding rows would also be updated. The only users of this special kind repository would be the DBA team and its custodian would be the lead DBA.
I would like to call such a system a DBA Configuration Repository. I won't go into the details of physical implementation of such a system, but talk about the various pieces of information you can collect and put in there. In the following sections, I have divided these information into different categories.
Structure of the Configuration Repository
General Purpose Information
These are pieces of information that tells you what the SQL Server is used for, where does it fit in the network, how important it is for support purposes and so on.
- Instance Name: This one is obvious but the most important piece. What's the name of the instance you are managing? For a named instance, use the full server and instance name convention (server_name\instance_name). For default instances, use the machine name. If a server has multiple instances, list all the instances.
- Status: Another basic piece. Is this server live and in operation? Has this been rolled out but yet to be commissioned formally? Is it marked for decommissioning or has this been shut down completely?
- Purpose: A very short description of what this server is used for. Ideally this will be a one-liner with as few words as possible (e.g. "Microsoft Project Server Back-end" or "Data Warehouse Reporting Server")
- Usage: Is this a production or non-production system? If not production then what's the usage: a development box for the programmers? System Integration Testing (SIT) / User Acceptance Testing (UAT) / Quality Assurance (QA) or Staging Server?
- Hosted Application: What systems or applications interface to this database server? Is it only one application or multiple applications? Examples can be a dedicated database server for a SharePoint farm as opposed to a shared instance hosting multiple client databases.
A further breakdown of the hosted application field can be done with a secondary field that goes into specifics. An ERP system can have multiple modules; the particular SQL Server instance at your hand could be hosting only one module of that ERP package. The main Hosted Application field may contain "ABC ERP System" and the second categorization can provide further details like "General Ledger Module".
- Internal vs. External: Is this SQL Server hosting data for the organization's own internal systems or is it hosting data for clients, partner or suppliers (e.g. cloud service)?
- Criticality: Is this a Tier 1 server (e.g. billing system) or a non-critical environment?
- Geographical Location: Where is the server located? For a company with nation-wide business, the default location could be the dedicated data center. For a global bank, the server can be in a different country / time zone.
Technical Configuration Information
These attributes relate to technical architecture.
- Physical Configuration: Is this a physical server or a Virtual Machine (VM)? Is this a node of a cluster? Is this a DR server (cold or hot stand-by) for Always-On, Mirroring or Log Shipping?
- IP Address: What's the IP address of the system? This can be a necessary piece of information as applications often have addresses hard-coded within programs and scripts; aliases, linked servers and ODBC connection could also be using the IP instead of server name. If the system goes through a migration phase, all those moving parts would need to be checked and updated. The server administrators can update the DNS for redirection, but you would still want to keep track of the old IP address for possible troubleshooting purposes.
- Logical Location: Is this server part of an Active Directory Domain within the organization? Is this in a workgroup? Is this located in the DMZ (e.g. extranet servers)? What's the domain or workgroup name? Has the server been "outsourced" to a cloud service provider and your access is only through Management Studio?
- Access Method: What's the preferred / usual access method for the server? Do you have to use a special account created for the DBA or can you use your Windows account? Is it accessible through regular Windows RDP or you need to use third-party tools like VNC or need Citrix access? Do you have to go through any hoops like logging into a trusted server first and then access the target machine? What's the name / address of that trusted machine in the middle?
- Windows Version: What's the version of the underlying operating system? Is it still Windows 2000 running a SQL 2000 database? What's the service pack level of Windows? You can get the same information from the CMDB or server administrators, but you would want to keep it all in one place for answering OS related questions.
- Storage Characteristics: Is this server using an external shared storage like a SAN for all its drives or is it a mixture of local drives and SAN storage?
- Version: What version of SQL Server is it? 2000? 2005? 2008 / R2 or 2012?
- Edition: What's the edition? Are we running Express Edition in a production system or a Developer Edition? Is it Standard, Enterprise or Data Centre?
- Service Pack: What's the service pack level?
- Patch Level: Was there any Cumulative Update, Hotfixes applied to the system? What are those?
- Components: What components of SQL Server are running in the machine? Is it a vanilla installation of SQL, SSIS and Full-text daemons only? Or is it running SSAS too? Is the server hosting only a stand-alone version of Reporting Services?
- Service Account: What's the service accounts used for each SQL Server component? Is this a dedicated domain account or the "Local System" account? For a domain / workgroup account, what's the password?
- sa Account: If the server is using mixed mode authentication, what's the "sa" password?
- Other Special Accounts: Are there any other privileged accounts used in the machine that have access to SQL? Do you have the passwords?
Risk Management Information
The following attributes define if the system is protected well enough.
- Disaster Recovery: If the server is a top tier system with higher exposure to risk, is there a DR system available?
- What's the DR mechanism? Log Shipping, VMWare Site recovery? Always-On?
- What's the DR / failback server name?
- Has there ever been a DR testing conducted? If so, when was the last time?
- Backup: What's the story with SQL Server Backup?
- Are the databases being backed up at all?
- If backed up, are they through
- SQL native backup (scripts, maintenance plan jobs)
- Third-party tools (e.g. NetApps SnapManager, Symantec NetBackup, EMC Avamar)
- If backed up:
- Have the backups ever been tested for restored? When was the last time the backups were successfully restored?
- If backed up:
- How often the databases are fully backed up? How long do you have the full backups accessible?
- How often the "incremental" (differential, transaction log, filegroup) backups happen? How long do you have the incremental accessible?
- Are the backups being protected (tapes, off-site storage etc.)?
- Monitoring: Is the server being monitored for availability, capacity and performance?
- Availability: Do you get an alert (e-mail or SMS) when the server goes offline? Do you get similar alerts when SQL Services are offline?
- Capacity: Are the disk spaces, memory, network bandwidth and CPU being monitored in real time? Do you get alerts and warnings when disk quotas fall below a certain threshold%?
- Performance: Do you have access to performance reports for the server? Is there any trend report available?
Support and Governance Information
These are pieces of information that help you find people or teams who can help you with the server and who you probably need to report to. This information may also be available from a corporate CMDB.
- Business Owner: Who is the ultimate "customer" for this SQL Server? Does this SQL Server instance host external client databases (e.g. hosting services and ISPs) or is it hosting a departmental database (marketing or engineering workgroups)? For organization's internal systems, identify the business teams and departments that have the most input / concern / influence over the server.
- Business Owner Representative: Who is the person you talk to from the customer side? If it is an external customer, what's their contact detail?
- Technical Owner: Which other teams are responsible for the system? In most cases it will include a server infrastructure team. Other stakeholders can be Help Desk, customer's own IT department, external vendor etc.
- Technical Owner Representative: Who is the other technical guy who can help you with the server? How can you reach him?
The support and governance information is important because when it comes down to maintenance, support, installations, upgrades or outages, you will need to work with each of these groups or individuals to secure authorization, arrange change windows, avail technical support and so on. Effectively this tells you who is depending on your service as a DBA and who you are depending on.
Creating the DBA Repository
Depending on how you go about it, there can be a lot of information to capture from your servers and various logical pieces can then translate into multiple fields in a database. The actual implementation can itself be the subject of one or more lengthy articles, so I won't get into the details here. Without being pedantic, I can say I have seen a few examples like:
- Using the Registry to hold the DBA information (custom keys and sub-keys), although I don't think it's a good approach..
- Excel spreadsheet (perhaps the easiest and most widely-used)
- Custom web application where the backend is itself a database and the data modification happens through an online form.
I personally used a mixture of Excel spreadsheet, a bit of SSIS data load into SQL Server back-end tables and then publishing through Reporting Service. But then again, I created an Excel-only solution later.
Whatever method, you need to spend some time designing the repository's physical data structures.
Populating the Repository
Once you have the physical database set up, it's time to populate it with data. Populating the repository for the first time is always the hardest part. That's because if you have never documented your systems, you really don't know what's out there. Other DBAs before you may have left the organization without leaving any proper documentation; even if they did document their servers, people may not know where they are. Most DBAs would have some sort of list of the systems they manage. If you can get hold of such lists, start from there. You can consolidate all the files you can gather, go through each of them and verify the accuracy of the information. This should help you see which servers are still online. You will need to look into each server manually and collect some of the information. It can be a fairly tedious process but at the end it's all worth it.
The main thing to remember is that this is an iterative process and for some systems, you may not have all the information you would like to gather. As you fill the blanks gradually, start documenting the servers you are rolling out yourself. Start asking your peer DBAs to let you know about systems they are managing or have rolled out in the past.
Manual processes apart, you can use automated discovery tools like Microsoft Assessment and Planning (MAP) Toolkit to query the Active Directory for SQL Servers running in your network. A detailed discussion of the MAP Toolkit is out of scope of this article.
Ownership of the Repository
It's important to create a control mechanism from the very beginning to keep the Repository relevant and error-free. By control mechanism, I mean the following:
- There should be one and only one version of the config repository available and ideally it should be in a network accessible location. This would eliminate the possibility of different team members having different versions of the same repository.
- There should be a process in place where only one (or at the most two) person(s) can update this repository. This custodian can be the DBA team lead or the senior DBA within the team. For a one-man DBA operation, it will be the DBA himself. This approach will eliminate any duplication or erroneous deletion of data.
- The custodian of the DBA Configuration Repository (itself a DBA job) would ensure maximum authenticity of the data. He would be reviewing the repository on a regular basis to identify any missing pieces, errors, omissions and duplicates. He would be ensuring the repository is read-only for anyone but him (and any trusted team member) and any sensitive information (passwords etc.) are masked.
- There should be a process within the DBA team where any system changes are reported to the custodian. This includes new server rollouts, server migrations and retirements, service pack upgrades, password changes and so on.
- New DBAs who join the team needs to have their orientation about the Configuration Repository, its usage and the associated processes. Existing DBAs should be familiarized with its operation and the need for its accuracy.
- The Configuration Repository needs to be backed up regularly.
What's the Use?
So now you have implemented a DBA Configuration Repository after weeks (or even months) of effort, managed to audit every single server instance you know of and populated the system with relevant data. Your team members (and even management) support you with the process. You know you have done it and have achieved something. But what's the use? Where does it all fit in? Well consider the following scenarios:
Scenario 1: A new DBA has just joined the team and he needs to get running quickly. You can familiarize him with the repository and give him a list of servers to look after. No need to think - just the ones you think are "low risk".
Scenario 2: An IT "risk assessment" initiative just started last week. Your manager has asked you to "engage with the business" and identify systems that pose "significant challenges in manageability". Simple. Filter through your repository, see which important servers are running out-of-support SQL Servers or service packs. See which production systems have never been backed up, which tier 1 systems have never been tested for a DR or which servers don't seem to have a technical owner listed. If you and your team have done your job, there should not be too many entries in the report. This also gives you a chance to get the house in order.
Scenario 3: You know the old production billing server is dying. It's a physical server and you want the server team to build a VM for it so you can migrate the databases over a weekend. Who do you approach for a formal approval?
Scenario 4: It's year-end license auditing time and the IT Finance team has asked you for a list of SQL Servers that have been rolled out in last year.
Scenario 5: Developers in the product team have just found that a bug in the code can be fixed if the latest service pack is applied to the Analysis Service database. They want you to tell them if the QA server is also running the same SP.
So you see, I can go on an on. The main purpose of the repository is to assist you in identifying issues before they are reported, working more efficiently within the DBA team, minimize unnecessary delays in support activities, having the right information at your fingertips at the right moment and so on. The better you design and populate your DBA Configuration Repository, the better results you will achieve.
- Learn more about Microsoft Assessment and Planning Toolkit
- Learn more about Microsoft Baseline Security Analyzer
- Learn more about Microsoft Best Practices Analyzer
- Learn more about network monitoring tool like Nagios, Zabbix, HP OpenView etc.
- Explore any CMDB used by your IT organization to understand the type of information captured there
- Retrieve and consolidate all configuration related documents you may have inherited as a DBA and verify their accuracy, cleanse their data and add more fields as necessary.
- Implement a business process within your DBA team to keep the Configuration Repository up-to-date.
About the author
View all my tips