Why listing all of your SQL Servers does not always work
When I do a SQLCMD -L or OSQL -L or I hit the drop down on the server list to connect using the GUI tools (like SQL Server Management Studio), I only see some of the SQL Servers in my environment. I don't see them all. Why is that?
In order to understand why you don't see all the servers, we have to delve a bit into networking, specifically with respect to the User Datagram Protocol (UDP) because this is what those tools use to try and discover the SQL Servers that are out there.
First a bit about UDP...
UDP is one of the protocols in the TCP/IP stack of protocols. It has several interesting characteristics:
- It is considered connection-less (unlike the Transport Control Protocol, aka TCP).
- UDP does use ports, just like TCP, but unlike some other protocols this allows multiple processes to be using UDP at the same time.
- Since it's connection-less, any connection-like properties between two systems must be handled by the application.
- It's a fairly small protocol and therefore takes up relatively little bandwidth on the wire.
When it comes to SQL Server, UDP is used in two ways:
- To discover other SQL Servers that are out there.
- To discover how to connect to a named instance.
In both cases, a connection is sent via UDP and specifically to a port, which is 1434. But they differ in to whom they are sent. In the first case, the client does a broadcast. It sends to a particular IP address that the network recognizes as meaning the communication should go to all hosts on that network. I'm simplifying quite a bit, but that's the general idea. In the second case, when it's trying to determine how to connect, it actually sends to the specific host where the named instance is running on. Let's focus in on the first case from here on out.
So the client (like sqlcmd or osql) sends out that broadcast to port 1434. If you were to use a network analyzer like Wireshark (warning: do not do this in a corporate environment unless you have explicit permission in writing, as this could be considered hacking), you would see something like this when the client does its broadcast:
Note the destination IP address (255.255.255.255). That tells the network this is a broadcast and note the destination port (1434) as well as the protocol (User Datagram Protocol). So the client sends this out and waits for answers back from various SQL Servers. If it doesn't get a reply back, it doesn't list the server.
There are several reasons why it wouldn't be able to get a reply back.
- The UDP communications gets "lost" on the way.
- Communications via UDP port 1434 are blocked on the network.
- The SQL Server is not configured to use any network libraries other than "Shared Memory" or it does not allow remote connections.
- It's a SQL Server 2000 instance where Hide Server is configured.
- There are only SQL Server 2000 instances on the server and none are started.
- It's a SQL Server 2005/2008 instance with the SQL Server Browser service not running.
- Communications via UDP port 1434 are blocked by a firewall, either on the client computer or the server.
- Communications via UDP port 1434 are blocked by an IPSEC policy, either on the client computer or the server.
Let's talk about each of these.
Lost on the Wire:
UDP is a connection-less protocol. Therefore, if the communications gets lost on the way, the protocol itself has no mechanism to recover. With respect to how SQL Server or its clients use UDP, it doesn't try and recover either. A way of thinking about UDP is like a postcard. You send it and usually it gets to the destination. However, there is no guarantee. If you want a guarantee, then you'll have to do something else like phone the person who should receive it and make sure they got it. If they didn't, then you'll have to resend it. SQL Server and its clients do none of that. So if the communications gets lost, it stays lost.
UDP on port 1434 is blocked:
This comprises #2, #6, and #7. Once upon a time there was a nasty worm known as SQL Slammer. It took advantage of a vulnerability in the SQL Server Listener Service (that part of SQL Server 2000 that listens for the UDP port 1434 communications) and set up the system to basically flood the network with additional UDP port 1434 communications which allowed itself to spread. In reaction to this, some ISPs and some corporate networks outright blocked UDP on port 1434 communications altogether. That means you have to know exactly where the SQL Servers are and how to communicate with them to connect. In some cases, this has been applied via various mechanisms on the operating system, too.
The SQL Server doesn't allow remote connections or is hidden:
This comprises #3 and #4. If a SQL Server doesn't allow remote connections, it doesn't have anything to tell that requesting client about. The information passed back is how to connect. Obviously, if you can't connect, there's no reason to send anything back. Another case is with SQL Server 2000 servers where in the Server Network Connection configuration, the instance has been configured to be hidden.
Obviously, in this case, the server is not being hidden, meaning it will respond unless something else gets in the way.
SQL Server 2000 / SQL Server Browser Service Isn't Started:
And this finishes things up, explaining reasons #5 and #6. In SQL Server 2000, whatever instance started up first started the SQL Server Listener Service, which is not a service we really have any control over. In SQL Server 2005/2008, that has been broken out into a controllable service aptly called the SQL Server Browser service. Since it is these two services which respond, if they are down, then the client won't hear anything back from them, meaning the instances residing on that server won't be identified. In the case of SQL Server 2000, all the SQL Server 2000 instances would have to be stopped for the listener not to be up. In SQL Server 2005 or 2008, the SQL Browser service could simply be stopped.
So how do I discover all the SQL Servers in my environment?
In this case you need a tool. There are two great free ones:
These use all known methods to discover the SQL Servers that are installed. This includes checking for services on a system, trying to check based on the network, as well as looking through the registry on the system. As a result, running these tools will produce better results if you're using an account that has permissions over all the systems to be scanned. If you're a DBA or auditor trying to do this, you likely don't have the rights. You may have to enlist someone from the server side to run the tool for you.
- Take a look at some of these other SQL Server Configurations tips
About the author
View all my tips