By: Alejandro Cobar | Comments | Related: > Availability Groups
Problem
During a recent SQL Server instance upgrade, I was supposed to upgrade the SQL Server version of a particular instance from 2012 to 2016 which contains an Availability Group. As a pre-step to my upgrade process, I wanted to change the failover mode of all replicas to "Manual" instead of "Automatic" (for my replicas using "Synchronous Commit" for their availability mode) to avoid any automatic failover surprises during the upgrade. After attempting to apply the property modification for my Availability Group, SSMS showed a weird error message stating that the operation couldn't take place because of a missing DLL file "sqlconf.dll". This is the first time I experienced this and I'd like to share how I was able to fix it.
Solution
In this tip I will demonstrate how to reproduce the error (specifically for SQL Server 2012), as well as the steps that led me to the solution along with some assumptions and results for different SQL Server versions. The goal of this article is to give some specific knowledge that could potentially save a lot of time if this happens to you.
Initial considerations for test scenario
Below is a screenshot that shows a very basic pair of replicas (1 Primary and 1 Secondary) running SQL Server 2012 SP4:
Both replicas have a single database called "test" which is fully synchronized and the AG is operating under a synchronous commit (with automatic failover mode).
As you can see, it is pretty simple. Note that everything is running on Windows 2012 R2 VMs.
Here's another view of what I currently have in my Primary Replica:
Steps to reproduce the issue
Have a healthy Availability Group properly configured and synched between at least 2 replicas.
In the server where the Primary Replica is hosted, either delete or move the entire "Setup Bootstrap" directory. Since I'm dealing with a SQL Server 2012 setup, the path will be "C:\Program Files\Microsoft SQL Server\110\Setup Bootstrap".
Attempt to modify the failover mode to manual for both replicas.
This is the error message that SQL Server Management Studio presents:
Failed to create the availability group. A SQL Server instance name could not be validated because the dynamic link library (DLL) file 'sqlconf.dll' could not be located (Windows System Error 126). Verify that the specified server instance exists. If it exists, the DLL file might be missing from the server instance. (Microsoft SQL Server, Error: 35246)
Interesting facts
- Even though the operation that's taking place is an "ALTER AVAILABILITY GROUP" command, it clearly says "Failed to create the availability group". Perhaps this is a misleading/generic text that Microsoft decided to use here (or forgot to update it).
- I wasn't able to find a solution anywhere for the specific SQL Server Error number: 35246. At most, you'll only find sites listing all the error codes for Availability Groups with their respective text message, but nothing on how to address them.
- When I got this error in my production environment, I noticed that the exact same thing happened for all my SQL Server 2012 instances, but not for the SQL Server 2014/2016 ones.
- I noticed that failovers can take place without any issues, even if the sqlconf.dll is missing.
- I noticed that the instance can be restarted without any issues.
Course of action to address the issue
After getting the error and finding no useful information about the error, I decided to postpone the SQL Server upgrade and opened a case with Microsoft.
*My production setup consists of 2 replicas under synchronous commit with automatic failover (within the same data center), and 1 replica under asynchronous commit with manual failover (in a different data center).
Here's a summary of what Microsoft advised:
Check: Check that the following directory exists: C:\Program Files\Microsoft SQL Server\110\Setup Bootstrap\SQLServer2012\x64
Outcome: The "Setup Bootstrap" directory was missing from my 2 replicas under synchronous commit.
Check: Run the TSS script (they provided) in the server housing the Primary Replica and send the logs for analysis.
Outcome: I didn't want to execute it because the production cluster houses several instances for several clients and the number of logs that it collects is quite large.
I noticed that in the secondary replica under asynchronous commit, the Setup Bootstrap directory was there.
- Therefore, I asked Microsoft if it would work if I copy the entire directory to the other 2 replicas, to which they simply said that they didn't recommend that because there might be some dependencies with other files/directories that they just can't confirm 100%.
- Their recommendation was to run an instance repair to rebuild the missing "Setup Bootstrap" directory. However, I didn't want to take that route because it would require some serious effort to address the issue.
Since I decided to postpone the SQL Server upgrade activity, I thought it would be better to attempt to reproduce the issue in a test environment and see if I would be safe just by copying the files from one replica to another.
In my test environment I copied the "Setup Bootstrap" directory from the Secondary Replica to the Primary Replica (where I intentionally removed it).
The following screenshot shows the exact location of the sqlconf.dll file that is part of the error message:
For my test scenario, it would've worked the same if I just deleted/misplaced the .dll file instead of touching the whole "Setup Bootstrap" directory. However, I wanted to be sure the scenario would resemble the one in my production environment as much as possible.
Here's the result after copying the missing files and re-attempting to change the failover mode to "Manual" for both replicas:
As you can see, the error didn't show up this time and the modification was successfully applied.
Interesting facts on different SQL Server versions
I attempted to reproduce the exact same thing using a pair of SQL Server 2016 instances (SP2 CU15) and I didn't get the error (even with the entire "Setup Bootstrap" directory missing).
This gives me the impression that perhaps Microsoft doesn't require the sqlconf.dll anymore to change the failover mode property.
Next Steps
- With these positive results I went on and successfully performed the SQL Server upgrade in production with no issues and my instances worked fine under SQL Server 2016. I'm guessing that I would've run into additional issues if I tried to either install new features or remove an existing feature in any of my SQL Server 2012 deployments (I am lucky that I was able to jump into a SQL 2016 boat safely).
- Somebody probably manually deleted that directory in my production environment. This tells me that it is something I should monitor. I have an idea of a PowerShell script that can help any DBA get alerted when something like this happens in their respective SQL Server Cluster environments (at least if they are running SQL Server 2012 of course), so stay tuned!
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips