SQL Server 2019 DAG WFC - Manual Failover won't work (MSSQL Error 41131)
We set up a distributed failover cluster with two Windows Server 2019 Datacenter nodes, each running SQL Server 2019 Enterprise and SSMS 18.
- The two nodes are located at two different sites with two different IP subnets.
- Each host is an ESXi VM with only one NIC (Host A in Subnet A, Host B in Subnet B).
- Both sites are connected via an S2S VPN connection, and traffic is routed between them.
**Problem**
We double-checked every possible problem, but we cannot manually fail over an availability group with a synchronized database via SSMS:
- **Instance** -> Always On High Availability -> Availability Groups -> (the availability group) -> Right-click "Failover"
- SQL Server error 41131 (see attachment)
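
For reference, the SSMS "Failover" action above corresponds roughly to the following T-SQL. This is a minimal sketch, assuming a regular (non-distributed) availability group with a synchronized secondary; `[MyAg]` is a placeholder name, not the group from this setup. Running it directly sometimes surfaces a more detailed error message than the SSMS wizard does.

```sql
-- Run this while connected to the SECONDARY replica that should become primary.
-- [MyAg] is a hypothetical placeholder; substitute the real availability group name.
-- A planned manual failover like this requires the secondary to be SYNCHRONIZED.
ALTER AVAILABILITY GROUP [MyAg] FAILOVER;
```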
**Troubleshooting**
* The connection between the hosts is up, and the dashboard shows that both hosts are communicating and synchronized.
* Defender Firewall rules exist for the DAG listeners, the Agent, and the Browser service. On a Palo Alto firewall at site A, traffic between both SQL hosts is visible, and none of it is denied.
* Both hosts run SQL Server Agent and the SQL Server engine under a separate service user, so there should not be any trouble with missing rights for `NT Authority\SYSTEM`. We were not able to find any useful information online, since the only suggested solution is always to change rights for that NT account, which we do not use for Agent or Engine.
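
Since the fix usually suggested for this error concerns the rights of `NT AUTHORITY\SYSTEM`, it may be worth confirming what that login actually holds on each instance, even when the services run under other accounts. The query below is a sketch using the standard catalog views; the permission names in the comment are the ones commonly cited for this account, not something verified against this environment.

```sql
-- List the server-level permissions held by NT AUTHORITY\SYSTEM on this instance.
-- Permissions commonly cited as needed by this account for AG health detection:
--   CONNECT SQL, ALTER ANY AVAILABILITY GROUP, VIEW SERVER STATE
-- A row with NULL permission columns means the login exists but has no explicit grants;
-- no rows at all means the login does not exist on this instance.
SELECT pr.name            AS login_name,
       pe.permission_name,
       pe.state_desc
FROM sys.server_principals AS pr
LEFT JOIN sys.server_permissions AS pe
       ON pe.grantee_principal_id = pr.principal_id
WHERE pr.name = N'NT AUTHORITY\SYSTEM';
```

Run it on both nodes and compare the results.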
Rights on the AD cluster object to create and update child objects are in place. Two DNS entries for the listener and one for the cluster object also exist after creation.
Even automatic seeding between both hosts works (inserted rows replicate from host A to host B); only the failover through SSMS 18 fails.
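
The synchronization state reported by the dashboard can also be read from the DMVs, which is useful for confirming that the secondary really is `SYNCHRONIZED` (and not just `SYNCHRONIZING`) before a planned failover. A minimal sketch, assuming it is run on the current primary, where rows for all replicas are visible:

```sql
-- Show the synchronization state of every availability database on every replica.
-- On a secondary, only the local replica's rows are returned.
SELECT ag.name                       AS ag_name,
       ar.replica_server_name,
       DB_NAME(drs.database_id)      AS database_name,
       drs.is_local,
       drs.synchronization_state_desc,
       drs.synchronization_health_desc
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_replicas AS ar
  ON ar.replica_id = drs.replica_id
JOIN sys.availability_groups AS ag
  ON ag.group_id = drs.group_id
ORDER BY ag.name, ar.replica_server_name, database_name;
```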
**Questions**
Does anyone have ideas on where we could troubleshoot further?
I attached the 
Asked by DevDino
(1 rep)
Jul 28, 2022, 06:50 AM
Last activity: Aug 11, 2022, 09:01 AM