Sample Header Ad - 728x90

Always On Availability Group Failover Issue

1 vote
2 answers
4643 views
I'm testing a PowerShell Disaster Recovery script and trying to fail over to an off-site AG DR (asynch) replica with this command ALTER AVAILABILITY GROUP [MyAG] FORCE_FAILOVER_ALLOW_DATA_LOSS On several clusters this works fine but on a couple of other ones it usually fails (but not always) with the error: > Failed to move a Windows Server Failover Clustering (WSFC) group to > the local node (Error code 5023). The WSFC service may not be running > or may not be accessible in its current state, or the specified > cluster group or node handle is invalid. For information about this > error code, see "System Error Codes" in the Windows Development > documentation. > Failed to designate the local availability replica of > availability group 'MyAG' as the primary replica. The operation > encountered SQL Server error 41018 and has been terminated. Check the > preceding error and the SQL Server error log for more details about > the error and corrective actions. We have to force the quorum simulating the situation where the async DR replica can't communicate with the other replicas: https://learn.microsoft.com/en-us/sql/sql-server/failover-clusters/windows/force-a-wsfc-cluster-to-start-without-a-quorum?view=sql-server-2017 A comparison of the clusters that work vs. those that don't with Get-Cluster and Get-ClusterGroup doesn't reveal any major differences. This failover has worked successfully on some clusters but fails 99% of the time on two of the clusters. It has worked on these two clusters, but rarely. One of the common issues I found from a Google search is permissions for NT Authority\System (see https://dataginger.com/2014/10/28/sql-server-failed-to-bring-availability-group-availability-group-name-online/) , which I confirmed it has. I ran a trace and see the same errors and no other clues in the trace. If nothing turns up I may have to open a Microsoft ticket to engage their support. Any thoughts? Thanks for your help!
Asked by Mike Petrak (436 rep)
Mar 30, 2019, 08:52 PM
Last activity: Jan 8, 2025, 12:08 AM