
Issue with MariaDB Master-Slave Cluster Failover in MaxScale When Multiple Pods Are Deleted

0 votes
0 answers
63 views
Hello everyone,

I'm running a MariaDB master-slave-slave setup with MaxScale for automatic failover. The cluster consists of one master and two slaves.

[Screenshot: before deleting 2 pods]

Everything works as expected under normal conditions, but I ran into a problem when I deleted two of the MariaDB pods simultaneously (e.g., mariadb-sts-0 and mariadb-sts-1). MaxScale initially promotes the remaining slave (server3) to the primary role, but once the other two pods come back online, the cluster ends up without a primary. It seems MaxScale is trying to re-promote one of the original servers as the primary, but the process doesn't complete properly.

[Screenshot: after deleting 2 pods]

My questions are:

- How can I configure MaxScale to handle this scenario better?
- Specifically, I want the third server to reliably take over as the primary when the first two are deleted, and to remain the primary when they come back online, until manual intervention or an automatic rebalancing.

Here are the relevant logs from MaxScale and my current configuration details:
2024-10-11 01:48:22.378   error  : (log_connect_error): Monitor was unable to connect to server server1[mariadb-sts-0.mariadb-service.kaizen.svc.cluster.local:3306] : 'Unknown server host 'mariadb-sts-0.mariadb-service.kaizen.svc.cluster.local' (-2)'
2024-10-11 01:48:22.382   notice : (log_state_change): Server changed state: server1[mariadb-sts-0.mariadb-service.kaizen.svc.cluster.local:3306]: master_down. [Master, Running] -> [Down]
2024-10-11 01:48:22.382   warning: [mariadbmon] (handle_auto_failover): Primary has failed. If primary does not return in 1 monitor tick(s), failover begins.
2024-10-11 01:48:23.385   error  : (log_connect_error): Monitor was unable to connect to server server2[mariadb-sts-1.mariadb-service.kaizen.svc.cluster.local:3306] : 'Unknown server host 'mariadb-sts-1.mariadb-service.kaizen.svc.cluster.local' (-2)'
2024-10-11 01:48:23.385   notice : (log_state_change): Server changed state: server2[mariadb-sts-1.mariadb-service.kaizen.svc.cluster.local:3306]: slave_down. [Slave, Running] -> [Down]
2024-10-11 01:48:23.385   notice : [mariadbmon] (select_promotion_target): Selecting a server to promote and replace 'server1'. Candidates are: 'server2', 'server3'.
2024-10-11 01:48:23.386   warning: [mariadbmon] (select_promotion_target): Some servers were disqualified for promotion:\n'server2' cannot be selected because it is down or in maintenance.
2024-10-11 01:48:23.386   notice : [mariadbmon] (select_promotion_target): Selected 'server3'.
2024-10-11 01:48:23.386   notice : [mariadbmon] (handle_auto_failover): Performing automatic failover to replace failed primary 'server1'.
2024-10-11 01:48:23.567   notice : [mariadbmon] (handle_auto_failover): Failover 'server1' -> 'server3' performed.
2024-10-11 01:48:39.352   notice : (log_state_change): Server changed state: server1[mariadb-sts-0.mariadb-service.kaizen.svc.cluster.local:3306]: server_up. [Down] -> [Running]
2024-10-11 01:48:40.537   warning: [mariadbmon] (update_master): 'server1' is a better primary candidate than the current primary 'server3'. Primary will change when 'server3' is no longer a valid primary.
2024-10-11 01:48:40.537   notice : (log_state_change): Server changed state: server2[mariadb-sts-1.mariadb-service.kaizen.svc.cluster.local:3306]: server_up. [Down] -> [Running]
2024-10-11 01:48:40.932   warning: [mariadbmon] (update_master): 'server1' is a better primary candidate than the current primary 'server3'. Primary will change when 'server3' is no longer a valid primary.
2024-10-11 01:49:22.135   notice : (update_addr_info): Server server1 hostname 'mariadb-sts-0.mariadb-service.kaizen.svc.cluster.local' resolved to 10.42.203.39.
2024-10-11 01:49:22.136   notice : (update_addr_info): Server server2 hostname 'mariadb-sts-1.mariadb-service.kaizen.svc.cluster.local' resolved to 10.42.213.42.
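
To make the timeline easier to follow, here is a small Python sketch that pulls the `log_state_change` entries out of the log above. It is only an illustrative helper, not part of my setup: the names `STATE_CHANGE`, `state_changes`, and `SAMPLE` are made up, and the regular expression assumes the line layout is exactly as shown in this excerpt.

```python
import re

# Assumed layout of a MaxScale "log_state_change" line, based only on the
# excerpt in this post; other MaxScale versions may format it differently.
STATE_CHANGE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+notice\s*: "
    r"\(log_state_change\): Server changed state: "
    r"(?P<server>\w+)\[[^\]]+\]: (?P<event>\w+)\. "
    r"\[(?P<old>[^\]]*)\] -> \[(?P<new>[^\]]*)\]"
)


def state_changes(lines):
    """Return (timestamp, server, event, old_state, new_state) tuples."""
    changes = []
    for line in lines:
        match = STATE_CHANGE.match(line.strip())
        if match:  # every non-state-change line is skipped
            changes.append(
                (match["ts"], match["server"], match["event"],
                 match["old"], match["new"])
            )
    return changes


# A few lines copied from the log above (hypothetical variable name).
HOST = "mariadb-service.kaizen.svc.cluster.local:3306"
SAMPLE = [
    "2024-10-11 01:48:22.382   notice : (log_state_change): Server changed state: "
    f"server1[mariadb-sts-0.{HOST}]: master_down. [Master, Running] -> [Down]",
    "2024-10-11 01:48:23.567   notice : [mariadbmon] (handle_auto_failover): "
    "Failover 'server1' -> 'server3' performed.",
    "2024-10-11 01:48:39.352   notice : (log_state_change): Server changed state: "
    f"server1[mariadb-sts-0.{HOST}]: server_up. [Down] -> [Running]",
    "2024-10-11 01:48:40.537   notice : (log_state_change): Server changed state: "
    f"server2[mariadb-sts-1.{HOST}]: server_up. [Down] -> [Running]",
]

for change in state_changes(SAMPLE):
    print(change)
```

Read this way, the excerpt shows server1 and server2 returning as `[Running]` only, without a Master or Slave role, while MaxScale warns that server1 is a better primary candidate than server3.
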
- **MaxScale Version**: 24.02.3
- **MariaDB Version**: 11.5.2-MariaDB-ubu2404-log

Thanks in advance for any suggestions or guidance!

---
Asked by unknown (104 rep)
Oct 11, 2024, 02:03 AM