Issue with MariaDB Master-Slave Cluster Failover in MaxScale When Multiple Pods Are Deleted
Hello everyone,
I’m running a MariaDB master-slave-slave setup with MaxScale for automatic failover. The cluster consists of one master and two slaves.
Everything works as expected under normal conditions, but I encountered a problem when I deleted two of the MariaDB pods simultaneously (e.g., `mariadb-sts-0` and `mariadb-sts-1`).
MaxScale initially promotes the remaining slave (`server3`) to the primary role, but once the other two pods come back online, the cluster ends up without a primary. It seems MaxScale is trying to re-promote one of the original servers as the primary, but the process doesn't complete properly.
My question is:
- How can I configure MaxScale to handle this scenario better?
- Specifically, I want the third server to reliably take over as the primary when the first two are deleted, and ensure it remains the primary when they come back online, until manual intervention or an automatic rebalancing.
Here are the relevant logs from MaxScale and my current configuration details:

2024-10-11 01:48:22.378 error : (log_connect_error): Monitor was unable to connect to server server1[mariadb-sts-0.mariadb-service.kaizen.svc.cluster.local:3306] : 'Unknown server host 'mariadb-sts-0.mariadb-service.kaizen.svc.cluster.local' (-2)'
2024-10-11 01:48:22.382 notice : (log_state_change): Server changed state: server1[mariadb-sts-0.mariadb-service.kaizen.svc.cluster.local:3306]: master_down. [Master, Running] -> [Down]
2024-10-11 01:48:22.382 warning: [mariadbmon] (handle_auto_failover): Primary has failed. If primary does not return in 1 monitor tick(s), failover begins.
2024-10-11 01:48:23.385 error : (log_connect_error): Monitor was unable to connect to server server2[mariadb-sts-1.mariadb-service.kaizen.svc.cluster.local:3306] : 'Unknown server host 'mariadb-sts-1.mariadb-service.kaizen.svc.cluster.local' (-2)'
2024-10-11 01:48:23.385 notice : (log_state_change): Server changed state: server2[mariadb-sts-1.mariadb-service.kaizen.svc.cluster.local:3306]: slave_down. [Slave, Running] -> [Down]
2024-10-11 01:48:23.385 notice : [mariadbmon] (select_promotion_target): Selecting a server to promote and replace 'server1'. Candidates are: 'server2', 'server3'.
2024-10-11 01:48:23.386 warning: [mariadbmon] (select_promotion_target): Some servers were disqualified for promotion:\n'server2' cannot be selected because it is down or in maintenance.
2024-10-11 01:48:23.386 notice : [mariadbmon] (select_promotion_target): Selected 'server3'.
2024-10-11 01:48:23.386 notice : [mariadbmon] (handle_auto_failover): Performing automatic failover to replace failed primary 'server1'.
2024-10-11 01:48:23.567 notice : [mariadbmon] (handle_auto_failover): Failover 'server1' -> 'server3' performed.
2024-10-11 01:48:39.352 notice : (log_state_change): Server changed state: server1[mariadb-sts-0.mariadb-service.kaizen.svc.cluster.local:3306]: server_up. [Down] -> [Running]
2024-10-11 01:48:40.537 warning: [mariadbmon] (update_master): 'server1' is a better primary candidate than the current primary 'server3'. Primary will change when 'server3' is no longer a valid primary.
2024-10-11 01:48:40.537 notice : (log_state_change): Server changed state: server2[mariadb-sts-1.mariadb-service.kaizen.svc.cluster.local:3306]: server_up. [Down] -> [Running]
2024-10-11 01:48:40.932 warning: [mariadbmon] (update_master): 'server1' is a better primary candidate than the current primary 'server3'. Primary will change when 'server3' is no longer a valid primary.
2024-10-11 01:49:22.135 notice : (update_addr_info): Server server1 hostname 'mariadb-sts-0.mariadb-service.kaizen.svc.cluster.local' resolved to 10.42.203.39.
2024-10-11 01:49:22.136 notice : (update_addr_info): Server server2 hostname 'mariadb-sts-1.mariadb-service.kaizen.svc.cluster.local' resolved to 10.42.213.42.
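
To make the sequence of role changes easier to follow, here is a minimal Python sketch that pulls the `log_state_change` notices out of a log excerpt like the one above. The regular expression and the `state_changes` helper are illustrative assumptions based only on the line format shown in this post; they are not part of MaxScale.

```python
import re

# Assumed line format, taken from the excerpt in this post:
# <timestamp> notice : (log_state_change): Server changed state:
#   <name>[<host:port>]: <event>. [<old state>] -> [<new state>]
STATE_CHANGE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+notice\s*:\s+"
    r"\(log_state_change\): Server changed state: "
    r"(?P<server>\w+)\[[^\]]*\]: (?P<event>\w+)\. "
    r"\[(?P<old>[^\]]*)\] -> \[(?P<new>[^\]]*)\]"
)


def state_changes(log_text):
    """Return (timestamp, server, event, old_state, new_state) tuples in log order."""
    changes = []
    for line in log_text.splitlines():
        m = STATE_CHANGE.match(line.strip())
        if m:
            changes.append((m["ts"], m["server"], m["event"], m["old"], m["new"]))
    return changes


# A few lines copied from the log above (one non-matching warning included).
SAMPLE = """\
2024-10-11 01:48:22.382 notice : (log_state_change): Server changed state: server1[mariadb-sts-0.mariadb-service.kaizen.svc.cluster.local:3306]: master_down. [Master, Running] -> [Down]
2024-10-11 01:48:22.382 warning: [mariadbmon] (handle_auto_failover): Primary has failed. If primary does not return in 1 monitor tick(s), failover begins.
2024-10-11 01:48:23.385 notice : (log_state_change): Server changed state: server2[mariadb-sts-1.mariadb-service.kaizen.svc.cluster.local:3306]: slave_down. [Slave, Running] -> [Down]
2024-10-11 01:48:39.352 notice : (log_state_change): Server changed state: server1[mariadb-sts-0.mariadb-service.kaizen.svc.cluster.local:3306]: server_up. [Down] -> [Running]
2024-10-11 01:48:40.537 notice : (log_state_change): Server changed state: server2[mariadb-sts-1.mariadb-service.kaizen.svc.cluster.local:3306]: server_up. [Down] -> [Running]
"""

for ts, server, event, old, new in state_changes(SAMPLE):
    print(f"{ts}  {server:8s} {event:12s} [{old}] -> [{new}]")
```

In the excerpt, both returning servers go from `[Down]` to `[Running]` without regaining a Master or Slave role.
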
- **MaxScale Version**: 24.02.3
- **MariaDB Version**: 11.5.2-MariaDB-ubu2404-log
Thanks in advance for any suggestions or guidance!
---
Asked by unknown
(104 rep)
Oct 11, 2024, 02:03 AM