MariaDB cluster (primary + two replicas) orchestrated via MaxScale: how to recover from a major disruption?
0 votes · 0 answers · 76 views
I am working on a MariaDB primary/replica cluster managed via MaxScale, to which I successfully added a third node. When I left things a few days ago everything was fine; today I found all the nodes down, presumably because of a power outage. This is a lab environment, so nobody bothered to check before rebooting the hardware.
Because it is a lab environment, I could easily restart from the last known-good point, but since I'm here, I'd like to take the chance to learn something.
I managed to reboot all the nodes, except one which is not coming back into the cluster.
These are my variables right now:
mariadb221 [(none)]> show global variables like '%gtid%';
Variable_name Value
gtid_binlog_pos 1-3000-1359255
gtid_binlog_state 1-2000-358368,1-1000-1359225,1-3000-1359255
gtid_cleanup_batch_size 64
gtid_current_pos 1-3000-1359255
gtid_domain_id 1
gtid_ignore_duplicates OFF
gtid_pos_auto_engines
gtid_slave_pos 1-3000-1359255
gtid_strict_mode ON
wsrep_gtid_domain_id 0
wsrep_gtid_mode OFF
mariadb222 [(none)]> show global variables like '%gtid%';
Variable_name Value
gtid_binlog_pos 1-1000-1359231
gtid_binlog_state 1-2000-358368,1-3000-359229,1-1000-1359231
gtid_cleanup_batch_size 64
gtid_current_pos 1-1000-1359254
gtid_domain_id 1
gtid_ignore_duplicates OFF
gtid_pos_auto_engines
gtid_slave_pos 1-1000-1359254
gtid_strict_mode ON
wsrep_gtid_domain_id 0
wsrep_gtid_mode OFF
mariadb223 [(none)]> show global variables like '%gtid%';
Variable_name Value
gtid_binlog_pos 1-3000-1359255
gtid_binlog_state 1-1000-1359230,1-2000-358368,1-3000-1359255
gtid_cleanup_batch_size 64
gtid_current_pos 1-3000-1359255
gtid_domain_id 1
gtid_ignore_duplicates OFF
gtid_pos_auto_engines
gtid_slave_pos 1-3000-1359255
gtid_strict_mode ON
wsrep_gtid_domain_id 0
wsrep_gtid_mode OFF
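Comparing the three outputs, mariadb222 doesn't line up with the other two: its binlog ends at 1-1000-1359231 while its gtid_slave_pos claims 1-1000-1359254, and the other nodes have only ever seen server 1000 up to 1359225 (mariadb221) and 1359230 (mariadb223). Before touching anything I collected the actual replication error. This is the minimal, non-destructive check I ran (same on every node):

-- On the node that won't rejoin (mariadb222): the Last_IO_Error /
-- Last_SQL_Error fields usually name the exact GTID the primary
-- could not serve or the replica refused to apply.
SHOW SLAVE STATUS\G

-- Same query on every node, to compare the GTID bookkeeping
-- side by side:
SELECT @@GLOBAL.gtid_binlog_state,
       @@GLOBAL.gtid_slave_pos,
       @@GLOBAL.gtid_current_pos;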
Aside from deleting the bad node and rebuilding it as if it were new (taking a backup of the remaining replica, restoring that backup onto it, and adding it back to the cluster), is there anything I can do to recover from this position?
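For reference, the rebuild route I have in mind is mariabackup on the healthy replica (mariadb223), --prepare, copy the result into the bad node's datadir, then seed replication from the GTID recorded in xtrabackup_binlog_info inside the backup. A sketch of the final SQL step, assuming the backup reported 1-3000-1359255 and that mariadb221 is the current primary (host and credentials are placeholders):

-- On the rebuilt mariadb222, after restoring the prepared backup:
SET GLOBAL gtid_slave_pos = '1-3000-1359255';  -- value from xtrabackup_binlog_info
CHANGE MASTER TO
    MASTER_HOST = 'mariadb221',   -- placeholder: whichever node MaxScale promoted
    MASTER_USER = 'repl',         -- placeholder credentials
    MASTER_PASSWORD = 'replpass',
    MASTER_USE_GTID = slave_pos;
START SLAVE;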
Any suggestions about anything else I should check would also be appreciated.
P.S.: the bad node is the second one (mariadb222).
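If the handful of transactions that only exist on mariadb222 are disposable (it's a lab), the one alternative I can think of is to wipe its local binlog/GTID state and let it follow the primary again from the current tip, accepting the drift. A sketch, with the same placeholder host and credentials as above; this is destructive on mariadb222:

-- On mariadb222: discard local binlogs and the GTID state that
-- conflicts with the rest of the cluster.
STOP SLAVE;
RESET MASTER;

-- Re-seed from the primary's current position (taken from
-- mariadb221's gtid_binlog_pos above), skipping the gap.
SET GLOBAL gtid_slave_pos = '1-3000-1359255';

CHANGE MASTER TO
    MASTER_HOST = 'mariadb221',
    MASTER_USER = 'repl',
    MASTER_PASSWORD = 'replpass',
    MASTER_USE_GTID = slave_pos;
START SLAVE;

I haven't run this yet; I'm not sure how it interacts with gtid_strict_mode=ON.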
Asked by albea798 (1 rep) on Oct 20, 2024, 09:57 AM
Last activity: Oct 23, 2024, 11:06 AM