MySQL multi-master group replication on Kubernetes
We are trying to set up MySQL multi-master group replication (GR) on Kubernetes (reference: "Group replication configuring instances").
After applying all the configuration, GR starts on the first pod. However, when GR is started on the second node, it goes into the RECOVERING state and then into the ERROR state.
The GCS_DEBUG_TRACE logs do not show any errors either.
Let me know if anything is missing or if more information is needed to analyze this. Thanks in advance.
Workarounds tried:
1. https://dba.stackexchange.com/questions/268801/mysql-group-replication-multi-primary-setup
2. https://stackoverflow.com/questions/50794695/mysql-group-replication-stuck-on-recovering-forever
Cluster Setup:
1. Created 3 PVCs, one for each pod, in a namespace
2. Launched pods using the mysql:8.0.23 Docker image (https://hub.docker.com/_/mysql)
3. Ran the queries below to configure the pods
$ kubectl get all -n myb5
NAME         READY   STATUS    RESTARTS   AGE
pod/mysql1   1/1     Running   0          15h
pod/mysql2   1/1     Running   0          15h
pod/mysql3   1/1     Running   0          15h

NAME                TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/gr-domain   ClusterIP   None         <none>        <none>    15h

$ kubectl get pvc -n myb5
NAME               STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
mysql-pv-claim-1   Bound    pvc-f7957eff-b75e-4dbc-990a-8d79e54b6f06   250Gi      RWO            robin          15h
mysql-pv-claim-2   Bound    pvc-1c5d4dfd-8495-4266-af0d-882ce8e8ccec   250Gi      RWO            robin          15h
mysql-pv-claim-3   Bound    pvc-49c0979b-49cb-413b-b695-479b32124343   250Gi      RWO            robin          15h
Configuration on all pods:
SET PERSIST general_log = ON;
SET PERSIST general_log_file= '/var/lib/mysql/mysql1.log';
SET PERSIST group_replication_communication_debug_options='GCS_DEBUG_ALL';
SET PERSIST enforce_gtid_consistency=ON;
SET PERSIST gtid_mode = OFF_PERMISSIVE;
SET PERSIST gtid_mode = ON_PERMISSIVE;
SET PERSIST gtid_mode = ON;
SET PERSIST binlog_format = ROW;
SET PERSIST master_info_repository='TABLE';
SET PERSIST relay_log_info_repository='TABLE';
SET PERSIST transaction_write_set_extraction=XXHASH64;
SET SQL_LOG_BIN = 0;
CREATE USER rpl_user@'%' IDENTIFIED BY 'password';
GRANT REPLICATION SLAVE ON *.* TO rpl_user@'%';
GRANT BACKUP_ADMIN ON *.* TO rpl_user@'%';
FLUSH PRIVILEGES;
CHANGE REPLICATION SOURCE TO SOURCE_USER='rpl_user', SOURCE_PASSWORD='password' FOR CHANNEL 'group_replication_recovery';
SET SQL_LOG_BIN = 1;
INSTALL PLUGIN group_replication SONAME 'group_replication.so';
SET PERSIST group_replication_group_name='85cbd4a0-7338-46f1-b15e-28c1a26f465e';
SET PERSIST group_replication_start_on_boot=OFF;
SET PERSIST group_replication_bootstrap_group=OFF;
SET PERSIST group_replication_single_primary_mode=OFF;
SET PERSIST group_replication_enforce_update_everywhere_checks=ON;
SET PERSIST group_replication_member_expel_timeout=3600;
SET PERSIST group_replication_group_seeds='mysql1.gr-domain.myb5.svc.cluster.local:33061,mysql2.gr-domain.myb5.svc.cluster.local:33061,mysql3.gr-domain.myb5.svc.cluster.local:33061';
SET PERSIST group_replication_ip_allowlist='mysql1.gr-domain.myb5.svc.cluster.local,mysql2.gr-domain.myb5.svc.cluster.local,mysql3.gr-domain.myb5.svc.cluster.local';
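
For reference, the seed list and allowlist above follow the headless-service DNS pattern pod.service.namespace.svc.cluster.local. A minimal sketch of how the two strings could be generated instead of typed by hand is shown below. The helper name build_gr_members is hypothetical and not part of the setup described here; it only assembles strings and does not contact the cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not part of the original setup):
# prints a comma-separated list of <pod>.<service>.<namespace>.svc.cluster.local
# entries, each optionally suffixed with :<port>.
build_gr_members() {
  local service="$1" namespace="$2" port="$3"
  shift 3
  local out="" pod host
  for pod in "$@"; do
    host="${pod}.${service}.${namespace}.svc.cluster.local"
    # Append ":port" only when a port was given (allowlist entries have none).
    if [ -n "$port" ]; then
      host="${host}:${port}"
    fi
    # Prefix a comma for every entry except the first.
    out="${out:+${out},}${host}"
  done
  printf '%s\n' "$out"
}

# Values taken from the configuration above.
seeds="$(build_gr_members gr-domain myb5 33061 mysql1 mysql2 mysql3)"
allowlist="$(build_gr_members gr-domain myb5 "" mysql1 mysql2 mysql3)"

echo "SET PERSIST group_replication_group_seeds='${seeds}';"
echo "SET PERSIST group_replication_ip_allowlist='${allowlist}';"
```

The printed statements should match the two SET PERSIST lines above, which makes it easier to rule out a typo in a hostname as the cause of a failed join.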
Conf on pod1:
SET PERSIST server_id=1;
SET PERSIST group_replication_local_address= 'mysql1.gr-domain.myb5.svc.cluster.local:33061';
SET PERSIST group_replication_bootstrap_group=ON;
START GROUP_REPLICATION USER='rpl_user', PASSWORD='password';
SET PERSIST group_replication_bootstrap_group=OFF;
SET PERSIST group_replication_recovery_get_public_key=ON;
Conf on pod2:
SET PERSIST server_id=2;
SET PERSIST group_replication_local_address= 'mysql2.gr-domain.myb5.svc.cluster.local:33061';
START GROUP_REPLICATION USER='rpl_user', PASSWORD='password';
Conf on pod3:
SET PERSIST server_id=3;
SET PERSIST group_replication_local_address= 'mysql3.gr-domain.myb5.svc.cluster.local:33061';
START GROUP_REPLICATION USER='rpl_user', PASSWORD='password';
Group Replication Status when started:
mysql> SELECT * FROM performance_schema.replication_group_members\G
*************************** 1. row ***************************
CHANNEL_NAME: group_replication_applier
MEMBER_ID: 57b9f42a-8b4d-11eb-bd3e-0242ac110003
MEMBER_HOST: mysql1
MEMBER_PORT: 3306
MEMBER_STATE: ONLINE
MEMBER_ROLE: PRIMARY
MEMBER_VERSION: 8.0.23
*************************** 2. row ***************************
CHANNEL_NAME: group_replication_applier
MEMBER_ID: 57ec4b94-8b4d-11eb-8fdc-0242ac110004
MEMBER_HOST: mysql2
MEMBER_PORT: 3306
MEMBER_STATE: RECOVERING
MEMBER_ROLE: PRIMARY
MEMBER_VERSION: 8.0.23
*************************** 3. row ***************************
CHANNEL_NAME: group_replication_applier
MEMBER_ID: 57ec4b94-8b4d-11eb-8fdc-0242ac110005
MEMBER_HOST: mysql3
MEMBER_PORT: 3306
MEMBER_STATE: RECOVERING
MEMBER_ROLE: PRIMARY
MEMBER_VERSION: 8.0.23
Group Replication error after a few minutes:
mysql> SELECT * FROM performance_schema.replication_group_members\G
*************************** 1. row ***************************
CHANNEL_NAME: group_replication_applier
MEMBER_ID: 57ec4b94-8b4d-11eb-8fdc-0242ac110004
MEMBER_HOST: mysql2
MEMBER_PORT: 3306
MEMBER_STATE: ERROR
MEMBER_ROLE:
MEMBER_VERSION: 8.0.23
Asked by Raghavendra V
(21 rep)
Mar 24, 2021, 11:54 PM
Last activity: Mar 25, 2021, 05:41 AM