What could cause replication to break on MySQL when replicating from 5.7.40 to 8.0 replica
0
votes
1
answer
231
views
We have a MySQL primary running on 5.7.40. We are in process of testing the upgrade to 8.0.36 (this is on AWS RDS). We had 2 MySQL-8 replicas and 2 MySQL-5.7 replicas replicating from the primary.
At almost the same time, both the MySQL 8.0 replicas stopped replicating and complained on
HA_ERR_FOUND_DUPP_KEY
Replica SQL for channel '': Worker 2 failed executing transaction '87953f5d-7595-11ed-830d-02f4790d85ab:57805008598' at source log mysql-bin-changelog.676514, end_log_pos 52858646; Could not execute Write_rows event on table ebdb.bike_issues; Duplicate entry '177235118' for key 'bike_issues.PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's source log mysql-bin-changelog.676514, end_log_pos 52858646, Error_code: MY-001062
Weirdly though the replicas complained on different key values, but the timestamp was nearly the same (few ms apart). The MySQL 5.7 replicas were fine, so clearly there was no hiccups on the primary side. Nothing shows up in the logs of primary around this time either.
The table, it complained on, is very commonly written table and we had this MySQL 8 replicas running for over a week now, without any replication issues.
We do Row-based, GTID based replication (gtid_mode ON, enforce_gtid_consistency ON)
I was able to resume replication by setting slave_exec_mode
to IDEMPOTENT
temporarily. When I check the error logs after the replication was in sync, I didn't see errors for the first replica's key in the second replicas error logs & vice versa i.e. they both failed on different keys. Could this be some issue on the replication receiver part? Or possibly some bug due to version mismatch? Or some mysql variable mismatch?
How can I debug this further? What could have possibly caused this blip?
**EDIT #1**
Output of create table
On Primary
Create Table: CREATE TABLE bike_issues
(
id
int(11) NOT NULL AUTO_INCREMENT,
issue_type
varchar(191) COLLATE utf8mb4_bin NOT NULL,
bike_id
int(11) NOT NULL,
reported_at
datetime NOT NULL,
resolved_at
datetime DEFAULT NULL,
resolution_type
varchar(191) COLLATE utf8mb4_bin DEFAULT NULL,
created_at
datetime NOT NULL,
updated_at
datetime NOT NULL,
token
varchar(191) COLLATE utf8mb4_bin DEFAULT NULL,
idempotency_key
varchar(191) COLLATE utf8mb4_bin DEFAULT NULL,
PRIMARY KEY (id
),
UNIQUE KEY index_bike_issues_on_token
(token
),
UNIQUE KEY index_bike_issues_on_idempotency_key
(idempotency_key
),
KEY by_bike_id_issue_type_resolved_at
(bike_id
,issue_type
,resolved_at
) USING BTREE,
KEY index_bike_issues_on_resolved_at
(resolved_at
)
) ENGINE=InnoDB AUTO_INCREMENT=177614144 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
On MySQL 8 replica
Create Table: CREATE TABLE bike_issues
(
id
int NOT NULL AUTO_INCREMENT,
issue_type
varchar(191) COLLATE utf8mb4_bin NOT NULL,
bike_id
int NOT NULL,
reported_at
datetime NOT NULL,
resolved_at
datetime DEFAULT NULL,
resolution_type
varchar(191) COLLATE utf8mb4_bin DEFAULT NULL,
created_at
datetime NOT NULL,
updated_at
datetime NOT NULL,
token
varchar(191) COLLATE utf8mb4_bin DEFAULT NULL,
idempotency_key
varchar(191) COLLATE utf8mb4_bin DEFAULT NULL,
PRIMARY KEY (id
),
UNIQUE KEY index_bike_issues_on_token
(token
),
UNIQUE KEY index_bike_issues_on_idempotency_key
(idempotency_key
),
KEY by_bike_id_issue_type_resolved_at
(bike_id
,issue_type
,resolved_at
) USING BTREE,
KEY index_bike_issues_on_resolved_at
(resolved_at
)
) ENGINE=InnoDB AUTO_INCREMENT=177614497 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
On Primary
show variables like '%binlog%';
+--------------------------------------------+----------------------+
| Variable_name | Value |
+--------------------------------------------+----------------------+
| binlog_cache_size | 32768 |
| binlog_checksum | NONE |
| binlog_direct_non_transactional_updates | OFF |
| binlog_error_action | IGNORE_ERROR |
| binlog_format | ROW |
| binlog_group_commit_sync_delay | 0 |
| binlog_group_commit_sync_no_delay_count | 0 |
| binlog_gtid_simple_recovery | ON |
| binlog_max_flush_queue_time | 0 |
| binlog_order_commits | ON |
| binlog_row_image | FULL |
| binlog_rows_query_log_events | OFF |
| binlog_stmt_cache_size | 32768 |
| binlog_transaction_dependency_history_size | 25000 |
| binlog_transaction_dependency_tracking | COMMIT_ORDER |
| innodb_api_enable_binlog | OFF |
| innodb_locks_unsafe_for_binlog | OFF |
| log_statements_unsafe_for_binlog | OFF |
| max_binlog_cache_size | 18446744073709547520 |
| max_binlog_size | 134217728 |
| max_binlog_stmt_cache_size | 18446744073709547520 |
| sync_binlog | 2000 |
+--------------------------------------------+----------------------+
mysql> show variables like '%slave%';
+------------------------------+-----------------------+
| Variable_name | Value |
+------------------------------+-----------------------+
| init_slave | |
| log_slave_updates | ON |
| log_slow_slave_statements | ON |
| pseudo_slave_mode | OFF |
| rpl_stop_slave_timeout | 31536000 |
| slave_allow_batching | OFF |
| slave_checkpoint_group | 512 |
| slave_checkpoint_period | 300 |
| slave_compressed_protocol | OFF |
| slave_exec_mode | STRICT |
| slave_load_tmpdir | /rdsdbdata/tmp |
| slave_max_allowed_packet | 1073741824 |
| slave_net_timeout | 60 |
| slave_parallel_type | DATABASE |
| slave_parallel_workers | 0 |
| slave_pending_jobs_size_max | 16777216 |
| slave_preserve_commit_order | OFF |
| slave_rows_search_algorithms | TABLE_SCAN,INDEX_SCAN |
| slave_skip_errors | OFF |
| slave_sql_verify_checksum | ON |
| slave_transaction_retries | 10 |
| slave_type_conversions | |
| sql_slave_skip_counter | 0 |
+------------------------------+-----------------------+
mysql> show variables where variable_name in ('gtid_mode', 'enforce_gtid_consistency', 'innodb_flush_log_at_trx_commit');
+--------------------------------+-------+
| Variable_name | Value |
+--------------------------------+-------+
| enforce_gtid_consistency | ON |
| gtid_mode | ON |
| innodb_flush_log_at_trx_commit | 2 |
+--------------------------------+-------+
On MySQL 8.0 replica
mysql> show variables like '%binlog%';
+------------------------------------------------+----------------------+
| Variable_name | Value |
+------------------------------------------------+----------------------+
| binlog_cache_size | 32768 |
| binlog_checksum | NONE |
| binlog_direct_non_transactional_updates | OFF |
| binlog_encryption | OFF |
| binlog_error_action | ABORT_SERVER |
| binlog_expire_logs_auto_purge | ON |
| binlog_expire_logs_seconds | 2592000 |
| binlog_format | ROW |
| binlog_group_commit_sync_delay | 0 |
| binlog_group_commit_sync_no_delay_count | 0 |
| binlog_gtid_simple_recovery | ON |
| binlog_max_flush_queue_time | 0 |
| binlog_order_commits | ON |
| binlog_rotate_encryption_master_key_at_startup | OFF |
| binlog_row_event_max_size | 8192 |
| binlog_row_image | FULL |
| binlog_row_metadata | MINIMAL |
| binlog_row_value_options | |
| binlog_rows_query_log_events | OFF |
| binlog_stmt_cache_size | 32768 |
| binlog_transaction_compression | OFF |
| binlog_transaction_compression_level_zstd | 3 |
| binlog_transaction_dependency_history_size | 25000 |
| binlog_transaction_dependency_tracking | COMMIT_ORDER |
| innodb_api_enable_binlog | OFF |
| log_statements_unsafe_for_binlog | OFF |
| max_binlog_cache_size | 18446744073709547520 |
| max_binlog_size | 134217728 |
| max_binlog_stmt_cache_size | 18446744073709547520 |
| sync_binlog | 1000 |
+------------------------------------------------+----------------------+
mysql> show variables where variable_name in ('gtid_mode', 'enforce_gtid_consistency', 'innodb_flush_log_at_trx_commit');
+--------------------------------+-------+
| Variable_name | Value |
+--------------------------------+-------+
| enforce_gtid_consistency | ON |
| gtid_mode | ON |
| innodb_flush_log_at_trx_commit | 2 |
+--------------------------------+-------+
mysql> show variables like '%replica%';
+-----------------------------------------------+----------------+
| Variable_name | Value |
+-----------------------------------------------+----------------+
| group_replication_consistency | EVENTUAL |
| init_replica | |
| innodb_replication_delay | 0 |
| log_replica_updates | ON |
| log_slow_replica_statements | OFF |
| pseudo_replica_mode | OFF |
| replica_allow_batching | ON |
| replica_checkpoint_group | 512 |
| replica_checkpoint_period | 300 |
| replica_compressed_protocol | OFF |
| replica_exec_mode | STRICT |
| replica_load_tmpdir | /rdsdbdata/tmp |
| replica_max_allowed_packet | 1073741824 |
| replica_net_timeout | 60 |
| replica_parallel_type | LOGICAL_CLOCK |
| replica_parallel_workers | 32 |
| replica_pending_jobs_size_max | 134217728 |
| replica_preserve_commit_order | ON |
| replica_skip_errors | OFF |
| replica_sql_verify_checksum | ON |
| replica_transaction_retries | 10 |
| replica_type_conversions | |
| replication_optimize_for_static_plugin_config | OFF |
| replication_sender_observe_commit_only | OFF |
| rpl_stop_replica_timeout | 31536000 |
| skip_replica_start | ON |
| sql_replica_skip_counter | 0 |
+-----------------------------------------------+----------------+
Asked by Rahul
(119 rep)
Mar 16, 2024, 01:49 AM
Last activity: Mar 22, 2024, 06:00 AM
Last activity: Mar 22, 2024, 06:00 AM