Resync mysql replica in a GTID setup gives error HA_ERR_FOUND_DUPP_KEY with AUTOPOSITION

0 votes
0 answers
44 views
For years I have used a procedure based on this page to sync a new replica or resync a broken one, without issue. It uses xtrabackup to back up and stream the MySQL data to the slave, applies the redo log, and restarts the slave with MASTER_AUTO_POSITION. Simple and effective. xtrabackup is convenient for us given the size of the databases we have to transfer; we invoke it like this:

xtrabackup --backup --stream=xbstream --parallel=$((NB_PROC/2))

Versions: mysql 5.7.42, xtrabackup 2.4.29

Lately, when I start the slave, I get this error:
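Roughly, the whole transfer looks like the sketch below. The hostname `replica-host`, the staging directory `/data/backup`, and the service name are placeholders, not our real values:

```shell
# Sketch of the xtrabackup-based resync; paths and hostnames are placeholders.
NB_PROC=$(nproc)

# 1. On the master: stream a consistent backup to the replica.
xtrabackup --backup --stream=xbstream --parallel=$((NB_PROC/2)) \
  | ssh replica-host "xbstream -x -C /data/backup"

# 2. On the replica: apply the redo log so the data files are consistent.
xtrabackup --prepare --target-dir=/data/backup

# 3. Move the prepared files into the (empty) datadir and restart MySQL.
systemctl stop mysql
xtrabackup --move-back --target-dir=/data/backup --datadir=/var/lib/mysql
chown -R mysql:mysql /var/lib/mysql
systemctl start mysql
```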
Last_Errno: 1062
                   Last_Error: Could not execute Write_rows event on table db_2.webhook_message_status; Duplicate entry '3304591' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.001615, end_log_pos 5021
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 4692
              Relay_Log_Space: 143514642
...
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 1062
               Last_SQL_Error: Could not execute Write_rows event on table db_2.webhook_message_status; Duplicate entry '3304591' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.001615, end_log_pos 5021
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 189087
                  Master_UUID: 144c0164-3223-11ef-8319-74563c5c838d
             Master_Info_File: mysql.slave_master_info
Looking on the slave, I confirm the entry with id 3304591 is already there:
+---------+---------------------+---------+--------+-----------------------------+---------------------+
| id      | message_external_id | site_id | status | context                     | created_at          |
+---------+---------------------+---------+--------+-----------------------------+---------------------+
| 3304591 | xxxxxxxxxxxxxxxxxxx |     483 | read   | {"code":200,"title":"read"} | 2024-09-19 07:52:00 |
+---------+---------------------+---------+--------+-----------------------------+---------------------+
So it seems the slave no longer knows how to position itself properly. As a workaround, I took the content of xtrabackup_binlog_info:
mysql-bin.001615        73610932        144c0164-3223-11ef-8319-74563c5c838d:1-14071690
and ran this:
mysql> RESET MASTER;
mysql> SET GLOBAL GTID_PURGED="144c0164-3223-11ef-8319-74563c5c838d:1-14071690";
mysql> START SLAVE;
and now it works:
Master_Server_Id: 189087
                  Master_UUID: 144c0164-3223-11ef-8319-74563c5c838d
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
...
           Retrieved_Gtid_Set: 144c0164-3223-11ef-8319-74563c5c838d:14047201-14248907
            Executed_Gtid_Set: 144c0164-3223-11ef-8319-74563c5c838d:1-14248907
                Auto_Position: 1
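One rough way I can check whether the replica really has all the master's data is to compare GTID sets with GTID_SUBSET(), pasting the master's executed set (the one shown above) into a query on the replica:

```sql
-- On the master:
mysql> SELECT @@global.gtid_executed;
-- On the replica: returns 1 if the replica has executed
-- everything contained in the master's set.
mysql> SELECT GTID_SUBSET('144c0164-3223-11ef-8319-74563c5c838d:1-14248907',
                          @@global.gtid_executed);
```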
As an alternative solution, I use pt-slave-restart to skip all errors with id 1062, so after a while the slave is eventually in sync. Do you have an idea what could be the cause of this problem? It worked fine for the last 3 years without any issue, and we did not change the major version of MySQL or of any tool involved. Is the workaround I'm doing fine? Do I have all the master data on the slave? Best
Asked by Baptiste Mille-Mathias (130 rep)
Sep 19, 2024, 09:55 AM
Last activity: Sep 25, 2024, 09:37 AM