Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

2 votes

0 answers

56 views

MariaDB master feeds slave's Slave_IO at a low rate

mariadb master-slave-replication mariadb-11

I have a pair of identical servers: Ubuntu 22.04 LTS with MariaDB 11.7.2, 32 CPU cores, 128 GB RAM, 2x2TB NVMe, i210 gigabit Ethernet. Classic master-slave replication (logfile/logpos) is configured. Everything works fine, except that the slave slowly lags behind the master. The slave is in this state:

Slave_SQL_State: Slave has read all relay log; waiting for more updates
       Slave_IO_State: Waiting for master to send event
      Master_Log_File: log-bin.062358
  Read_Master_Log_Pos: 43451522
    . . . . 
Seconds_Behind_Master: 0

The master is in this state:

1201054 repl 1.2.3.4:45678  NULL  Binlog Dump  81402  Writing to net  NULL  0.000

The problem is that the newest master binlog file is log-bin.069669, which is 7300+ files ahead (069669 vs. 062358) at 100MB each. So the slave is 700GB+ behind the master.

There are a LOT of updates on the master, approx. 500MB of binlog per minute. 500MB/min = 4000Mb/min ≈ 67Mb/s, which is far below the available 1Gb/s. The average load on the master is quite low. Ping between the servers is 25ms.

I have changed the binlog_row_image variable from FULL to MINIMAL (advice from https://dba.stackexchange.com/a/289768/7895) with no visible effect.

The only mysterious symptom is that the slave shows zero seconds behind master most of the time, and only occasionally am I lucky enough to see the real lag with show slave status\G.

Has anyone encountered a similar problem? What was the cause, and how did you overcome it?

UPDATE
------
Master:

MariaDB [(none)]> show master status;
+----------------+----------+-----------------+------------------+
| File           | Position | Binlog_Do_DB    | Binlog_Ignore_DB |
+----------------+----------+-----------------+------------------+
| log-bin.073714 | 96274193 | aaa,bbb,ccc,ddd |                  |
+----------------+----------+-----------------+------------------+
1 row in set (0.000 sec)

Slave:

MariaDB [(none)]> show all slaves status \G
*************************** 1. row ***************************
               Connection_name:
               Slave_SQL_State: Slave has read all relay log; waiting for more updates
                Slave_IO_State: Waiting for master to send event
                   Master_Host: 1.2.3.4
                   Master_User: repl
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: log-bin.065969
           Read_Master_Log_Pos: 65381018
                Relay_Log_File: relay-bin.012836
                 Relay_Log_Pos: 6879217
         Relay_Master_Log_File: log-bin.065969
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
               Replicate_Do_DB:
           Replicate_Ignore_DB:
            Replicate_Do_Table:
        Replicate_Ignore_Table:
       Replicate_Wild_Do_Table:
   Replicate_Wild_Ignore_Table:
                    Last_Errno: 0
                    Last_Error:
                  Skip_Counter: 0
           Exec_Master_Log_Pos: 65345606
               Relay_Log_Space: 18336490
               Until_Condition: None
                Until_Log_File:
                 Until_Log_Pos: 0
            Master_SSL_Allowed: Yes
            Master_SSL_CA_File:
            Master_SSL_CA_Path:
               Master_SSL_Cert:
             Master_SSL_Cipher:
                Master_SSL_Key:
         Seconds_Behind_Master: 0
 Master_SSL_Verify_Server_Cert: Yes
                 Last_IO_Errno: 0
                 Last_IO_Error:
                Last_SQL_Errno: 0
                Last_SQL_Error:
   Replicate_Ignore_Server_Ids:
              Master_Server_Id: 7
                Master_SSL_Crl:
            Master_SSL_Crlpath:
                    Using_Gtid: Slave_Pos
                   Gtid_IO_Pos: 0-7-1737276444
       Replicate_Do_Domain_Ids:
   Replicate_Ignore_Domain_Ids:
                 Parallel_Mode: conservative
                     SQL_Delay: 0
           SQL_Remaining_Delay: NULL
       Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
              Slave_DDL_Groups: 1
Slave_Non_Transactional_Groups: 0
    Slave_Transactional_Groups: 33122121
          Replicate_Rewrite_DB:
          Retried_transactions: 1
            Max_relay_log_size: 1073741824
          Executed_log_entries: 142786513
     Slave_received_heartbeats: 0
        Slave_heartbeat_period: 30.000
                Gtid_Slave_Pos: 0-7-1737276444
        Master_last_event_time: 2025-06-06 19:43:36
         Slave_last_event_time: 2025-06-06 19:43:36
        Master_Slave_time_diff: 0
1 row in set (0.001 sec)

MariaDB [(none)]> show global variables like '%parallel%';
+-------------------------------+--------------+
| Variable_name                 | Value        |
+-------------------------------+--------------+
| slave_domain_parallel_threads | 0            |
| slave_parallel_max_queued     | 131072       |
| slave_parallel_mode           | conservative |
| slave_parallel_threads        | 4            |
| slave_parallel_workers        | 4            |
+-------------------------------+--------------+
5 rows in set (0.001 sec)
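
As a rough cross-check of the figures quoted above, the backlog and the required throughput can be recomputed from the two status outputs. This is only a sketch of the arithmetic, not a diagnosis: the function names are made up for illustration, decimal units (1 MB = 10^6 bytes) are assumed, and the last function assumes the whole stream travels over a single TCP connection with the 25ms round trip mentioned in the question.

```python
# Figures taken from the question; decimal units are an assumption.
BINLOG_FILE_MB = 100          # "100MB each"
BINLOG_RATE_MB_PER_MIN = 500  # "approx 500MB of binlog per minute"
RTT_SECONDS = 0.025           # "Ping between servers is 25ms"


def binlog_index(name: str) -> int:
    """Return the numeric suffix of a binlog file name such as 'log-bin.073714'."""
    return int(name.rsplit(".", 1)[1])


def backlog_mb(master_file: str, slave_file: str, file_mb: int = BINLOG_FILE_MB) -> int:
    """Approximate backlog in MB, counting whole binlog files only."""
    return (binlog_index(master_file) - binlog_index(slave_file)) * file_mb


def rate_mbit_per_s(mb_per_min: float) -> float:
    """Convert a binlog write rate in MB/min to Mbit/s."""
    return mb_per_min * 8 / 60


def window_bytes(mbit_per_s: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to sustain the rate."""
    return mbit_per_s * 1_000_000 / 8 * rtt_s


if __name__ == "__main__":
    # File names from SHOW MASTER STATUS and SHOW ALL SLAVES STATUS above.
    lag = backlog_mb("log-bin.073714", "log-bin.065969")
    rate = rate_mbit_per_s(BINLOG_RATE_MB_PER_MIN)
    print(f"backlog: {lag / 1000:.1f} GB")
    print(f"binlog write rate: {rate:.1f} Mbit/s")
    print(f"in-flight data needed at 25ms RTT: {window_bytes(rate, RTT_SECONDS) / 1024:.0f} KiB")
```

With the values above, this gives a backlog of about 774.5 GB and a write rate of about 66.7 Mbit/s, consistent with the estimates in the question. Keeping up with that rate over a 25ms round trip would need roughly 203 KiB in flight on the connection.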

UPDATE #2
---------
When I run stop slave; start slave; the lag decreases for a while, but then everything reverts. A steady decrease in lag was achieved only when replication was restarted every minute or so.

I think this proves that the VPN is the cause of the problem. I'm not a VPN guy; can anyone at least give me some keywords to google?

UPDATE #3
---------
I started a netcat listener on the slave host:

    nc -l 23456 | pv -r -b > /dev/null

and a feeder on the master side:

    cat /mnt/repl/log-bin.*[0-9] | pv -L 10M -s 100G -S -r -b | nc -4 1.2.3.4 23456

I got a stable 10MB/s stream, so I'm now confident that it isn't a networking issue.

Kondybas (4800 rep)

Jun 7, 2025, 10:26 AM • Last activity: Jun 12, 2025, 07:27 AM

1 votes

2 answers

129 views

MariaDB 11.7.2 does not rotate/purge binlog chunks (SOLVED)

replication mariadb mariadb-11

MariaDB 11.7.2 is configured to keep binlogs for 10800 seconds. `RESET MASTER;` was performed. No slaves are attached, and the server is not part of a Galera cluster. Yet binlog expiration does not happen: binlog files are created but never purged. When I try to purge old files manually, I get the following:

    MariaDB [(none)]> purge master logs to 'log-bin.000003';
    Query OK, 0 rows affected, 1 warning (0.002 sec)
    MariaDB [(none)]> show warnings;
    +-------+------+-----------------------------------------------------------------------------------+
    | Level | Code | Message                                                                           |
    +-------+------+-----------------------------------------------------------------------------------+
    | Note  | 1375 | Binary log 'log-bin.000001' is not purged because it is the current active binlog |
    +-------+------+-----------------------------------------------------------------------------------+
    1 row in set (0.000 sec)

At the current rate of operation, the disk will fill up within 48 hours. A periodic RESET MASTER; purges the binlogs completely, but replication isn't possible under those circumstances.

I have little experience with MariaDB 11.x; maybe I have missed something important in the my.cnf inherited from the 10.x version.

Something similar was found as a fixed bug: https://lists.mariadb.org/hyperkitty/list/commits@lists.mariadb.org/thread/2FXXPRGD7FMXEPPMVE763O7N6HDUANAB/   

Any hints, suggestions, advice, or even hypotheses are welcome.

----------
----------
----------

UPDATE  
Since MariaDB 11.4, a new variable has been introduced:

     slave_connections_needed_for_purge
If set to 1 (the default), automatic binlog purging is disabled until at least one slave IO connection is established. Until then, binlog_expire_logs_seconds is completely ignored.

Kondybas (4800 rep)

May 16, 2025, 03:57 PM • Last activity: May 28, 2025, 05:46 PM
