Database Administrators
Q&A for database professionals who wish to improve their database skills
Latest Questions
0
votes
1
answers
37
views
(SOLVED) Source and Replica in a MariaDB 'pair' do not start exchanging data
On Linux Slackware PCs, I have a 'Source' database on one PC, and a 'Replica' on the other. On the Source end (11.4.7-MariaDB) I've programmed ID=1. On the Replica end (10.11.6-MariaDB), I set ID=11 and set the Source IP. I've started them, and they seem to like each other, as the Show replica statu...
On Linux Slackware PCs, I have a 'Source' database on one PC, and a 'Replica' on the other.
On the Source end (11.4.7-MariaDB) I've programmed ID=1.
On the Replica end (10.11.6-MariaDB), I set ID=11 and set the Source IP.
I've started them, and they seem to like each other, as the Show replica status reports everything as Ok, no errors, comms ok, etc...
BUT, when I create a new table on the 'Source' machine, nothing happens on the 'Replica' - no new table is created.
And, after I manually create the table on the Replica, and I insert a record on the Source, no data transfer happens either. No pointers are incremented, seemingly no data is transferred. And no errors seem to be appearing in the logs. Here are the status reports for Source and Replica.
I despair. I consulted AI, and after several sessions the AI was in the same state as I was... repeatedly asking me to insert new records at the Source and check at the other end, perhaps hoping something would repair itself, and suggesting extra complications such as adding SSH, GTID and other optional features.
What can I do to get the show on the road? I've reinstalled MariaDB on the Replica. I'm somewhat reluctant to change the DB at the Source, as it's in use. I've checked permissions on files and directories... Last session ended at 04:30 AM this morning.
**source** status:
show master status\G
*************************** 1. row ***************************
File: mysql-bin.000020
Position: 328
Binlog_Do_DB: homeorptemp, farma1, test_replication
Binlog_Ignore_DB:
1 row in set (0.000 sec)
**replica** status:
show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.102
Master_User: my_repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000020
Read_Master_Log_Pos: 328
Relay_Log_File: mysql-relay-bin.000007
Relay_Log_Pos: 627
Relay_Master_Log_File: mysql-bin.000020
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 328
Relay_Log_Space: 1235
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: No
Gtid_IO_Pos:
Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
Parallel_Mode: optimistic
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Slave_DDL_Groups: 0
Slave_Non_Transactional_Groups: 0
Slave_Transactional_Groups: 0
Replicate_Rewrite_DB:
1 row in set (0.001 sec)
**EDIT**
my.cnf at the replica:
[client]
[mysqld]
port = 3306
socket = /var/run/mysql/mysql.sock
skip-external-locking
bind-address = 0.0.0.0
key_buffer_size = 256M
max_allowed_packet = 32M
table_open_cache = 1024
sort_buffer_size = 2M
net_buffer_length = 8K
read_buffer_size = 512K
read_rnd_buffer_size = 1M
skip-networking = OFF
server-id = 11
log-bin = mysql-bin
binlog_format = mixed
innodb_buffer_pool_size = 512M
innodb_log_file_size = 256M
read_only = OFF
log_warnings = 1
log_slave_updates = ON
[mysqldump]
quick
max_allowed_packet = 32M
[mysql]
no-auto-rehash
[myisamchk]
show processlist;
+----+-------------+-----------+------+-----------+-------+--------------------------------------------------------+------------------+----------+
| Id | User | Host | db | Command | Time | State | Info | Progress |
+----+-------------+-----------+------+-----------+-------+--------------------------------------------------------+------------------+----------+
| 5 | system user | | NULL | Slave_IO | 17725 | Waiting for master to send event | NULL | 0.000 |
| 6 | system user | | NULL | Slave_SQL | 17724 | Slave has read all relay log; waiting for more updates | NULL | 0.000 |
| 8 | root | localhost | NULL | Query | 0 | starting | show processlist | 0.000 |
+----+-------------+-----------+------+-----------+-------+--------------------------------------------------------+------------------+----------+
3 rows in set (0.000 sec)
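One thing worth checking (an assumption based on the symptoms, not a confirmed diagnosis): with `Binlog_Do_DB` set on the Source, statement-format events are filtered by the session's *default* database, so DDL/DML issued without first selecting one of the listed databases (e.g. from a client with no `USE`, or with a different default database and a qualified table name) is never written to the binlog at all. That would match both positions sitting still at 328 with no errors anywhere. A quick probe, run on the Source:

```sql
-- Run on the Source. binlog_do_db filters statement-format events by the
-- session's default database, so select one of the listed databases first.
USE test_replication;
CREATE TABLE repl_probe (id INT PRIMARY KEY);
INSERT INTO repl_probe VALUES (1);
SHOW MASTER STATUS;  -- Position should now have advanced past 328
```

If the position advances and `repl_probe` appears on the Replica, replication itself is fine and the earlier tables were simply filtered out of the binlog by how they were created.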
jcoppens
(101 rep)
Aug 5, 2025, 03:32 PM
• Last activity: Aug 6, 2025, 04:55 PM
0
votes
1
answers
160
views
MySQL Master Slave Asynchronous GTID Replication Failover/Switchover over ProxySQL with little downtime
**This is a production setup and I have to perform a failover from Master A to new potential master B.** **MySQL Version is 5.7 and CentOS 7 is the OS.** I have a simple master slave setup with A as Master and B,C,D,E,F as async slaves. ProxySQL Cluster is also configured. Master A (writer in ProxyS...
**This is a production setup and I have to perform a failover from Master A to new potential master B.**
**MySQL Version is 5.7 and CentOS 7 is the OS.**
I have a simple master slave setup with A as Master and B,C,D,E,F as async slaves. ProxySQL Cluster is also configured.
Master A (writer in ProxySQL) and Slaves B and C (readers in ProxySQL) are added to ProxySQL.
On all the slaves **log_slave_updates** is **ON** and **log_bin** is also **ON**.
On **Master A** **read_only=0** and **Slaves B,C,D,E,F** have **read_only=1**.
This is a **GTID**-based replication with **master_auto_position = 0**.
**A --> B,C,D,E,F** -- this is the current replication topology
Now, I know for failover I have to choose either slave B or slave C because they are on the proxy. I choose B as my new potential master. So, I have to move all the slaves from current Master A to new Master B. The replication topology will then be like this
**B --> A,C,D,E,F** -- proposed new topology
**My question is how do I move the slaves from Master A to new potential Master B?**
**How should I use the coordinates from new Master B? What procedure should I follow?**
Once I move the slaves I guess failover will be easy: I just tweak the **read_only** flag on the old and new master and the proxy will take care of the connections.
I would really appreciate it if a detailed procedure with an explanation is provided.
I have gone through the following links but couldn't understand the process
https://stackoverflow.com/questions/43960577/mysql-failover-how-to-choose-slave-as-new-master
https://serverfault.com/questions/303669/how-to-actually-use-mysql-slave-as-soon-the-master-is-failover-or-got-burnt
https://dev.mysql.com/doc/refman/8.0/en/replication-solutions-switch.html
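Since GTID is enabled, the usual approach (a sketch, assuming `gtid_mode=ON` on all hosts and the replication account existing on B; hostnames and the account name here are placeholders) is to let auto positioning find the coordinates instead of reading a binlog file/position off B. After B is fully caught up with A, on each of C, D, E, F (and later on A itself):

```sql
-- Sketch: run on each slave being repointed (C, D, E, F, and later A).
-- With MASTER_AUTO_POSITION=1 the slave requests exactly the GTIDs it is
-- missing, so no binlog file/position from B is needed.
STOP SLAVE;
CHANGE MASTER TO
    MASTER_HOST = 'B_host_or_ip',   -- hypothetical hostname
    MASTER_USER = 'repl_user',      -- existing replication account
    MASTER_PASSWORD = '...',
    MASTER_AUTO_POSITION = 1;
START SLAVE;
SHOW SLAVE STATUS\G   -- verify Executed_Gtid_Set converges and no errors
```

Note that you currently run with `master_auto_position = 0`; switching to `1` is safe only if each slave's GTID history is complete, which `log_slave_updates = ON` on every host helps guarantee. Wait for each slave to be in sync with A before the `CHANGE MASTER`.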
Avinash Pawar
(216 rep)
Aug 3, 2023, 07:15 PM
• Last activity: Jul 24, 2025, 01:00 PM
0
votes
1
answers
148
views
How to setup Mysql master-slave replication with the slave meant for testing where divergence is OK
Problem: I have a staging DB server on which functionality is tested before pushing it to the production environment. Now, we want to start doing VA/PT (Vulnerability Analysis and Penetration Testing) on our application, but since that can be destructive on the staging DB server, we want to make a s...
Problem:
I have a staging DB server on which functionality is tested before pushing it to the production environment. Now, we want to start doing VA/PT (Vulnerability Analysis and Penetration Testing) on our application, but since that can be destructive on the staging DB server, we want to make a separate Testing environment with a VAPT web+DB server.
Requirements:
1. The data from the staging DB server must be replicated onto the VAPT DB server automatically so that specific new use cases, features, etc can be tested for vulnerabilities.
2. Due to VAPT activities (testing data, SQL Injection, possibly DROP TABLE exploits, etc) the VAPT DB server itself will also have its own data changes, i.e. divergence from Staging DB (Master)
So, if I use simple Master-Slave replication as below I am assured of #1:
Staging DB (Master) -> VAPT DB (Slave)
But if I do #2, the slave will eventually diverge, which is fine for the testing environment, but, will it interrupt or mess with the Master-Slave replication as per #1?
An obvious example where divergence will cause errors is a VA/PT activity that runs DROP TABLE users: subsequent INSERTs/UPDATEs on the Staging DB (Master) users table will then cause replication errors on the slave. Some UPDATEs/DELETEs might cause errors too.
In particular:
If I use ROW-based replication, divergence will surface quickly, causing frequent errors.
If I use STATEMENT-based replication, since ids will not match, it is possible that some data will break, because ids are essential to link data in related tables even though we do not use foreign keys.
Alternatively, instead of replication, I could **manually dump the Staging DB into the VAPT DB daily**, which would be cumbersome to automate.
OR,
I could make copies of the DBs and set up various partial copy operations, but that would complicate matters too much, given that I am not a developer and that my developers often make and revert changes of various sizes at random.
*EDIT: The data directory varies between 20-25 GB on Staging*
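If occasional replica-side errors are acceptable, one common (imperfect) workaround is to tell the slave to skip exactly the error classes divergence produces, so replication keeps flowing past them. A minimal config sketch, assuming a MySQL slave used only for VAPT:

```ini
# Sketch for the VAPT slave's my.cnf (divergence is accepted by design).
[mysqld]
read_only = OFF                       # testers may write locally
slave-skip-errors = 1032,1062,1146    # row not found, duplicate key, missing table
```

The slave still drifts silently, so a periodic rebuild from a staging dump (e.g. weekly) is usually paired with this to reset the drift.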
Surely someone has come across this problem in their work, and there might be a set of best practices for this situation, i.e. maintaining a match between staging and testing environments in real time while allowing the freedom to play with the data.
I tried googling for a while but the right phrasing escapes me. All I get is howtos for master-slave replication, handling unwanted drift/divergence and so on. Nothing much about desired/accepted drift and divergence, or partial replication.
Thanks in advance.
site80443
(119 rep)
Apr 30, 2021, 03:13 PM
• Last activity: Jul 14, 2025, 12:04 PM
0
votes
1
answers
200
views
Switch to master auto position from master_log_file and master_log_pos MySQL 5.7.44
MySQL Server - 5.7.44 I have a master and three slaves running on MySQL 5.7.44. The GTID is enabled and I am using gtid_enforce_consistency. I took a backup from a slave and built a fourth slave using `xtrabackup`. Following is the command I used to backup the data. xtrabackup --defaults-file=/etc/m...
MySQL Server - 5.7.44
I have a master and three slaves running on MySQL 5.7.44.
The GTID is enabled and I am using gtid_enforce_consistency.
I took a backup from a slave and built a fourth slave using xtrabackup. The following is the command I used to back up the data.
xtrabackup --defaults-file=/etc/my.cnf --user="bkp_user" --password="password" --port=3306 --socket=/var/lib/mysql/mysql.sock --backup --use-memory=2048MB --tmpdir=/tmp/
After copying the backup to the data directory, I applied the logs:
xtrabackup --apply-log --use-memory=10G .
Once the logs were applied and I had changed the ownership of the files, I started the new slave. I added skip-slave-start to the config file before starting it.
I, then, used the following command to connect the slave to the server from where I had taken backup.
change master to master_host='IP', MASTER_USER='replica', MASTER_PASSWORD='password', MASTER_AUTO_POSITION=1;
I kept receiving Duplicate entry (1062) errors. I skipped one GTID and started replication, but it kept happening and I kept getting the 1062 error.
LAST_ERROR_MESSAGE: Worker 1 failed executing transaction '8e36a7e9-4bb6-11ea-9989-0aab56f5ae8e:4967477654' at master log binlog.000454, end_log_pos 638; Could not execute Write_rows event on table db.table; Duplicate entry '97193728' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 638
LAST_ERROR_TIMESTAMP: 2024-11-22 22:38:18
I had to switch back to BINLOG POSITION based replication.
STOP SLAVE;
RESET SLAVE ALL;
change master to master_host='IP', MASTER_USER='replica', MASTER_PASSWORD='password', MASTER_LOG_FILE='binlog.000454', MASTER_LOG_POS=772253840;
And replication is working fine. I am out of ideas: what did I miss? Also, how do I safely switch back to MASTER_AUTO_POSITION?
This link gives a simple procedure to do so, but I'm really not sure about it and haven't tried it yet
https://thedataguy.in/mysql-convert-binlog-based-replication-to-gtid-replication-without-downtime/
This Percona blog talks about doing that on MariaDB; I'm reading and searching further
https://www.percona.com/blog/convert-mariadb-binary-log-file-and-position-based-replication-to-gtid-replication/
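A likely cause of the 1062 errors (an assumption based on the symptom, not a verified diagnosis) is that the restored slave's `gtid_executed` was empty after the xtrabackup restore, so `MASTER_AUTO_POSITION = 1` made it re-request transactions already present in the data files. The usual fix is to seed `gtid_purged` from the `xtrabackup_binlog_info` file in the backup directory before enabling auto positioning. A sketch with a placeholder GTID set:

```sql
-- Sketch: seed the restored slave's GTID history, then switch to auto position.
-- The GTID set below is a placeholder; take the real one from the
-- xtrabackup_binlog_info file in the backup directory.
STOP SLAVE;
RESET MASTER;   -- clears gtid_executed on this slave only (required before SET gtid_purged)
SET GLOBAL gtid_purged = '8e36a7e9-4bb6-11ea-9989-0aab56f5ae8e:1-4967477653';
CHANGE MASTER TO MASTER_AUTO_POSITION = 1;
START SLAVE;
```

Since binlog-position replication is currently working, `gtid_executed` should now be accumulating correctly (replicated transactions keep their GTIDs even under file/position replication), so stopping the slave while it is in sync and issuing only the `CHANGE MASTER TO MASTER_AUTO_POSITION = 1` may also suffice.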
Avinash Pawar
(216 rep)
Dec 1, 2024, 12:03 AM
• Last activity: Jul 13, 2025, 01:04 AM
0
votes
1
answers
162
views
MySQL NDB Cluster Community - High memory utilization by mysqld only on source/replica nodes
I have two MySQL NDB community clusters 8.0.30, each with 3 VMs Cluster#1_VM1: mysqld + ndb_mgmd Cluster#1_VM2: mysqld + ndb Cluster#1_VM3: mysqld + ndb Cluster#2_VM1: mysqld + ndb_mgmd Cluster#2_VM2: mysqld + ndb Cluster#2_VM3: mysqld + ndb I have bidirectional replication between Cluster#1_VM2 and...
I have two MySQL NDB community clusters 8.0.30, each with 3 VMs
Cluster#1_VM1: mysqld + ndb_mgmd
Cluster#1_VM2: mysqld + ndb
Cluster#1_VM3: mysqld + ndb
Cluster#2_VM1: mysqld + ndb_mgmd
Cluster#2_VM2: mysqld + ndb
Cluster#2_VM3: mysqld + ndb
I have bidirectional replication between Cluster#1_VM2 and Cluster#2_VM2.
VM2/3 have identical parameters.
The choice of such an architecture might not be perfect, but I don't think it's relevant here.
The mysqld process consumes a lot of memory on the VMs with replication.
And replication seems to be the only difference between those VMs.
(Though I'm not quite sure how load is distributed among the mysqld processes.)
I would assume that ndbd would be the one using most of the RAM, since it has 8 GB of DataMemory. But somehow mysqld is utilizing a lot of RAM too.
I've checked several other questions here (e.g. https://dba.stackexchange.com/questions/286946/high-memory-usage-on-replica-mysql-8 )
Most of them are related to innodb and the queries for e.g. buffer utilization show nothing that could show me where that high memory utilization comes from.
So far I've checked:
- buffer utilization for InnoDB (I have NDB, I know)
- engine status for InnoDB and NDB - no hint of high memory usage, or at least not clearly visible
All I know for now is that it grows over time after a restart of the process.
But I don't know yet how to limit/control the amount of memory consumed by mysqld in this case.
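One way to see where mysqld itself is holding memory (a sketch; the Performance Schema memory instrumentation is enabled by default in 8.0) is the global memory summary. On the replication hosts, replication- and NDB-binlog-related instruments would surface here if they are the culprit:

```sql
-- Top in-process memory consumers inside mysqld (MySQL 8.0).
SELECT event_name,
       ROUND(current_number_of_bytes_used / 1024 / 1024, 1) AS mb_used
FROM performance_schema.memory_summary_global_by_event_name
ORDER BY current_number_of_bytes_used DESC
LIMIT 10;
```

Comparing this output between a replicating VM and a non-replicating one should narrow down which allocation grows over time.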
Marcin Stolzmann
(1 rep)
Sep 14, 2023, 11:34 AM
• Last activity: Jul 12, 2025, 08:02 AM
0
votes
1
answers
169
views
GCP Cloudsql Mysql Replica unresponsive after mass delete on Master
Directed here from [S/O][2] We have a Master/Replica configuration for Mysql innodb(5.7.32) databases in Cloud SQL... We have a single Table (let's call it Master table) partitioned on Two keys having both primary and non-clustered indexes... It's a row-based replication with automatic disk increase...
Directed here from S/O
We have a Master/Replica configuration for Mysql innodb(5.7.32) databases in Cloud SQL... We have a single Table (let's call it Master table) partitioned on Two keys having both primary and non-clustered indexes... It's a row-based replication with automatic disk increase on both instances... It's not a HA configuration so it's not a failover replica...
*What we're trying to do...* We're trying to purge the master table back to N number of days... This is done for multiple reasons so let's say this is a client requirement...
*What's the issue...* Whenever we're purging the master table it just stalls the replica, it deletes a certain number of rows on the replica and then just passes out... The number of records in a single purge is around 5 million rows...
It's a totally repeatable issue... we know it's caused by the row-based, sequential replication, so:
*What we've tried so far...*
1. Increasing the size of the replica: we've given it 104 GB RAM, but the lag doesn't go away...
2. Restarting the replica
3. RESET SLAVE
4. Trying to enable parallel replication https://cloud.google.com/sql/docs/mysql/replication/manage-replicas#configuring-parallel-replication ... every single time I tried this it failed with an 'Unknown error occurred'...
5. Trying to switch to statement-based replication with the SET binlog_format="STATEMENT" command, but the "root" user doesn't have the privilege and gets an 'access denied' error...
Now the question... *what am I missing in my:*
1. explanation
2. mysql configuration
3. method
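A likely mitigation (an assumption: the stall comes from one multi-million-row transaction that the single-threaded replica applier must replay in one go) is purging in small committed batches, so the replica sees a stream of small transactions it can keep up with. A sketch with hypothetical table and column names:

```sql
-- Sketch: batched purge on the master; each DELETE commits as its own small
-- transaction, so the row-based replica never receives a 5M-row event group.
-- Table and column names are hypothetical.
DELETE FROM master_table
WHERE event_date < NOW() - INTERVAL 30 DAY
ORDER BY event_date
LIMIT 10000;
-- Re-run (e.g. from a scheduler or loop) until ROW_COUNT() returns 0,
-- pausing a second or two between batches to let the replica catch up.
```

Batching also keeps undo log growth and lock durations on the master bounded, which is a separate win from the replication lag.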
Thanks
Faraz Beg
(11 rep)
May 6, 2021, 09:16 PM
• Last activity: Jul 9, 2025, 12:05 AM
2
votes
2
answers
2438
views
Importing a large mysql dump while replication is running
So we have a simple master/slave mysql replication running between two CentOS Servers. The master has multiple databases. eg.. Database1 Database2 Database3 The issue is we have a mysql dumpfile of a new 60GB database (Database4). What's the best way to import Database4 without breaking replication?...
So we have a simple master/slave MySQL replication running between two CentOS servers.
The master has multiple databases, e.g.:
Database1
Database2
Database3
The issue is we have a mysql dumpfile of a new 60GB database (Database4).
What's the best way to import Database4 without breaking replication?
I was thinking we could stop replication, and import the mysqldump onto both master and slave. Then restart replication, but was hoping there was an alternate way that would minimize downtime.
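One alternative (a sketch, assuming the dump touches only the new Database4 and you have the SUPER privilege on both servers) is to import the dump on master and slave in parallel with binary logging disabled for the import session, so the 60 GB never has to flow through replication at all and the existing databases keep replicating throughout:

```sql
-- Run inside a mysql client session on BOTH master and slave, same dump file.
-- sql_log_bin=0 keeps the import out of the binlog (needs SUPER);
-- SOURCE is a mysql client command, not server SQL.
SET SESSION sql_log_bin = 0;
CREATE DATABASE Database4;
USE Database4;
SOURCE /path/to/database4_dump.sql;   -- hypothetical dump path
SET SESSION sql_log_bin = 1;
```

If the dump was taken with mysqldump, first check it doesn't contain statements for other databases (e.g. a stray USE), since those would then silently diverge between the two servers.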
user125340
(21 rep)
May 22, 2017, 09:58 PM
• Last activity: Jun 25, 2025, 06:08 PM
0
votes
1
answers
198
views
What will happen if master goes down in peer to peer replication?
In case of Peer to peer replication, if Master goes down, will the other two nodes continue synchronizing or not? (diagram: https://i.sstatic.net/c8J6I.jpg)
In case of Peer to peer replication, if Master goes down, will the other two nodes continue synchronizing or not?
(topology diagram: https://i.sstatic.net/c8J6I.jpg)
Rauf Asadov
(1313 rep)
Jun 21, 2019, 10:49 AM
• Last activity: Jun 23, 2025, 01:03 PM
1
votes
1
answers
1378
views
PostgreSQL - Replica Recovery Process Stuck frequently (epoll_wait taking too much time)
We have Master Slave Setup with Asynchronous Streaming Replication (Replication Slots). Frequently our Replica's entire Postgres processes seems to be doing nothing (or stuck?). Since replication_slots is used for replication, this has been affecting the master with pg_xlog getting backlogged. postg...
We have a Master Slave Setup with Asynchronous Streaming Replication (using replication slots). Frequently our replica's Postgres processes all seem to be doing nothing (or stuck?). Since replication slots are used for replication, this has been affecting the master, with pg_xlog getting backlogged.
postgres=# select version();
version
--------------------------------------------------------------------------------------------------------------------------------------
PostgreSQL 9.6.11 on x86_64-pc-linux-gnu (Ubuntu 9.6.11-1.pgdg18.04+1), compiled by gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0, 64-bit
(1 row)
As per the Slave postgres log, the last received WAL is processed and the next restartpoint is not starting. No logs after this - it's simply stuck. After restarting, the replica starts to catch up, but it gets stuck again after some time.
2018-11-23 12:47:26.625 UTC LOG: restartpoint starting: xlog
2018-11-23 12:49:48.258 UTC LOG: restartpoint complete: wrote 10057 buffers (10.3%); 0 transaction log file(s) added, 198 removed, 0 recycled; write=130.568 s, sync=0.702 s, total=141.633 s; sync files=1874, longest=0.205 s, average=0.000 s; distance=5435211 kB, estimate=5435211 kB
2018-11-23 12:49:48.258 UTC LOG: recovery restart point at 1143/8B16D800
2018-11-23 12:49:48.258 UTC DETAIL: last completed transaction was at log time 2018-11-23 07:00:19.510771+00
2018-11-23 12:51:04.092 UTC LOG: restartpoint starting: xlog
2018-11-23 12:55:34.744 UTC LOG: restartpoint complete: wrote 3086 buffers (3.2%); 0 transaction log file(s) added, 120 removed, 212 recycled; write=269.875 s, sync=0.029 s, total=270.652 s; sync files=4955, longest=0.000 s, average=0.000 s; distance=3238852 kB, estimate=5215575 kB
2018-11-23 12:55:34.744 UTC LOG: recovery restart point at 1144/50C5EA00
2018-11-23 12:55:34.744 UTC DETAIL: last completed transaction was at log time 2018-11-23 07:04:51.819643+00
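To tell "caught up and idle" apart from "genuinely stuck", comparing the receive and replay positions on the replica can help; a sketch using the 9.6 function names (these were renamed to the `*_wal_*`/`*_lsn` forms in PostgreSQL 10):

```sql
-- On the replica (PostgreSQL 9.6 names). If 'received' keeps advancing while
-- 'replayed' stays put, the startup (recovery) process is stuck; if both are
-- frozen, the walreceiver side is the problem.
SELECT pg_last_xlog_receive_location() AS received,
       pg_last_xlog_replay_location()  AS replayed,
       now() - pg_last_xact_replay_timestamp() AS replay_delay;
```

Running this a few times during a stall shows which of the two processes has stopped making progress, which also matches what the perf trace below suggests.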
I analysed a trace of the slave Postgres recovery process during both active and stuck modes, and it seems that during stuck mode the recovery process is doing very minimal activity (or is stuck).
**perf trace -p pid of the Slave Recovery Process**
? ( ? ): ... [continued]: epoll_wait()) = 0
0.048 ( 0.023 ms): close(fd: 992 ) = 0
0.100 ( 0.022 ms): read(fd: 8, buf: 0x7fff2ec83e07, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
0.146 ( 0.024 ms): epoll_create1(flags: 524288 ) = 992
0.192 ( 0.025 ms): epoll_ctl(epfd: 992, op: ADD, fd: 12, event: 0x7fff2ec83d7c) = 0
0.240 ( 0.023 ms): epoll_ctl(epfd: 992, op: ADD, fd: 8, event: 0x7fff2ec83d7c) = 0
0.284 ( 0.020 ms): gettimeofday(tv: 0x7fff2ec83d50 ) = 0
0.323 (5003.804 ms): epoll_wait(epfd: 992, events: 0x55fb9372ae08, maxevents: 1, timeout: 5000) = 0
5004.184 ( 0.023 ms): close(fd: 992 ) = 0
5004.233 ( 0.023 ms): read(fd: 8, buf: 0x7fff2ec83e07, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
5004.281 ( 0.025 ms): epoll_create1(flags: 524288 ) = 992
5004.329 ( 0.025 ms): epoll_ctl(epfd: 992, op: ADD, fd: 12, event: 0x7fff2ec83d7c) = 0
5004.374 ( 0.021 ms): epoll_ctl(epfd: 992, op: ADD, fd: 8, event: 0x7fff2ec83d7c) = 0
5004.416 ( 0.033 ms): gettimeofday(tv: 0x7fff2ec83d50 ) = 0
5004.465 (5005.223 ms): epoll_wait(epfd: 992, events: 0x55fb9372ae08, maxevents: 1, timeout: 5000) = 0
10009.736 ( 0.025 ms): close(fd: 992 ) = 0
10009.788 ( 0.024 ms): read(fd: 8, buf: 0x7fff2ec83e07, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
10009.838 ( 0.025 ms): epoll_create1(flags: 524288 ) = 992
10009.887 ( 0.075 ms): epoll_ctl(epfd: 992, op: ADD, fd: 12, event: 0x7fff2ec83d7c) = 0
10010.004 ( 0.025 ms): epoll_ctl(epfd: 992, op: ADD, fd: 8, event: 0x7fff2ec83d7c) = 0
10010.114 ( 0.032 ms): gettimeofday(tv: 0x7fff2ec83d50 ) = 0
10010.205 (5005.241 ms): epoll_wait(epfd: 992, events: 0x55fb9372ae08, maxevents: 1, timeout: 5000) = 0
15015.499 ( 0.024 ms): close(fd: 992 ) = 0
15015.554 ( 0.024 ms): read(fd: 8, buf: 0x7fff2ec83e07, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
15015.604 ( 0.028 ms): epoll_create1(flags: 524288 ) = 992
15015.692 ( 0.104 ms): epoll_ctl(epfd: 992, op: ADD, fd: 12, event: 0x7fff2ec83d7c) = 0
15015.824 ( 0.023 ms): epoll_ctl(epfd: 992, op: ADD, fd: 8, event: 0x7fff2ec83d7c) = 0
15015.871 ( 0.025 ms): gettimeofday(tv: 0x7fff2ec83d50 ) = 0
15015.919 (5000.618 ms): epoll_wait(epfd: 992, events: 0x55fb9372ae08, maxevents: 1, timeout: 5000) = 0
20016.588 ( 0.022 ms): close(fd: 992 ) = 0
20016.641 ( 0.024 ms): read(fd: 8, buf: 0x7fff2ec83e07, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
20016.691 ( 0.027 ms): epoll_create1(flags: 524288 ) = 992
20016.744 ( 0.022 ms): epoll_ctl(epfd: 992, op: ADD, fd: 12, event: 0x7fff2ec83d7c) = 0
20016.787 ( 0.021 ms): epoll_ctl(epfd: 992, op: ADD, fd: 8, event: 0x7fff2ec83d7c) = 0
20016.831 ( 0.020 ms): gettimeofday(tv: 0x7fff2ec83d50 ) = 0
20016.867 (5005.231 ms): epoll_wait(epfd: 992, events: 0x55fb9372ae08, maxevents: 1, timeout: 5000) = 0
25022.142 ( 0.023 ms): close(fd: 992 ) = 0
25022.196 ( 0.024 ms): read(fd: 8, buf: 0x7fff2ec83e07, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
25022.246 ( 0.028 ms): epoll_create1(flags: 524288 ) = 992
25022.296 ( 0.023 ms): epoll_ctl(epfd: 992, op: ADD, fd: 12, event: 0x7fff2ec83d7c) = 0
25022.340 ( 0.023 ms): epoll_ctl(epfd: 992, op: ADD, fd: 8, event: 0x7fff2ec83d7c) = 0
25022.388 ( 0.020 ms): gettimeofday(tv: 0x7fff2ec83d50 ) = 0
25022.428 (5005.228 ms): epoll_wait(epfd: 992, events: 0x55fb9372ae08, maxevents: 1, timeout: 5000) = 0
30027.705 ( 0.025 ms): close(fd: 992 ) = 0
30027.761 ( 0.025 ms): read(fd: 8, buf: 0x7fff2ec83e07, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
30027.811 ( 0.028 ms): epoll_create1(flags: 524288 ) = 992
30027.864 ( 0.022 ms): epoll_ctl(epfd: 992, op: ADD, fd: 12, event: 0x7fff2ec83d7c) = 0
30027.907 ( 0.020 ms): epoll_ctl(epfd: 992, op: ADD, fd: 8, event: 0x7fff2ec83d7c) = 0
30027.949 ( 0.022 ms): gettimeofday(tv: 0x7fff2ec83d50 ) = 0
30027.988 (5005.233 ms): epoll_wait(epfd: 992, events: 0x55fb9372ae08, maxevents: 1, timeout: 5000) = 0
35033.275 ( 0.025 ms): close(fd: 992 ) = 0
35033.333 ( 0.024 ms): read(fd: 8, buf: 0x7fff2ec83e07, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
35033.382 ( 0.029 ms): epoll_create1(flags: 524288 ) = 992
35033.435 ( 0.077 ms): epoll_ctl(epfd: 992, op: ADD, fd: 12, event: 0x7fff2ec83d7c) = 0
35033.588 ( 0.046 ms): epoll_ctl(epfd: 992, op: ADD, fd: 8, event: 0x7fff2ec83d7c) = 0
35033.663 ( 0.023 ms): gettimeofday(tv: 0x7fff2ec83d50 ) = 0
35033.762 (5005.255 ms): epoll_wait(epfd: 992, events: 0x55fb9372ae08, maxevents: 1, timeout: 5000) = 0
40039.070 ( 0.026 ms): close(fd: 992 ) = 0
40039.129 ( 0.025 ms): read(fd: 8, buf: 0x7fff2ec83e07, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
40039.179 ( 0.031 ms): epoll_create1(flags: 524288 ) = 992
40039.233 ( 0.078 ms): epoll_ctl(epfd: 992, op: ADD, fd: 12, event: 0x7fff2ec83d7c) = 0
40039.384 ( 0.047 ms): epoll_ctl(epfd: 992, op: ADD, fd: 8, event: 0x7fff2ec83d7c) = 0
40039.464 ( 0.023 ms): gettimeofday(tv: 0x7fff2ec83d50 ) = 0
40039.562 (5005.258 ms): epoll_wait(epfd: 992, events: 0x55fb9372ae08, maxevents: 1, timeout: 5000) = 0
45044.876 ( 0.025 ms): close(fd: 992 ) = 0
45044.933 ( 0.024 ms): read(fd: 8, buf: 0x7fff2ec83e07, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
45044.984 ( 0.029 ms): epoll_create1(flags: 524288 ) = 992
45045.078 ( 0.118 ms): epoll_ctl(epfd: 992, op: ADD, fd: 12, event: 0x7fff2ec83d7c) = 0
45045.225 ( 0.091 ms): epoll_ctl(epfd: 992, op: ADD, fd: 8, event: 0x7fff2ec83d7c) = 0
45045.350 ( 0.027 ms): gettimeofday(tv: 0x7fff2ec83d50 ) = 0
45045.441 (3034.775 ms): epoll_wait(epfd: 992, events: 0x55fb9372ae08, maxevents: 1, timeout: 5000) = -1 EINTR Interrupted system call
48080.277 ( 0.028 ms): write(fd: 13, buf: 0x7fff2ec83737, count: 1 ) = 1
48080.369 ( 0.021 ms): gettimeofday(tv: 0x7fff2ec83d60 ) = 0
48080.413 ( 0.024 ms): close(fd: 992 ) = 0
48080.468 ( 0.023 ms): read(fd: 8, buf: 0x7fff2ec83e07, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
48080.515 ( 0.023 ms): gettimeofday(tv: 0x7fff2ec83e00 ) = 0
48080.561 ( 0.024 ms): lseek(fd: 3, offset: 2490368, whence: SET) = 2490368
48080.607 ( 0.036 ms): read(fd: 3, buf: 0x55fb9371ca98, count: 8192) = 8192
48080.675 ( 0.024 ms): read(fd: 8, buf: 0x7fff2ec84457, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
48080.729 ( 0.045 ms): lseek(fd: 652, whence: END) = 300539904
48080.798 ( 0.025 ms): lseek(fd: 652, whence: END) = 300539904
48080.895 ( 0.053 ms): kill(pid: 11330 (postgres), sig: USR1 ) = 0
48081.211 ( 0.132 ms): write(fd: 652, buf: 0x7f5d2e0cc500, count: 8192) = 8192
48081.377 ( 0.025 ms): lseek(fd: 3, offset: 2498560, whence: SET) = 2498560
48081.425 ( 0.032 ms): read(fd: 3, buf: 0x55fb9371ca98, count: 8192) = 8192
48081.498 ( 0.043 ms): read(fd: 8, buf: 0x7fff2ec84457, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
48081.576 ( 0.024 ms): lseek(fd: 652, whence: END) = 300548096
48081.624 ( 0.024 ms): lseek(fd: 652, whence: END) = 300548096
48081.687 ( 0.122 ms): write(fd: 652, buf: 0x7f5d2e0ce500, count: 8192) = 8192
48081.853 ( 0.025 ms): lseek(fd: 3, offset: 2506752, whence: SET) = 2506752
48081.902 ( 0.041 ms): read(fd: 3, buf: 0x55fb9371ca98, count: 8192) = 8192
48081.985 ( 0.025 ms): read(fd: 8, buf: 0x7fff2ec84457, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
48082.036 ( 0.049 ms): lseek(fd: 652, whence: END) = 300556288
48082.117 ( 0.038 ms): lseek(fd: 652, whence: END) = 300556288
48082.193 ( 0.064 ms): write(fd: 652, buf: 0x7f5d2e0d0500, count: 8192) = 8192
48082.286 ( 0.024 ms): lseek(fd: 3, offset: 2514944, whence: SET) = 2514944
48082.353 ( 0.039 ms): read(fd: 3, buf: 0x55fb9371ca98, count: 8192) = 8192
48082.419 ( 0.024 ms): read(fd: 8, buf: 0x7fff2ec84457, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
48082.467 ( 0.024 ms): lseek(fd: 652, whence: END) = 300564480
48082.540 ( 0.034 ms): lseek(fd: 652, whence: END) = 300564480
48082.613 ( 0.057 ms): write(fd: 652, buf: 0x7f5d2e0d2500, count: 8192) = 8192
48082.723 ( 0.044 ms): read(fd: 8, buf: 0x7fff2ec84457, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
48082.801 ( 0.040 ms): lseek(fd: 652, whence: END) = 300572672
48082.874 ( 0.025 ms): lseek(fd: 652, whence: END) = 300572672
48082.936 ( 0.061 ms): write(fd: 652, buf: 0x7f5d2e0d4500, count: 8192) = 8192
48083.026 ( 0.024 ms): lseek(fd: 3, offset: 2523136, whence: SET) = 2523136
48083.073 ( 0.054 ms): read(fd: 3, buf: 0x55fb9371ca98, count: 8192) = 8192
48083.156 ( 0.040 ms): read(fd: 8, buf: 0x7fff2ec84457, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
48083.229 ( 0.033 ms): lseek(fd: 652, whence: END) = 300580864
48083.295 ( 0.038 ms): lseek(fd: 652, whence: END) = 300580864
48083.385 ( 0.085 ms): write(fd: 652, buf: 0x7f5d2e0d6500, count: 8192) = 8192
48083.507 ( 0.045 ms): lseek(fd: 3, offset: 2531328, whence: SET) = 2531328
48083.597 ( 0.039 ms): read(fd: 3, buf: 0x55fb9371ca98, count: 8192) = 8192
48083.674 ( 0.033 ms): read(fd: 8, buf: 0x7fff2ec84457, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
48083.754 ( 0.041 ms): lseek(fd: 652, whence: END) = 300589056
48083.839 ( 0.048 ms): lseek(fd: 652, whence: END) = 300589056
48083.955 ( 0.086 ms): write(fd: 652, buf: 0x7f5d2e0d8500, count: 8192) = 8192
48084.078 ( 0.096 ms): lseek(fd: 3, offset: 2539520, whence: SET) = 2539520
48084.213 ( 0.071 ms): read(fd: 3, buf: 0x55fb9371ca98, count: 8192) = 8192
48084.355 ( 0.041 ms): read(fd: 8, buf: 0x7fff2ec84457, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
48084.443 ( 0.052 ms): lseek(fd: 652, whence: END) = 300597248
48084.528 ( 0.056 ms): lseek(fd: 652, whence: END) = 300597248
48084.642 ( 0.103 ms): write(fd: 652, buf: 0x7f5d2e0da500, count: 8192) = 8192
48084.809 ( 0.048 ms): lseek(fd: 3, offset: 2547712, whence: SET) = 2547712
48084.901 ( 0.048 ms): read(fd: 3, buf: 0x55fb9371ca98, count: 8192) = 8192
48085.015 ( 0.036 ms): read(fd: 8, buf: 0x7fff2ec84457, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
48085.095 ( 0.032 ms): lseek(fd: 652, whence: END) = 300605440
48085.164 ( 0.041 ms): lseek(fd: 652, whence: END) = 300605440
48085.266 ( 0.103 ms): write(fd: 652, buf: 0x7f5d2e0dc500, count: 8192) = 8192
48085.419 ( 0.047 ms): lseek(fd: 3, offset: 2555904, whence: SET) = 2555904
48085.505 ( 0.064 ms): read(fd: 3, buf: 0x55fb9371ca98, count: 8192) = 8192
48085.612 ( 0.036 ms): read(fd: 8, buf: 0x7fff2ec84457, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
48085.707 ( 0.055 ms): lseek(fd: 652, whence: END) = 300613632
48085.805 ( 0.052 ms): lseek(fd: 652, whence: END) = 300613632
48085.893 ( 0.093 ms): write(fd: 652, buf: 0x7f5d2e0de500, count: 8192) = 8192
48086.027 ( 0.048 ms): lseek(fd: 3, offset: 2564096, whence: SET) = 2564096
48086.120 ( 0.071 ms): read(fd: 3, buf: 0x55fb9371ca98, count: 8192) = 8192
48086.247 ( 0.048 ms): read(fd: 8, buf: 0x7fff2ec84457, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
48086.342 ( 0.040 ms): lseek(fd: 652, whence: END) = 300621824
48086.421 ( 0.041 ms): lseek(fd: 652, whence: END) = 300621824
48086.499 ( 0.103 ms): write(fd: 652, buf: 0x7f5d2e0e0500, count: 8192) = 8192
48086.641 ( 0.053 ms): lseek(fd: 3, offset: 2572288, whence: SET) = 2572288
48086.740 ( 0.060 ms): read(fd: 3, buf: 0x55fb9371ca98, count: 8192) = 8192
48086.871 ( 0.041 ms): read(fd: 8, buf: 0x7fff2ec84457, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
48086.966 ( 0.026 ms): lseek(fd: 652, whence: END) = 300630016
48087.025 ( 0.031 ms): lseek(fd: 652, whence: END) = 300630016
48087.135 ( 0.084 ms): write(fd: 652, buf: 0x7f5d2e0e2500, count: 8192) = 8192
48087.281 ( 0.055 ms): lseek(fd: 3, offset: 2580480, whence: SET) = 2580480
48087.392 ( 0.061 ms): read(fd: 3, buf: 0x55fb9371ca98, count: 8192) = 8192
48087.516 ( 0.040 ms): read(fd: 8, buf: 0x7fff2ec84457, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
48087.594 ( 0.043 ms): lseek(fd: 652, whence: END) = 300638208
48087.669 ( 0.048 ms): lseek(fd: 652, whence: END) = 300638208
48087.768 ( 0.102 ms): write(fd: 652, buf: 0x7f5d2e0e4500, count: 8192) = 8192
48087.915 ( 0.057 ms): lseek(fd: 3, offset: 2588672, whence: SET) = 2588672
48088.026 ( 0.059 ms): read(fd: 3, buf: 0x55fb9371ca98, count: 8192) = 8192
48088.156 ( 0.041 ms): read(fd: 8, buf: 0x7fff2ec84457, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
48088.244 ( 0.032 ms): lseek(fd: 652, whence: END) = 300646400
48088.321 ( 0.054 ms): lseek(fd: 652, whence: END) = 300646400
48088.413 ( 0.102 ms): write(fd: 652, buf: 0x7f5d2e0e6500, count: 8192) = 8192
48088.554 ( 0.046 ms): lseek(fd: 3, offset: 2596864, whence: SET) = 2596864
48088.655 ( 0.061 ms): read(fd: 3, buf: 0x55fb9371ca98, count: 8192) = 8192
48088.779 ( 0.041 ms): read(fd: 8, buf: 0x7fff2ec84457, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
48088.865 ( 0.056 ms): lseek(fd: 652, whence: END) = 300654592
48088.954 ( 0.036 ms): lseek(fd: 652, whence: END) = 300654592
48089.049 ( 0.050 ms): write(fd: 652, buf: 0x7f5d2e0e8500, count: 8192) = 8192
48089.153 ( 0.037 ms): read(fd: 8, buf: 0x7fff2ec84457, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
48089.261 ( 0.049 ms): lseek(fd: 652, whence: END) = 300662784
48089.355 ( 0.071 ms): lseek(fd: 652, whence: END) = 300662784
48089.463 ( 0.083 ms): write(fd: 652, buf: 0x7f5d2e0ea500, count: 8192) = 8192
48089.616 ( 0.054 ms): lseek(fd: 3, offset: 2605056, whence: SET) = 2605056
48089.724 ( 0.064 ms): read(fd: 3, buf: 0x55fb9371ca98, count: 8192) = 8192
48089.847 ( 0.044 ms): read(fd: 8, buf: 0x7fff2ec84457, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
48089.951 ( 0.051 ms): lseek(fd: 652, whence: END) = 300670976
48090.046 ( 0.044 ms): lseek(fd: 652, whence: END) = 300670976
48090.150 ( 0.089 ms): write(fd: 652, buf: 0x7f5d2e0ec500, count: 8192) = 8192
48090.297 ( 0.045 ms): lseek(fd: 3, offset: 2613248, whence: SET) = 2613248
48090.393 ( 0.039 ms): read(fd: 3, buf: 0x55fb9371ca98, count: 8192) = 8192
48090.483 ( 0.038 ms): read(fd: 8, buf: 0x7fff2ec84457, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
48090.574 ( 0.056 ms): lseek(fd: 652, whence: END) = 300679168
48090.672 ( 0.046 ms): lseek(fd: 652, whence: END) = 300679168
48090.786 ( 0.182 ms): write(fd: 652, buf: 0x7f5d2e0ee500, count: 8192) = 8192
48091.030 ( 0.059 ms): epoll_create1(flags: 524288 ) = 992
48091.135 ( 0.056 ms): epoll_ctl(epfd: 992, op: ADD, fd: 12, event: 0x7fff2ec83d7c) = 0
48091.235 ( 0.056 ms): epoll_ctl(epfd: 992, op: ADD, fd: 8, event: 0x7fff2ec83d7c) = 0
48091.336 ( 0.051 ms): gettimeofday(tv: 0x7fff2ec83d50 ) = 0
48091.431 ( 0.055 ms): epoll_wait(epfd: 992, events: 0x55fb9372ae08, maxevents: 1, timeout: 5000) = 1
48091.542 ( 0.034 ms): read(fd: 12, buf: 0x7fff2ec83d70, count: 16 ) = 1
48091.629 ( 0.044 ms): gettimeofday(tv: 0x7fff2ec83d60 ) = 0
48091.717 (5005.261 ms): epoll_wait(epfd: 992, events: 0x55fb9372ae08, maxevents: 1, timeout: 5000) = 0
53097.027 ( 0.023 ms): close(fd: 992 ) = 0
53097.079 ( 0.024 ms): read(fd: 8, buf: 0x7fff2ec83e07, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
53097.127 ( 0.026 ms): epoll_create1(flags: 524288 ) = 992
53097.177 ( 0.025 ms): epoll_ctl(epfd: 992, op: ADD, fd: 12, event: 0x7fff2ec83d7c) = 0
53097.222 ( 0.021 ms): epoll_ctl(epfd: 992, op: ADD, fd: 8, event: 0x7fff2ec83d7c) = 0
53097.264 ( 0.033 ms): gettimeofday(tv: 0x7fff2ec83d50 ) = 0
53097.313 (5005.226 ms): epoll_wait(epfd: 992, events: 0x55fb9372ae08, maxevents: 1, timeout: 5000) = 0
58102.587 ( 0.022 ms): close(fd: 992 ) = 0
58102.638 ( 0.030 ms): read(fd: 8, buf: 0x7fff2ec83e07, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
58102.690 ( 0.024 ms): epoll_create1(flags: 524288 ) = 992
58102.746 ( 0.025 ms): epoll_ctl(epfd: 992, op: ADD, fd: 12, event: 0x7fff2ec83d7c) = 0
58102.795 ( 0.033 ms): epoll_ctl(epfd: 992, op: ADD, fd: 8, event: 0x7fff2ec83d7c) = 0
58102.850 ( 0.022 ms): gettimeofday(tv: 0x7fff2ec83d50 ) = 0
58102.905 (5005.237 ms): epoll_wait(epfd: 992, events: 0x55fb9372ae08, maxevents: 1, timeout: 5000) = 0
63108.190 ( 0.023 ms): close(fd: 992 ) = 0
63108.243 ( 0.025 ms): read(fd: 8, buf: 0x7fff2ec83e07, count: 1 ) = -1 EAGAIN Resource temporarily unavailable
63108.293 ( 0.026 ms): epoll_create1(flags: 524288 ) = 992
63108.342 ( 0.027 ms): epoll_ctl(epfd: 992, op: ADD, fd: 12, event: 0x7fff2ec83d7c) = 0
63108.389 ( 0.021 ms): epoll_ctl(epfd: 992, op: ADD, fd: 8, event: 0x7fff2ec83d7c) = 0
63108.431 ( 0.032 ms): gettimeofday(tv: 0x7fff2ec83d50 ) = 0
**It seems that epoll_wait is taking most of the time of each operation.** Is there anything that needs to be done to address that? Or is there any other detailed stack trace that needs to be analysed?
**Infrastructure**
- AWS EC2 r4.large
- ZFS FileSystem
- RAID0 - 8 x 17GB
- Number of databases - 130+
- Heavy ETL Process
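One reading of the trace worth noting: the `epoll_wait(..., timeout: 5000) = 0` calls taking ~5005 ms each returned zero events, i.e. the thread blocked for its full 5-second timeout because nothing arrived. That is an idle wait loop, not a slow syscall. A minimal, portable sketch of the same semantics (using `select()`, which behaves like `epoll_wait` here; the pipe and timeout are illustrative):

```python
import os, select, time

# A pipe that is never written to: select() has nothing ready, so it
# blocks for its full timeout and reports no events -- the same shape as
# the trace's `epoll_wait(..., timeout: 5000) = 0` lines taking 5005 ms.
r, w = os.pipe()

start = time.monotonic()
rlist, wlist, xlist = select.select([r], [], [], 0.2)  # 200 ms timeout
elapsed = time.monotonic() - start

print(rlist == [])      # True: no I/O event arrived
print(elapsed >= 0.19)  # True: the call slept for (almost) the full timeout
os.close(r)
os.close(w)
```

So the 5-second waits point to a thread that is polling for work; the actual I/O cost is in the read/lseek/write sequences above them.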
The Coder
(299 rep)
Nov 23, 2018, 12:42 PM
• Last activity: Jun 18, 2025, 02:01 AM
2
votes
0
answers
56
views
MariaDB master feeds slave's Slave_IO with a low rate
I have a pair of identical servers: Ubuntu 22.04 LTS with MariaDB 11.7.2. 32 CPU cores, 128 GB RAM, 2x2TB NVME, i210 gigabit ethernet A classic master-slave replication (logfile/logpos) configured. Everything work pretty fine except slave slowly lagging behind master. Slave is in the state: ``` Slav...
I have a pair of identical servers: Ubuntu 22.04 LTS with MariaDB 11.7.2.
32 CPU cores, 128 GB RAM, 2x2TB NVME, i210 gigabit ethernet
A classic master-slave replication (logfile/logpos) is configured.
Everything works pretty fine, except that the slave is slowly lagging behind the master.
Slave is in the state:
Slave_SQL_State: Slave has read all relay log; waiting for more updates
Slave_IO_State: Waiting for master to send event
Master_Log_File: log-bin.062358
Read_Master_Log_Pos: 43451522
. . . .
Seconds_Behind_Master: 0
Master is in the state:
1201054 repl 1.2.3.4:45678 NULL Binlog Dump 81402 Writing to net NULL 0.000
The problem is that the newest master binlog file is log-bin.069669
- 7200+ chunks ahead, 100MB each. So the slave is 700GB+ behind the master.
There are a LOT of updates on the master, approx 500MB of binlog per minute.
500MB/min = 4000Mb/min ≈ 70Mb/s, which is way lower than the available 1Gb/s.
AVG load on the master is quite low. Ping between the servers is 25ms. I have changed the binlog_row_image
variable from FULL to MINIMAL (advice from here - https://dba.stackexchange.com/a/289768/7895 ) with no visible effect.
The only mystic symptom is that the slave shows zero seconds behind master most of the time, and sometimes I'm lucky enough to see the real lag with show slave status\G.
Has anyone encountered a similar problem? What was the cause, and how did you overcome it?
UPDATE
------
Master:
MariaDB [(none)]> show master status;
+----------------+----------+-----------------+------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+----------------+----------+-----------------+------------------+
| log-bin.073714 | 96274193 | aaa,bbb,ccc,ddd | |
+----------------+----------+-----------------+------------------+
1 row in set (0.000 sec)
Slave:
MariaDB [(none)]> show all slaves status \G
*************************** 1. row ***************************
Connection_name:
Slave_SQL_State: Slave has read all relay log; waiting for more updates
Slave_IO_State: Waiting for master to send event
Master_Host: 1.2.3.4
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: log-bin.065969
Read_Master_Log_Pos: 65381018
Relay_Log_File: relay-bin.012836
Relay_Log_Pos: 6879217
Relay_Master_Log_File: log-bin.065969
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 65345606
Relay_Log_Space: 18336490
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: Yes
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: Yes
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 7
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: Slave_Pos
Gtid_IO_Pos: 0-7-1737276444
Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
Parallel_Mode: conservative
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Slave_DDL_Groups: 1
Slave_Non_Transactional_Groups: 0
Slave_Transactional_Groups: 33122121
Replicate_Rewrite_DB:
Retried_transactions: 1
Max_relay_log_size: 1073741824
Executed_log_entries: 142786513
Slave_received_heartbeats: 0
Slave_heartbeat_period: 30.000
Gtid_Slave_Pos: 0-7-1737276444
Master_last_event_time: 2025-06-06 19:43:36
Slave_last_event_time: 2025-06-06 19:43:36
Master_Slave_time_diff: 0
1 row in set (0.001 sec)
MariaDB [(none)]> show global variables like '%parallel%';
+-------------------------------+--------------+
| Variable_name | Value |
+-------------------------------+--------------+
| slave_domain_parallel_threads | 0 |
| slave_parallel_max_queued | 131072 |
| slave_parallel_mode | conservative |
| slave_parallel_threads | 4 |
| slave_parallel_workers | 4 |
+-------------------------------+--------------+
5 rows in set (0.001 sec)
UPDATE #2
---------
When I run stop slave; start slave;
the lag starts decreasing for some time, but then everything goes back to how it was. A steady lag decrease was only achieved when replication was restarted every minute or so.
I think this proves that the VPN is the cause of the problem. I'm not a VPN guy; can anyone at least give me some keywords to google?
UPDATE #3
---------
I have started a netcat listener on the slave host:
nc -l 23456 | pv -r -b > /dev/null
and a feeder on the master side:
cat /mnt/repl/log-bin.*[0-9] | pv -L 10M -s 100G -S -r -b | nc -4 1.2.3.4 23456
I got a stable 10MB/s stream, so I'm confident now it isn't a networking issue.
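As a sanity check, the backlog implied by the binlog file numbers in the question can be computed directly (the file numbers, 100 MB chunk size, and 500 MB/min production rate are taken from the question; this is just arithmetic, not a diagnosis):

```python
# Back-of-the-envelope replication backlog from the numbers in the question.
newest_master_binlog = 69669   # log-bin.069669 on the master
slave_reading_binlog = 62358   # log-bin.062358 being read by the slave
chunk_mb = 100                 # ~100 MB per binlog file

backlog_files = newest_master_binlog - slave_reading_binlog
backlog_gb = backlog_files * chunk_mb / 1024
print(backlog_files, round(backlog_gb))  # 7311 714  -> ~714 GB behind

# The master produces ~500 MB of binlog per minute; to ever catch up,
# the slave must stream AND apply faster than this sustained rate:
produce_mbit_s = 500 * 8 / 60
print(round(produce_mbit_s))             # 67  (Mbit/s)
```

The numbers match the "700GB+ behind" estimate in the question and show the bottleneck must be apply speed or the transport path, not raw link capacity.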

Kondybas
(4800 rep)
Jun 7, 2025, 10:26 AM
• Last activity: Jun 12, 2025, 07:27 AM
0
votes
0
answers
23
views
Infinite Loop in SQL Server Merge Replication When Using Delete + Insert for Detail Records
I have a typical master-detail table relationship (e.g., Orders and OrderDetails). My application updates the detail records by: 1. Deleting all existing detail records for a given master record. 2. Inserting the new detail records. This update pattern is used because it's easier to replace all deta...
I have a typical master-detail table relationship (e.g., Orders and OrderDetails).
My application updates the detail records by:
1. Deleting all existing detail records for a given master record.
2. Inserting the new detail records.
This update pattern is used because it's easier to replace all details than track row-level changes.
I want the subscriber to be able to update records on the publisher too, which is why I am using Merge Replication.
The Problem:
Once Merge Replication is enabled:
The DELETE + INSERT operations at the Publisher get replicated to the Subscriber.
The Subscriber, having already synced the previous detail set, treats this as a conflict or change, and re-applies its own version.
The replication engine then thinks the Publisher has changed again and pushes the data back.
This cycle continues infinitely, causing either:
Endless insertions of the same data, or
Continuous merge conflict resolution.
The only way to stop it is to delete the subscription, which is not sustainable.
Questions:
Is this an expected behavior in Merge Replication when using Delete-Insert update logic?
Is there a recommended best practice for updating detail records in Merge Replication scenarios to avoid this loop?
Would using UPDATE statements instead of DELETE + INSERT be a reliable workaround?
Are there settings (e.g., conflict resolution rules, rowguid column issues, or tracking settings) that can help avoid this behavior?
Environment:
SQL Server 2017
Merge Replication with bi-directional sync (Publisher ↔ Subscriber)
Replicated tables have rowguid and conflict tracking enabled
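One way to make the UPDATE-instead-of-DELETE+INSERT workaround practical is to diff the incoming detail set against the stored rows and issue only the minimal changes, so the merge agent sees far fewer row versions. A hypothetical, language-agnostic sketch of that diff (the helper name and row shapes are made up for illustration):

```python
# Hypothetical sketch: instead of DELETE-all + INSERT-all, compute the
# minimal set of row changes for a master's detail records. Only the rows
# in to_delete / to_insert / to_update would then hit the database.
def diff_details(existing: dict, incoming: dict):
    """Rows keyed by a stable business key (not an identity column)."""
    to_delete = [k for k in existing if k not in incoming]
    to_insert = [k for k in incoming if k not in existing]
    to_update = [k for k in incoming
                 if k in existing and existing[k] != incoming[k]]
    return to_delete, to_insert, to_update

old = {1: ("widget", 5), 2: ("gadget", 3)}
new = {2: ("gadget", 4), 3: ("gizmo", 1)}
print(diff_details(old, new))  # ([1], [3], [2])
```

Unchanged rows generate no statements at all, which is what prevents the replication engine from treating every sync as a full rewrite of the detail set.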
Filbert Umbawa
(1 rep)
Jun 12, 2025, 02:26 AM
• Last activity: Jun 12, 2025, 04:31 AM
0
votes
1
answers
891
views
High Memory Usage on Replica - Mysql 8
Hope you're doing well. We've recently moved to Mysql 8 (from Mysql 5.7). We're using AWS RDS Service with MySql. The AWS Team is also looking for a solution btw. Since a few days, we're facing several problem with replication. We never met these problems before with the 5.7 version. First, we had a...
Hope you're doing well.
We've recently moved to Mysql 8 (from Mysql 5.7).
We're using AWS RDS Service with MySql. The AWS Team is also looking for a solution btw.
For the past few days, we've been facing several problems with replication. We never encountered these problems with the 5.7 version.
First, we had a very significant lag between our master and our two replicas.
This was solved with specific parameters in the replica's parameter group, such as:
- slave_parallel_type in LOGICAL_CLOCK mode
- binlog_group_commit_sync_delay with a 10ms delay
- sync_binlog set to 0
It seems that the latency is now gone, which is good news (I'll wait several days to be sure).
Nevertheless, we're still facing a massive problem with the RAM used on the replica.
We can't find the cause; perhaps it's the buffer pool, but that should be capped at a threshold.
It's a cyclic pattern like this:
In green, the production.
In orange, the small replica.
In blue, the most powerful replica (as you can see, we tried to upgrade the instance, but it's not a RAM limitation problem).
The problem is the same for both: memory usage keeps increasing until the replica goes down.
By the way, swap is never used.
If you have any clue with this, it will help me a lot!
Thanks for reading!
Have a nice day :)
--- UPDATE ---
It seems that the buffer is full, but I don't know with what.
Is there any garbage collector running? It seems not; perhaps something needs to be turned on with MySQL 8?
Here are some queries showing the size of the allocated memory, with a zoom on the InnoDB memory breakdown. I have 12 GB on this replica, and it must be full if the free memory keeps decreasing:
> SELECT SUBSTRING_INDEX(event_name,'/',2) AS
> code_area, FORMAT_BYTES(SUM(current_alloc))
> AS current_alloc
> FROM sys.x$memory_global_by_current_bytes
> GROUP BY SUBSTRING_INDEX(event_name,'/',2)
> ORDER BY SUM(current_alloc) DESC;
> SELECT *
> FROM sys.x$memory_global_by_current_bytes
> WHERE SUBSTRING_INDEX(event_name,'/',2) = "memory/innodb"
> ORDER BY current_alloc DESC
> LIMIT 10;



NicolasCab
(1 rep)
Mar 13, 2021, 09:33 AM
• Last activity: Jun 2, 2025, 09:03 AM
0
votes
1
answers
1607
views
find error log for streaming replication
Recently the streaming replication fails after I restart the primary. I can not find the record using `select * from pg_stat_replication` Now, I want to see it from the system log file at `/var/log/postgresql#` on the primary. I find the following using `grep -i -m 10 **fatal** postgresql-12-main.lo...
Recently the streaming replication fails after I restart the primary. I can not find the record using
select * from pg_stat_replication
Now, I want to see it from the system log file at /var/log/postgresql#
on the primary.
I find the following using grep -i -m 10 fatal postgresql-12-main.log:
2022-12-06 01:56:01.890 UTC FATAL: remaining connection slots are reserved for non-replication superuser connections
2022-12-05 11:36:57.024 UTC LOG: background worker "logical replication launcher" (PID 27089) exited with exit code 1
2022-12-05 11:36:57.637 UTC rep_user@[unknown] LOG: standby "12/main" is now a synchronous standby with priority 1
2022-12-06 01:55:27.344 UTC FATAL: remaining connection slots are reserved for non-replication superuser connections
I have tried grep-ing some other keywords, like replication, slave, the client's IP address, and physical, but I do not find anything.
BTW, I have not turned on the Postgres log file, so there is no file at /var/lib/postgresql/12/main/log.
There are no errors on the replica side. I have grepped keywords like fatal and error. Nothing.
What keywords can I grep to find errors related to replication?
thanks
padjee
(337 rep)
Dec 6, 2022, 11:49 AM
• Last activity: Jun 2, 2025, 01:01 AM
1
votes
1
answers
249
views
Postgresql streaming replication slow on macOS
I am using PostgreSQL 10.1 on MAC on which I am trying to set up streaming replication. I configured both master and slave to be on the same machine. I find the streaming replication lag to be slower than expected on mac. The same test runs on a Linux Ubuntu 16.04 machine without much lag. I have th...
I am using PostgreSQL 10.1 on macOS, on which I am trying to set up streaming replication. I configured both master and slave to be on the same machine. I find the streaming replication lag to be higher than expected on macOS. The same test runs on a Linux Ubuntu 16.04 machine without much lag.
I have the following insert script.
for i in $(seq 1 1 1000)
do
bin/psql postgres -p 8999 -c "Insert into $1 select tz, $i * 127361::bigint, $i::real, random()*12696::bigint from generate_series('01-01-2018'::timestamptz, '02-01-2018'::timestamptz, '30 sec'::interval)tz;"
echo $i
done
The lag is measured using the following queries,
SELECT pg_last_wal_receive_lsn() - pg_last_wal_replay_lsn();
SELECT (extract(epoch FROM now()) - extract(epoch FROM pg_last_xact_replay_timestamp()))::int;
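For context on what the first query returns: pg_lsn values such as '0/3000000' are 64-bit positions in the WAL byte stream, printed as two hex halves, so subtracting two of them yields a lag in bytes. A small sketch of that arithmetic with made-up sample LSNs:

```python
# PostgreSQL LSNs are 64-bit WAL byte positions printed as 'HI/LO' in hex.
# Subtracting two LSNs (as the first lag query does) gives a byte distance.
def lsn_to_bytes(lsn: str) -> int:
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

received = lsn_to_bytes("0/3000000")   # e.g. pg_last_wal_receive_lsn()
replayed = lsn_to_bytes("0/2F00000")   # e.g. pg_last_wal_replay_lsn()
print(received - replayed)             # 1048576 bytes (1 MiB) of replay lag
```

This explains the units in the output below: the first column is a byte lag, the second a time lag in seconds.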
However, the observation is very unexpected. The lag is increasing from the moment the transactions are started on master.
Slave localhost_9001: 12680304 1
Slave localhost_9001: 12354168 1
Slave localhost_9001: 16086800 1
.
.
.
Slave localhost_9001: 3697460920 121
Slave localhost_9001: 3689335376 122
Slave localhost_9001: 3685571296 122
.
.
.
.
Slave localhost_9001: 312752632 190
Slave localhost_9001: 308177496 190
Slave localhost_9001: 303548984 190
.
.
Slave localhost_9001: 22810280 199
Slave localhost_9001: 8255144 199
Slave localhost_9001: 4214440 199
Slave localhost_9001: 0 0
It took around 4.5 minutes for a single client inserting on a single table to complete on master and another 4 minutes for the slave to catch up. Note that NO simultaneous selects are run other than the script to measure the lag.
I understand that replay in PostgreSQL is pretty simple like, "move a particular block to a location", but I am not sure about this behavior.
I have the following other configurations,
checkpoint_timeout = 5min
max_wal_size = 1GB
min_wal_size = 80MB
Now, I run the same tests with same configurations on a Linux Ubuntu 16.04 machine, I find the lag perfectly reasonable.
Am I missing anything?
UPDATE :
MAC Hardware configurations :
16GB RAM, 512GB SSD (Nvme)
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
Linux hardware configurations :
16GB RAM, 512GB SSD (Nvme)
Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz
krithikaGopalakrishnan
(181 rep)
Dec 17, 2018, 02:30 PM
• Last activity: May 24, 2025, 08:05 PM
1
votes
1
answers
291
views
High memory usage in Masters with MariaDB and Vitess
I'm running a Kubernetes Cluster with MariaDB `10.3.20` and Vitess 6. Everything is running fine, but the memory usage in the `Master` nodes is always around 90%. I don't have OOM right now, but I was keen to understand this high memory usage. Currently, each node (master and replica) has the same c...
I'm running a Kubernetes cluster with MariaDB 10.3.20 and Vitess 6. Everything is running fine, but the memory usage on the Master nodes is always around 90%. I don't have OOM right now, but I was keen to understand this high memory usage. (The chart below has a span of 15 days.)
Currently, each node (master and replica) has the same configuration: 10GB RAM and 10 CPU cores. The MariaDB configuration is, for now, the default one (innodb_buffer_pool_size is 128MB; I will try to raise it to 70% of the node memory), except for max_connections, which is set to 15758.
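For reference, raising the buffer pool to roughly 70% of a 10 GB node, as planned above, is a single my.cnf setting; the value below is illustrative, not a tuned recommendation:

```ini
[mysqld]
# ~70% of a 10 GB node; leave headroom for connection buffers and the OS.
innodb_buffer_pool_size = 7G
```

Note that with max_connections at 15758, per-connection buffers can also add up, so the headroom outside the buffer pool matters.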
The data inside the database is below 1GB, across 5 tables. The SELECT peak is around 700 queries per second, the INSERT peak is around 200 QPS, and the UPDATE peak is around 100 QPS.
Masters receive only writes (INSERT, UPDATE, DELETE) and replicas receive only reads (SELECT).
Below are two charts for memory usage (the top ones are masters, the ones at the bottom are replicas).
The master's memory follows an interesting "pattern" every week; I'm not sure why.
I've tried to understand this behavior by reading MariaDB Memory Allocation, but couldn't find anything that explains it.


David Magalhães
(121 rep)
Feb 22, 2021, 01:57 PM
• Last activity: May 18, 2025, 04:04 AM
0
votes
2
answers
320
views
MySql binlog file getting truncated/shrinked during replication
There is a master-slave setup of a mysql db. The replication was going on fine until pos y when an error was thrown `Client requested master to start replication from position > file size`. I tried to restart replication from an earlier position but it failed. I checked for binlog file size and it s...
There is a master-slave setup of a MySQL db. The replication was going on fine until pos y, when an error was thrown:
Client requested master to start replication from position > file size
I tried to restart replication from an earlier position, but it failed. I checked the binlog file size, and it seems the size is currently x (with x <<< y). However, I do have logs for replication from pos x to pos y. I don't have access to, or knowledge of, any activity on the db. I didn't find any relevant article where a binlog file was truncated to a smaller size.
I want to understand: what activity on the db could shrink a particular binlog file? Could a restart or a failover lead to such a scenario?
toros
(23 rep)
Nov 21, 2023, 09:38 AM
• Last activity: May 10, 2025, 04:05 PM
-1
votes
1
answers
327
views
MySQL replication fails with permission error for unknown reason
Since 36h I'm struggling to setup a simple Master Master replication with MySQL 8. None of the found solutions on the Internet works here and meanwhile I'm a bit desperate. Even the first Master -> Slave replication wont work. So I hope for your help. **How does the environment looks like:** - Maste...
For 36 hours I've been struggling to set up a simple Master-Master replication with MySQL 8. None of the solutions I found on the Internet works here, and meanwhile I'm a bit desperate. Even the first Master -> Slave replication won't work. So I hope for your help.
**What the environment looks like:**
- Master and Slave are running on separated machines
- MySQL runs in a docker (the latest official MySQL image) on port 33060
- hosts are managed by/with Plesk Obsidian
**MySQL configuration on both machines**
[mysqld]
skip-host-cache
skip-name-resolve
datadir=/var/lib/mysql
socket=/var/run/mysqld/mysqld.sock
secure-file-priv=/var/lib/mysql-files
user=mysql
pid-file=/var/run/mysqld/mysqld.pid
sql-mode=""
server-id = 1 # 1 on the master host, 2 on the slave host
binlog_do_db = dbsscm
gtid_mode = ON
enforce-gtid-consistency = ON
[client]
socket=/var/run/mysqld/mysqld.sock
!includedir /etc/mysql/conf.d/
**Master Setup**
CREATE USER 'replicator'@'%' IDENTIFIED BY 'REPLIC-PW';
GRANT ALL PRIVILEGES ON *.* TO replicator@'%';
FLUSH PRIVILEGES;
**Test master access via:**
mysql --host=MASTER-IP --port=33060 -u replicator --password='REPLIC-PW'
**Slave Setup**
CREATE USER 'replicator'@'%' IDENTIFIED BY 'REPLIC-PW';
GRANT ALL PRIVILEGES ON *.* TO replicator@'%';
FLUSH PRIVILEGES;
**Test slave access via:**
mysql --host=SLAVE-IP --port=33060 -u replicator --password='REPLIC-PW'
The master sql-db is accessible via
mysql --host ...
or telnet from the slave host.
The slave sql-db is accessible via mysql --host ...
or telnet from the master host.
**Setup Master -> Slave**
On the slave host:
STOP SLAVE;
CHANGE MASTER TO
MASTER_HOST = 'MASTER-IP',
MASTER_PORT = 33060,
MASTER_USER = 'replicator',
MASTER_PASSWORD = 'REPLIC-PW',
MASTER_AUTO_POSITION = 1;
START SLAVE;
SHOW SLAVE STATUS\G;
**Problem - Last_IO_Errno: 2003**
MySQL [(none)]> SHOW SLAVE STATUS\G;
*************************** 1. row ***************************
Slave_IO_State: Connecting to source
Master_Host: MASTER-IP
Master_User: replicator
Master_Port: 33060
Connect_Retry: 60
Master_Log_File:
Read_Master_Log_Pos: 4
Relay_Log_File: 8f18893b5155-relay-bin.000001
Relay_Log_Pos: 4
Relay_Master_Log_File:
Slave_IO_Running: Connecting
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 0
Relay_Log_Space: 157
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 2003
Last_IO_Error: error connecting to master 'replicator@MASTER-IP:33060' - retry-time: 60 retries: 10 message: Can't connect to MySQL server on 'MASTER-IP:33060' (110)
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 0
Master_UUID:
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Replica has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp: 221129 07:22:21
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set: 50e65cd5-6ccf-11ed-9fbf-0242ac110003:1-12
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
Master_public_key_path:
Get_master_public_key: 0
Network_Namespace:
1 row in set, 1 warning (0.000 sec)
**What I have tried to solve?**
- searched Google and forums for more than 24h
- removed/reset everything several times
- tried different user names and passwords for the "replicator"
- switched off the firewalls on the Master & Slave hosts
Whatever I do, I always get the same Last_IO_Errno: 2003
error, and I have no idea what else I can do.
BTW: The databases on the Master are running and are accessible for client applications via the MASTER-IP.
P.S.: I'm aware that I posted a similar question yesterday, but this post contains far more details.
Lars
(109 rep)
Nov 29, 2022, 07:44 AM
• Last activity: Apr 27, 2025, 07:01 PM
1
votes
2
answers
1102
views
Split read/write requests over different read/write datasources
Recently I have run some performance tests on the application I work on, and it turns out that it didn't do really well ( the problem is mainly between the back and the DB). So as investigating the problem\solution, we have found out that using read/write datasources ( read/write master 1 or multipl...
Recently I ran some performance tests on the application I work on, and it turned out that it didn't do very well (the problem is mainly between the back end and the DB). While investigating the problem and possible solutions, we found that using read/write datasources (one write master, one or multiple read slaves) could be a good way to go, as described here: http://fedulov.website/2015/10/14/dynamic-datasource-routing-with-spring/
To sum up, the solution consists of defining the datasources and, before each transaction (@transaction), deciding which datasource to use. But with a huge number of services and transactions already defined (my case), it seems too time-consuming to choose a datasource at every step.
Is there an automated way to split read (SELECT) from write (INSERT/UPDATE) operations? Or a project that serves as a proxy to route the queries? (I have seen this question that was asked 9 years ago, but I think there are certainly newer solutions: How to setup Hibernate to read/write to different datasources?)
Also, I have read about latency problems between writing and reading (are we talking about milliseconds or seconds of latency?). Does the number of read instances influence the latency? What can be done to prevent such behavior before starting to code (an architecture to adopt, maybe? A design pattern?)
Ps: I am using Spring, Spring Data JPA, Hibernate, PostgreSQL, and the Hikari connection pool. Thank you for your time.
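The statement-type routing that query-routing proxies (e.g. ProxySQL or pgpool-II) perform can be sketched generically. This is an illustrative fragment only: the keyword list is simplified, and a real router must also keep everything inside a writing transaction pinned to the primary:

```python
# Minimal sketch of statement-based read/write routing: inspect the first
# keyword of each SQL statement and pick the read or write pool.
READ_KEYWORDS = {"select", "show", "explain"}

def route(sql: str) -> str:
    first = sql.lstrip().split(None, 1)[0].lower()
    return "replica" if first in READ_KEYWORDS else "primary"

print(route("SELECT * FROM orders"))         # replica
print(route("UPDATE orders SET total = 0"))  # primary
print(route("  insert into log values (1)")) # primary
```

A proxy doing this spares the application code from per-transaction annotations, at the cost of replica-lag awareness: reads that must see a just-committed write still need to go to the primary.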
ch.E
(111 rep)
Aug 16, 2019, 09:48 AM
• Last activity: Apr 21, 2025, 10:03 AM
0
votes
2
answers
482
views
Can MongoDB manage balancing between the Mongos Dynamically using TCP Balancer
If I have a business where I maintain 10 applications and I need to divide the request load of these 10 applications between 3 Mongos Dynamically. Like the TCP Loadbalancer need to connect to MongoDB config servers and taking the Health status of 3 Mongos constantly and distributing the request load...
Suppose I have a business where I maintain 10 applications, and I need to divide the request load of these 10 applications between 3 mongos instances dynamically.
That is, the TCP load balancer would need to connect to the MongoDB config servers, constantly take the health status of the 3 mongos instances, and distribute the request load from the applications depending on that health state (i.e., whether a particular mongos is busy or free).
Can anything like the above be achieved using a TCP balancer? Or can it only be used to configure the load statically?
Please confirm.
Katakam Maniratnam
(1 rep)
Sep 7, 2017, 02:25 PM
• Last activity: Apr 16, 2025, 11:01 AM
2
votes
1
answers
3242
views
fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'
I'm getting the following error; Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file' I've tried setting this up on 3 different machines now and initially, all seemed well, but when I went back to it, I had the same error,...
I'm getting the following error;
Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'
I've tried setting this up on 3 different machines now and initially, all seemed well, but when I went back to it, I had the same error, so I experimented on 2 other machines, both giving me the same.
Below are my cnf files;
my.cnf
#
# This group is read both both by the client and the server
# use it for options that affect everything
#
[client-server]
#
# include all files from the config directory
#
!includedir /etc/my.cnf.d
!includedir /etc/mysql
my_master.cnf
# MASTER MySQL database server configuration file created by master-slave-setup
# this is read by the standalone daemon and embedded servers
[server]
[mysqld1]
#
# * Basic Settings
#
user = mysql
pid-file = /var/run/mysqld/mysqld_master.pid
socket = /var/run/mysqld/mysqld_master.sock
port = 3306
basedir = /usr
datadir = /var/lib/mysql_master
tmpdir = /tmp
lc-messages-dir = /usr/share/mysql
skip-external-locking
bind-address = 127.0.0.1
#
# * Fine Tuning
#
key_buffer_size = 16M
max_allowed_packet = 16M
thread_stack = 192K
thread_cache_size = 8
myisam_recover_options = BACKUP
query_cache_limit = 1M
query_cache_size = 16M
log_error = /var/log/mysql/error_master.log
server-id = 1
log_bin = /var/log/mysql/mysql-bin.log
expire_logs_days = 10
max_binlog_size = 100M
binlog_do_db = hn
innodb_flush_log_at_trx_commit = 1
sync_binlog = 1
binlog-format = ROW
character-set-server = utf8mb4
collation-server = utf8mb4_general_ci
my_slave.cnf
# SLAVE MySQL database server configuration file created by master-slave-setup
# this is read by the standalone daemon and embedded servers
[server]
[mysqld2]
#
# * Basic Settings
#
user = mysql
pid-file = /var/run/mysqld/mysqld_slave.pid
socket = /var/run/mysqld/mysqld_slave.sock
port = 3307
basedir = /usr
datadir = /var/lib/mysql_slave
tmpdir = /tmp
lc-messages-dir = /usr/share/mysql
skip-external-locking
bind-address = 127.0.0.1
#
# * Fine Tuning
#
key_buffer_size = 16M
max_allowed_packet = 16M
thread_stack = 192K
thread_cache_size = 8
myisam_recover_options = BACKUP
query_cache_limit = 1M
query_cache_size = 16M
log_error = /var/log/mysql/error_slave.log
server-id = 2
#log_bin = /var/log/mysql/mysql-bin.log
expire_logs_days = 10
max_binlog_size = 100M
binlog_do_db = hn
#binlog_ignore_db = exclude_database_name
relay-log = /var/log/mysql/relay-bin
#relay-log-index = /var/log/mysql/relay-bin.index
#master-info-file = /var/log/mysql/master.info
#relay-log-info-file = /var/log/mysql/relay-log.info
character-set-server = utf8mb4
collation-server = utf8mb4_general_ci
my_multi.cnf
# MYSQLD_MULTI MySQL database server configuration file created by master-slave-setup
[mysqld_multi]
mysqld = /usr/bin/mysqld_safe
mysqladmin = /usr/bin/mysqladmin
user = multi_admin
password = multipass
Now I have tried changing binlog_do_db to individual databases, as per the above, and I have also left it commented out, but I wasn't sure I wanted to be replicating the mysql database.
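As a side note, binlog_do_db filters on the master side and, for statement-logged events, matches against the session's default database rather than the one actually written to, so it can silently drop events. A hedged alternative (my suggestion, not part of the original setup) is to leave the master unfiltered and filter on the slave instead:

```ini
# my_slave.cnf -- hypothetical alternative: filter on the slave, not the master
[mysqld2]
# ... existing settings unchanged ...
# remove binlog_do_db from the master's config, then add here:
replicate_do_db = hn
```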
I'm using mysqld_multi to start and stop the instances, and I believe I have set up the users correctly.
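For completeness, here is a minimal sketch of the replication account setup that "set up the users correctly" usually implies. The user name and password mirror the CHANGE MASTER TO statement below; the host part is an assumption based on the 127.0.0.1 bind address:

```sql
-- Run on the master instance (port 3306)
CREATE USER 'replication'@'127.0.0.1' IDENTIFIED BY 'replication';
GRANT REPLICATION SLAVE ON *.* TO 'replication'@'127.0.0.1';
-- Sanity check from the shell that the slave side can actually log in:
--   mysql -h 127.0.0.1 -P 3306 -u replication -preplication -e 'SELECT 1'
```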
I've run the following on my slave instance:
CHANGE MASTER TO
MASTER_HOST='127.0.0.1',
MASTER_USER='replication',
MASTER_PASSWORD='replication',
MASTER_LOG_FILE='mysql-bin.000009',
MASTER_LOG_POS=1652;
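Error 1236 generally means the master cannot find the requested file in its mysql-bin.index. One detail worth checking: the status output further down reports Master_Log_File as mysql-bin-000009 (dashes), while the index lists mysql-bin.000009 (dot), so the coordinates may simply not match any indexed file. A common recovery sketch, assuming the slave's data is known to be in sync or will be re-seeded (the file name and position below are illustrative; take the real ones from SHOW MASTER STATUS on the master):

```sql
-- On the slave (port 3307)
STOP SLAVE;
RESET SLAVE;                              -- discard stale relay logs/coordinates
CHANGE MASTER TO
    MASTER_HOST='127.0.0.1',
    MASTER_PORT=3306,
    MASTER_USER='replication',
    MASTER_PASSWORD='replication',
    MASTER_LOG_FILE='mysql-bin.000016',   -- from SHOW MASTER STATUS (illustrative)
    MASTER_LOG_POS=4;                     -- 4 = start of a fresh binlog file
START SLAVE;
SHOW SLAVE STATUS\G
```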
The output of show slave status\G is as follows:
MariaDB [(none)]> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 127.0.0.1
Master_User: replication
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin-000009
Read_Master_Log_Pos: 1652
Relay_Log_File: relay-bin.000001
Relay_Log_Pos: 4
Relay_Master_Log_File: mysql-bin-000009
Slave_IO_Running: No
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 1652
Relay_Log_Space: 248
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: No
Gtid_IO_Pos:
1 row in set (0.00 sec)
UPDATE: below is a listing of the log directory, which may help:
root@dl-sl-1:/var/log/mysql# dir
error_master.log mysql-bin.000005 mysql-bin.000011 mysql-bin.index
error_slave.log mysql-bin.000006 mysql-bin.000012 relay-bin.000004
mysql-bin.000001 mysql-bin.000007 mysql-bin.000013 relay-bin.index
mysql-bin.000002 mysql-bin.000008 mysql-bin.000014 replication.txt
mysql-bin.000003 mysql-bin.000009 mysql-bin.000015
mysql-bin.000004 mysql-bin.000010 mysql-bin.000016
root@dl-sl-1:/var/log/mysql# cat mysql-bin.index
/var/log/mysql/mysql-bin.000001
/var/log/mysql/mysql-bin.000002
/var/log/mysql/mysql-bin.000003
/var/log/mysql/mysql-bin.000004
/var/log/mysql/mysql-bin.000005
/var/log/mysql/mysql-bin.000006
/var/log/mysql/mysql-bin.000007
/var/log/mysql/mysql-bin.000008
/var/log/mysql/mysql-bin.000009
/var/log/mysql/mysql-bin.000010
/var/log/mysql/mysql-bin.000011
/var/log/mysql/mysql-bin.000012
/var/log/mysql/mysql-bin.000013
/var/log/mysql/mysql-bin.000014
/var/log/mysql/mysql-bin.000015
/var/log/mysql/mysql-bin.000016
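The listing above suggests the index and the on-disk files agree, so a quick consistency check like the sketch below (a hypothetical helper, not a standard tool) should report nothing missing; if it ever does report a gap, error 1236 is the expected symptom:

```shell
#!/bin/sh
# Report binlog files that the index advertises but that are missing on disk.
check_binlog_index() {
    dir=$1
    idx="$dir/mysql-bin.index"
    if [ ! -f "$idx" ]; then
        echo "no index file at $idx"
        return
    fi
    missing=0
    while IFS= read -r f; do
        if [ ! -f "$f" ]; then
            echo "MISSING: $f"
            missing=$((missing + 1))
        fi
    done < "$idx"
    echo "$missing file(s) listed in the index but absent on disk"
}

# Point at the master's log directory (per the cnf above):
check_binlog_index /var/log/mysql
```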
I'm also getting the error below when I make changes with CHANGE MASTER TO...
Last_SQL_Error: Error 'Operation CREATE USER failed for 'replication'@'%'' on query. Default database: ''. Query: 'CREATE USER 'replication'@'%' IDENTIFIED BY 'replication''
which I hadn't noticed before. This is pretty much on a clean install.
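That Last_SQL_Error is consistent with the slave replaying the master's own setup statements: CREATE USER 'replication' was written to the master's binlog, and it fails on the slave where the user already exists. One hedged way past a single duplicate event like this, assuming you have confirmed it is safe to skip:

```sql
-- On the slave: skip exactly one errored event, then resume
STOP SLAVE;
SET GLOBAL sql_slave_skip_counter = 1;
START SLAVE;
SHOW SLAVE STATUS\G
```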
Any and all help would be greatly appreciated.
plisken
(23 rep)
Feb 8, 2021, 07:32 PM
• Last activity: Apr 14, 2025, 09:05 AM