Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

1 votes

1 answers

1582 views

Innodb not started?


I installed MariaDB_Galera_server10.0, but when I check the error log I see this:

   

    170118 14:49:09 [Note] InnoDB: 128 rollback segment(s) are active.
    170118 14:49:09 [Note] InnoDB: Waiting for purge to start
    170118 14:49:09 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.34-79.1 started; log sequence number 1627308
    170118 14:49:09 [Note] Plugin 'FEEDBACK' is disabled.
    170118 14:49:09 [Note] WSREP: Service disconnected.
    170118 14:49:10 [Note] WSREP: Some threads may fail to exit.
    170118 14:49:10 [Note] InnoDB: FTS optimize thread exiting.
    170118 14:49:10 [Note] InnoDB: Starting shutdown...
    170118 14:49:10 [Note] InnoDB: Waiting for page_cleaner to finish flushing   of buffer po
    170118 14:49:11 [Note] InnoDB: Shutdown completed; log sequence number   1627318

This is the cluster configuration, '/etc/mysql/conf.d/cluster.cnf':

    [mysqld]

    query_cache_size=0
    binlog_format=ROW
    default_storage_engine=innodb
    innodb_autoinc_lock_mode=2
    query_cache_type=0
    bind-address=0.0.0.0
    
    wsrep_on=ON
    wsrep_provider=/usr/Lib64/libgalera_smm.so
    wsrep_provider_options="gcache.size=32G"
    wsrep_cluster_name="test_cluster"
    wsrep_cluster_address=gcomm://192.168.10.231, 192.168.10.233
    wsrep_sst_method= rsync
    wsrep_sst_auth = wsrep_sst_user:wsrep_sst_pass
    wsrep_node_address='192.168.10.231'
    wsrep_node_name="yasoo"

And my.cnf:
 

    # MariaDB database server configuration file.
    # You can copy this file to one of:
    # - "/etc/mysql/my.cnf" to set global options,
    # - "~/.my.cnf" to set user-specific options.
    # One can use all long options that the program supports.
    # Run program with --help to get a list of available options and with
    # --print-defaults to see which it would actually understand and use.
    # For explanations see
    # http://dev.mysql.com/doc/mysql/en/server-system-variables.html
    # This will be passed to all mysql clients
    # It has been reported that passwords should be enclosed with ticks/quotes
    # especially if they contain "#" chars...
    # Remember to edit /etc/mysql/debian.cnf when changing the socket location.
    
    [client]
    
    port = 3306
    socket = /var/run/mysqld/mysqld.sock
    
    # Here is entries for some specific programs
    # This was formally known as [safe_mysqld]. Both versions are currently parsed.
    
    [mysqld_safe]
    log-bin=/var/log/mysql-bin.log
    log=/var/log/mysql.log
    #log-error= /var/log/mysqld.error.log
    socket = /var/run/mysqld/mysqld.sock
    nice = 0
    
    [mysqld]
    #* Basic Settings
    user = mysql
    pid-file = /var/run/mysqld/mysqld.pid
    socket = /var/run/mysqld/mysqld.sock
    port = 3306
    basedir = /usr
    datadir = /var/lib/mysql
    tmpdir = /tmp
    lc_messages_dir = /usr/share/mysql
    lc_messages = en_US
    skip-external-locking
    
    # Instead of skip-networking the default is now to listen only on
    # localhost which is more compatible and is not less secure.
    #bind-address = 127.0.0.1
    #* Fine Tuning
    max_connections = 100
    connect_timeout = 5
    wait_timeout = 600
    max_allowed_packet = 16M
    thread_cache_size = 128
    sort_buffer_size = 4M
    bulk_insert_buffer_size = 16M
    tmp_table_size = 32M
    max_heap_table_size = 32M
    #* MyISAM
    # This replaces the startup script and checks MyISAM tables if needed
    # the first time they are touched. On error, make copy and try a repair.
    myisam_recover_options = BACKUP
    
    key_buffer_size = 128M
    #open-files-limit = 2000
    table_open_cache = 400
    myisam_sort_buffer_size = 512M
    concurrent_insert = 2
    read_buffer_size = 2M
    read_rnd_buffer_size = 1M
    #* Query Cache Configuration
    # Cache only tiny result sets, so we can fit more in the query cache.
    query_cache_limit = 128K
    
    # for more write intensive setups, set to DEMAND or OFF
    #query_cache_type = DEMAND
    #* Logging and Replication
    # Both location gets rotated by the cronjob.
    # Be aware that this log type is a performance killer.
    # As of 5.1 you can enable the log at runtime!
    general_log_file = /var/log/mysql/mysql.log
    general_log = 1
    # Error logging goes to syslog due to /etc/mysql/conf.d/mysqld_safe_syslog.cnf.
    #log-bin=/var/log/mysql-bin.log
    #log=/var/log/mysql.log
    #log-error= /var/log/mysqld.error.log
    # we do want to know about network errors and such
    log_warnings = 2
    # Enable the slow query log to see queries with especially long duration
    #slow_query_log[={0|1}]
    slow_query_log_file = /var/log/mysql/mariadb-slow.log
    long_query_time = 10
    
    log_slow_verbosity = query_plan
    #log-queries-not-using-indexes
    #log_slow_admin_statements
    # The following can be used as easy to replay backup logs or for replication.
    # note: if you are setting up a replication slave, see README.Debian about
    # other settings you may need to change.
    server-id = 121
    #report_host = master1
    #auto_increment_increment = 2
    #auto_increment_offset = 1
    log_bin = /var/log/mysql/mariadb-bin
    log_bin_index = /var/log/mysql/mariadb-bin.index
    # not fab for performance, but safer
    #sync_binlog = 1
    expire_logs_days = 10
    max_binlog_size = 100M
    # slaves
    relay_log = /var/log/mysql/relay-bin
    relay_log_index = /var/log/mysql/relay-bin.index
    relay_log_info_file = /var/log/mysql/relay-bin.info
    log_slave_updates
    
    
    # If applications support it, this stricter sql_mode prevents some
    # mistakes like inserting invalid dates etc.
    #sql_mode = NO_ENGINE_SUBSTITUTION,TRADITIONAL
    #* InnoDB
    # InnoDB is enabled by default with a 10MB datafile in /var/lib/mysql/
    # Read the manual for more InnoDB related options. There are many!
    default_storage_engine = InnoDB
    # you can't just change log file size, requires special procedure
    innodb_log_file_size = 50M
    innodb_buffer_pool_size = 256M
    innodb_log_buffer_size = 8M
    innodb_file_per_table = 1
    innodb_open_files = 400
    innodb_io_capacity = 400
    innodb_flush_method = O_DIRECT
    
    #* Security Features
    # Read the manual, too, if you want chroot!
    # chroot = /var/lib/mysql/
    # For generating SSL certificates I recommend the OpenSSL GUI "tinyca".
    # ssl-ca=/etc/mysql/cacert.pem
    # ssl-cert=/etc/mysql/server-cert.pem
    # ssl-key=/etc/mysql/server-key.pem
    
    #* Galera-related settings
    [galera]

    # Mandatory settings

    log-error=/var/log/mysql/mysql.err
    log-bin=/var/log/mysql/mysql-replication.log
    # Allow server to accept connections on all interfaces.
    #bind-address=0.0.0.0
    # Optional setting
    wsrep_slave_threads=16
    #innodb_flush_log_at_trx_commit=0
    [mysqldump]
    quick
    quote-names
    max_allowed_packet = 16M
    [mysql]
    #no-auto-rehash # faster start of mysql but no tab completion

    [isamchk]
    key_buffer = 16M
    
    #* IMPORTANT: Additional settings that can override those from this file!
    # The files must end with '.cnf', otherwise they'll be ignored.
    
    !includedir /etc/mysql/conf.d/

    
                                

Yaser Jawi (11 rep)

Jan 19, 2017, 10:01 AM • Last activity: May 24, 2025, 09:07 PM

0 votes

1 answers

470 views

MariaDB Galera Cluster (10.6.12) and Laravel WSREP issue

mariadb galera wsrep


I'm administering a Galera cluster with 3 nodes that serves a Laravel application, with HAProxy as the gateway to the cluster. We had an interesting issue, and I can't figure out why the Laravel application logged an error. Here is what I know:
At 13:25:36 there was a note on Node1 that ... connection to peer f3e3XXXX-XXX with addr tcp://X.X.X.X:4567 timed out, no messages seen in PT3S, ... This IP represents Node2.
At roughly the same time, 13:25:37, the following message appeared in the logs on Node2: ... WSREP: (f3e3XXXX-XXX, 'tcp://X.X.X.X:4567') turning message relay requesting on, nonlive peers: tcp://X.X.X.X:4567 ... This second IP represents Node1. I'll post a more detailed log later in the post.

So my understanding is that there could have been a network hiccup between the nodes, and Node2 (and 3) had to rejoin the cluster. Meanwhile, someone triggered the Laravel application, which sent a query to the cluster. That request ended in an error message saying: [2023-11-07 13:25:46] production.ERROR: SQLSTATE[08S01]: Communication link failure: 1047 WSREP has not yet prepared node for application use .....

If my understanding of how a cluster should work is correct, I have no clue what happened. Isn't a Galera cluster supposed to cope when a node disappears and use a remaining node to handle the incoming query? Or was this a HAProxy issue: could it have failed to detect that Node2 and 3 were unavailable and sent the query there anyway?

This issue stayed for about 10 seconds and the cluster is working just fine ever since (I think... wsrep_cluster_size is 3 ATM so it must be OK).

I don't want to spam this post with all the logs, so I've put the logs from Node1 and Node2 on pastebin; the URL is below.

Can anyone explain what exactly happened here, and maybe suggest a way to avoid it in the future?

Thank you in advance !

**MariaDB logs:** https://pastebin.com/SpDbe7Bb 

**Laravel logs:**

    [2023-11-07 13:25:46] production.ERROR: SQLSTATE[08S01]: Communication link failure: 1047 WSREP has not yet prepared node for application use (SQL: select * from .......) {"exception":"[object] (Illuminate\\Database\\QueryException(code: 08S01): SQLSTATE[08S01]: Communication link failure: 1047 WSREP has not yet prepared node for application use (SQL: select * from .......) at /var/www/html/myproject/releases/184/vendor/laravel/framework/src/Illuminate/Database/Connection.php:671)
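
One way to read the 1047 error is that the query reached a node that still accepted TCP connections but was not yet ready for application use. A health check that looks at Galera status variables, rather than only at the port, is a common way to keep a load balancer from routing to such a node. The sketch below is an illustration under assumptions, not a description of the HAProxy setup in this post: the function name `node_is_usable` and the sample dictionaries are made up, and the input is assumed to come from `SHOW GLOBAL STATUS LIKE 'wsrep_%'` on each node.

```python
def node_is_usable(status):
    """Return True if a Galera node looks safe to receive application queries.

    `status` is assumed to be a dict built from the rows of
    SHOW GLOBAL STATUS LIKE 'wsrep_%' (variable name -> value as text).
    A missing variable is treated as "not usable".
    """
    return (
        status.get("wsrep_ready") == "ON"
        and status.get("wsrep_cluster_status") == "Primary"
        and status.get("wsrep_local_state_comment") == "Synced"
    )


# Hypothetical snapshots, for illustration only.
healthy = {
    "wsrep_ready": "ON",
    "wsrep_cluster_status": "Primary",
    "wsrep_local_state_comment": "Synced",
}
rejoining = {
    "wsrep_ready": "OFF",
    "wsrep_cluster_status": "non-Primary",
    "wsrep_local_state_comment": "Initialized",
}

print(node_is_usable(healthy), node_is_usable(rejoining))
```

A check like this would have to be exposed to the load balancer somehow (for example through a small HTTP endpoint); how that is wired up depends on the environment and is left out here.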

Imre Bertalan (105 rep)

Nov 8, 2023, 11:09 AM • Last activity: Nov 8, 2023, 11:45 AM

1 votes

1 answers

2729 views

MariaDB Galera Cluster galera.cache file getting bigger than specified gcache.size

mariadb galera wsrep

We have a 3-node Galera Cluster running on Kubernetes behind 2 HAProxy pods, configured so that all queries are executed on the first pod/node of the cluster if it is available, while the other 2 nodes provide HA (HAProxy backend backup nodes).

In the config file, gcache.size is set to 5 GB, and when a new node is deployed the galera.cache file is 5.1 GB, so that setting seems to be picked up correctly. However, we are seeing galera.cache grow to 80 GB or more on that first node. As far as we know, this file should not increase in size. The problem also reproduces when the cluster is scaled down to a single node: the file does not stop growing.

The version deployed is 10.3.22 (10.3.22-debian-10-r1 Bitnami Docker image). These are the wsrep provider options specified in my.cnf:

    wsrep_on=ON
    wsrep_provider=/opt/bitnami/mariadb/lib/libgalera_smm.so
    wsrep_provider_options="gcache.size=5G"
    wsrep_sst_method=mariabackup
    wsrep_slave_threads=4
    wsrep_cluster_address=gcomm://
    wsrep_cluster_name=galera
    wsrep_sst_auth="root:"
    innodb-flush-log-at-trx-commit=2
    # MYISAM REPLICATION SUPPORT #
    wsrep_replicate_myisam=ON
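
To compare the configured limit with what is on disk, it can help to turn the `wsrep_provider_options` string into a number of bytes. The following is a minimal sketch under assumptions: the helper names are invented for illustration, options are assumed to be separated by semicolons, and the K/M/G/T suffixes are assumed to be powers of 1024.

```python
import os

# Assumption: size suffixes are binary multiples.
_SUFFIXES = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def parse_provider_options(text):
    """Split a 'key=value; key=value' provider option string into a dict."""
    options = {}
    for item in text.split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition("=")
        options[key.strip()] = value.strip()
    return options


def size_to_bytes(value):
    """Convert a size such as '5G' or '128M' (or a plain integer) to bytes."""
    value = value.strip()
    suffix = value[-1].upper()
    if suffix in _SUFFIXES:
        return int(value[:-1]) * _SUFFIXES[suffix]
    return int(value)


def cache_overshoot(path, configured_bytes):
    """Bytes by which the file at `path` exceeds the configured size (0 if none)."""
    return max(0, os.path.getsize(path) - configured_bytes)


opts = parse_provider_options("gcache.size=5G")
limit = size_to_bytes(opts["gcache.size"])
print(limit)  # 5368709120
```

Calling `cache_overshoot` with the real path of galera.cache on the affected node and `limit` would then show how far the file has grown past the configured size.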

We've been dealing with this for some time. We can remove the first node (pod), after which galera.cache is recreated and the disk space is freed. The first node then syncs through IST with one of the other 2 nodes; meanwhile HAProxy points to a backup node and switches back to the first node once it has recovered, with no downtime. However, we want to avoid doing this. We can't figure out why the galera.cache file grows, and we could not find any documentation or bug report describing a similar issue. Any help will be much appreciated!

Alex Núñez (21 rep)

Jan 5, 2022, 04:19 PM • Last activity: Jan 21, 2022, 12:56 PM

0 votes

2 answers

475 views

How to Prevent the Connections for MariaDB Galera-Cluster When There is Only One Active Node?

mysql replication mariadb galera wsrep


I was using "rsync" as the wsrep_sst_method in my galera.cnf files on a 3-node Galera Cluster. With that setting, the system served requests only while at least 2 nodes were active; when any two nodes were gone, the remaining one did not serve. That was good enough. But after I upgraded the system and set wsrep_sst_method = mariabackup, it serves as long as any node is active: when two of the nodes are gone, the last active node keeps working. I don't want the system to work when there is only one active node. I want it to behave as it did before the upgrade, without changing the wsrep_sst_method.

So the problem is simple: how can I prevent the system from serving when there is just one active node?

TarabydaVllasıCafcaflıAtArabsı (117 rep)

Oct 13, 2021, 06:41 AM • Last activity: Oct 31, 2021, 04:08 PM

2 votes

0 answers

456 views

Why does a new Galera cluster with mariabackup as SST start, but all other nodes fail with the same error?

clustering galera mariadb-10.3 wsrep


I did a fresh reinstallation of mariadb-server on all the nodes (I removed it using `sudo apt purge mariadb-*`).

I started the first node using `sudo galera_new_cluster`; it went fine and is still running. But the other nodes threw this error:

    ● mariadb.service - MariaDB 10.3.27 database server
    Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)
    Active: failed (Result: exit-code) since Sat 2020-12-19 20:23:19 IST; 2min 9s ago
    Docs: man:mysqld(8)
    Process: 7089 ExecStartPre=/usr/bin/install -m 755 -o mysql -g root -d /var/run/mysqld (code=exited, status=0/SUCCESS)
    Process: 7090 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
    Process: 7092 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= || VAR=`cd /usr/bin/..; /usr/bin/galera_recovery`; [ $? -eq 0 ] && s
    Process: 7330 ExecStart=/usr/sbin/mysqld $MYSQLD_OPTS $_WSREP_NEW_CLUSTER $_WSREP_START_POSITION (code=exited, status=1/FAILURE)
    Main PID: 7330 (code=exited, status=1/FAILURE)
    Status: "MariaDB server is down"
    Dec 19 20:22:53 phl-pi-3 systemd: Starting MariaDB 10.3.27 database server...
    Dec 19 20:22:59 phl-pi-3 sh: WSREP: Recovered position 00000000-0000-0000-0000-000000000000:-1
    Dec 19 20:22:59 phl-pi-3 mysqld: 2020-12-19 20:22:59 0 [Note] /usr/sbin/mysqld (mysqld 10.3.27-MariaDB-0+deb10u1-log) starting as process 7330 ...
    Dec 19 20:22:59 phl-pi-3 mysqld: 2020-12-19 20:22:59 0 [Warning] Could not increase number of max_open_files to more than 16384 (request: 32186)
    Dec 19 20:23:19 phl-pi-3 systemd: mariadb.service: Main process exited, code=exited, status=1/FAILURE
    Dec 19 20:23:19 phl-pi-3 systemd: mariadb.service: Failed with result 'exit-code'.
    Dec 19 20:23:19 phl-pi-3 systemd: Failed to start MariaDB 10.3.27 database server.

This is my Galera config:

    [mysqld]
    #mysql settings
    binlog_format=ROW
    default-storage-engine=innodb
    innodb_autoinc_lock_mode=2
    innodb_doublewrite=1
    query_cache_size=0
    query_cache_type=0
    bind-address=0.0.0.0
    #galera settings
    wsrep_on=ON
    wsrep_provider=/usr/lib/galera/libgalera_smm.so
    wsrep_cluster_name="test_cluster"
    wsrep_cluster_address=gcomm://192.168.0.15,192.168.0.16,192.168.0.12,10.8.0.6
    wsrep_node_address="192.168.0.15"
    wsrep_sst_method=mariabackup
    wsrep_sst_donor=192.168.0.16

All other nodes have the same Galera config, except for a different wsrep_node_address, and they don't have wsrep_sst_donor set.

The other server config is as below:

    $ cat 50-server.cnf
    # These groups are read by MariaDB server.
    # Use it for options that only the server (but not clients) should see
    ## See the examples of server my.cnf files in /usr/share/mysql/
    ## this is read by the standalone daemon and embedded servers
    [server]
    skip_name_resolve = 1
    # this is only for the mysqld standalone daemon
    [mysqld]
    transaction_isolation = READ-COMMITTED
    binlog_format = ROW
    innodb_large_prefix=on
    innodb_file_format=barracuda
    innodb_file_per_table=1
    innodb_io_capacity=4000
    # * Basic Settings
    user = mysql
    pid-file = /var/run/mysqld/mysqld.pid
    socket = /var/run/mysqld/mysqld.sock
    port = 3306
    basedir = /usr
    datadir = /var/lib/mysql
    tmpdir = /tmp
    lc-messages-dir = /usr/share/mysql
    skip-external-locking
    bind-address = 0.0.0.0
    # * Fine Tuning
    key_buffer_size = 16M
    max_allowed_packet = 16M
    thread_stack = 192K
    thread_cache_size = 8
    myisam_recover_options = BACKUP
    #max_connections = 100
    #table_cache = 64
    #thread_concurrency = 10
    ## * Query Cache Configuration
    #query_cache_limit = 1M
    query_cache_type = 1
    query_cache_limit = 2M
    query_cache_min_res_unit = 2k
    query_cache_size = 64M
    ## * Logging and Replication
    ## Error log - should be very few entries.
    #log_error = /var/log/mysql/error.log
    server-id = 16
    log_bin = mariadb_bin
    expire_logs_days = 10
    max_binlog_size = 100M
    #binlog_do_db = include_database_name
    #binlog_ignore_db = exclude_database_name
    innodb_buffer_pool_size = 128M
    innodb_buffer_pool_instances = 1
    innodb_flush_log_at_trx_commit = 2
    innodb_log_buffer_size = 32M
    innodb_max_dirty_pages_pct = 90
    # For generating SSL certificates you can use for example the GUI tool "tinyca".
    ## ssl-ca=/etc/mysql/cacert.pem
    # ssl-cert=/etc/mysql/server-cert.pem
    # ssl-key=/etc/mysql/server-key.pem
    ## Accept only connections using the latest and most secure TLS protocol version.
    # ..when MariaDB is compiled with OpenSSL:
    # ssl-cipher=TLSv1.2
    # ..when MariaDB is compiled with YaSSL (default in Debian):
    # ssl=on
    ## * Character sets
    ## MySQL/MariaDB default is Latin1, but in Debian we rather default to the full
    # utf8 4-byte character set. See also client.cnf
    #character-set-server = utf8mb4
    collation-server = utf8mb4_general_ci
    tmp_table_size= 64M
    max_heap_table_size= 64M
    slow_query_log = 1
    slow_query_log_file = /var/log/mysql/slow.log
    long_query_time = 1

All other nodes have the same config as above, except for a different server-id.

                                

Ciasto piekarz (139 rep)

Dec 19, 2020, 03:25 PM

0 votes

1 answers

243 views

MariaDB 10.1 / Galera - bootstrap node never loses quorum

mariadb galera mariadb-10.1 wsrep


I'm using MariaDB 10.1.37, wsrep_provider_version 25.3.24(r3825). On my dev cluster (2 nodes + 1 arbitrator) I found the following behaviour:

**node1:** bootstrap new cluster (using galera_new_cluster, with pc.weight=3)  
**node2:** join cluster (pc.weight=3)  
**arbitrator:** join cluster (pc.weight =1)  

So, I have a three node cluster, with total pc.weight of 7. Then I do:  

**arbitrator:** shutdown  garbd  
**node2:** shutdown mysqld

At this point, I expect node1, being the only node out of three still alive, to have lost wsrep_cluster_status=Primary and to no longer accept writes.

Instead, I find wsrep_cluster_status = Primary and wsrep_cluster_size=1, and node1 still accepts writes.
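
The observed behaviour can be checked against the weighted-quorum rule described in the Galera documentation: a component stays Primary if the summed weight of the members still present is more than half of the summed weight of the previous Primary Component, after subtracting the weight of members that left gracefully. The sketch below applies that rule to the weights from this post. It assumes that stopping garbd and mysqld counts as a graceful leave, and the function name `keeps_quorum` is made up for illustration.

```python
def keeps_quorum(previous_weights, left_gracefully, present):
    """Weighted quorum check.

    previous_weights: dict of member name -> pc.weight for the last Primary Component
    left_gracefully:  names of members that announced their departure
    present:          names of members that are still reachable
    """
    total_before = sum(previous_weights.values())
    total_left = sum(previous_weights[name] for name in left_gracefully)
    total_present = sum(previous_weights[name] for name in present)
    return (total_before - total_left) / 2 < total_present


weights = {"node1": 3, "node2": 3, "arbitrator": 1}

# Both other members shut down cleanly: (7 - 4) / 2 = 1.5, and 1.5 < 3 holds.
graceful = keeps_quorum(weights, ["node2", "arbitrator"], ["node1"])

# Both other members vanish without notice: 7 / 2 = 3.5, and 3.5 < 3 does not hold.
crashed = keeps_quorum(weights, [], ["node1"])

print(graceful, crashed)
```

Under these assumptions, a clean shutdown of the other two members leaves node1 with quorum, while an abrupt loss of both would not.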

Is there something I am missing? 

Thanks in advance!

Mark S (132 rep)

Nov 7, 2018, 03:24 PM • Last activity: Nov 8, 2018, 03:56 PM

1 votes

1 answers

1253 views

WSREP: Writeset deserialization failed: Unsupported RecordSet version: 2: 71 (Protocol error)

mysql mariadb galera wsrep


We are running a MariaDB Galera cluster with 3 data nodes. Recently one of them crashed and I had to do an SST (State Snapshot Transfer). Nothing I haven't done or seen before. However, after the SST completed, the process crashed with the following WSREP error:

    [ERROR] WSREP: Writeset deserialization failed: Unsupported RecordSet version: 2: 71 (Protocol error)
              at galerautils/src/gu_rset.cpp:header_version():272
              at galera/src/trx_handle.cpp:unserialize():268

Complete logs:

    mysqld: 2018-10-24 15:03:41 139769088514304 [Note] InnoDB: 128 rollback segment(s) are active.
    mysqld: 2018-10-24 15:03:41 139769088514304 [Note] InnoDB: Waiting for purge to start
    mysqld: 2018-10-24 15:03:41 139769088514304 [Note] InnoDB:  Percona XtraDB (http://www.percona.com)  5.6.39-83.1 started; log sequence number 335729536918
    mysqld: 2018-10-24 15:03:41 139764732757760 [Note] InnoDB: Dumping buffer pool(s) not yet started
    mysqld: 2018-10-24 15:03:41 139769088514304 [Note] Plugin 'FEEDBACK' is disabled.
    mysqld: 2018-10-24 15:03:41 139769088514304 [Note] Recovering after a crash using /var/log/mysql/mariadb-bin
    mysqld: 2018-10-24 15:03:41 139769088514304 [Note] Starting crash recovery...
    mysqld: 2018-10-24 15:03:41 139769088514304 [Note] Crash recovery finished.
    mysqld: 2018-10-24 15:03:41 139769088514304 [Note] Server socket created on IP: '0.0.0.0'.
    mysqld: 2018-10-24 15:03:42 139769088514304 [Note] WSREP: Signalling provider to continue.
    mysqld: 2018-10-24 15:03:42 139769088514304 [Note] WSREP: SST received: 1b859078-cacc-11e8-8e3e-4381b13e7545:4661474
    mysqld: 2018-10-24 15:03:42 139769088514304 [Note] Reading of all Master_info entries succeded
    mysqld: 2018-10-24 15:03:42 139769088514304 [Note] Added new Master_info '' to hash table
    mysqld: 2018-10-24 15:03:42 139769088514304 [Note] /usr/sbin/mysqld: ready for connections.
    mysqld: Version: '10.1.36-MariaDB-1~xenial'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  mariadb.org binary distribution
    mysqld: 2018-10-24 15:03:42 139769088199424 [ERROR] WSREP: Writeset deserialization failed: Unsupported RecordSet version: 2: 71 (Protocol error)
    mysqld:          at galerautils/src/gu_rset.cpp:header_version():272
    mysqld:          at galera/src/trx_handle.cpp:unserialize():268
    mysqld: WS flags:      0
    mysqld: Trx proto:     3
    mysqld: Trx source:    00000000-0000-0000-0000-000000000000
    mysqld: Trx conn_id:   18446744073709551615
    mysqld: Trx trx_id:    18446744073709551615
    mysqld: Trx last_seen: -1
    mysqld: 2018-10-24 15:03:42 139769088199424 [ERROR] WSREP: Unsupported RecordSet version: 2: 71 (Protocol error)
    mysqld:          at galerautils/src/gu_rset.cpp:header_version():272
    mysqld:          at galera/src/trx_handle.cpp:unserialize():268
    mysqld: 2018-10-24 15:03:42 139769088199424 [Note] WSREP: applier thread exiting (code:7)
    mysqld: 2018-10-24 15:03:42 139769088199424 [ERROR] WSREP: node consistency compromised, aborting
    mysqld: 2018-10-24 15:03:42 139769088199424 [Note] WSREP: starting shutdown
    mysqld: 2018-10-24 15:03:42 139769010707200 [Note] /usr/sbin/mysqld: Normal shutdown
    mysqld: 2018-10-24 15:03:42 139769010707200 [Note] WSREP: Stop replication
    mysqld: 2018-10-24 15:03:42 139769010707200 [Note] WSREP: Closing send monitor...
    mysqld: 2018-10-24 15:03:42 139769010707200 [Note] WSREP: Closed send monitor.
    mysqld: 2018-10-24 15:03:42 139769010707200 [Note] WSREP: gcomm: terminating thread
    mysqld: 2018-10-24 15:03:42 139769010707200 [Note] WSREP: gcomm: joining thread
    mysqld: 2018-10-24 15:03:42 139769010707200 [Note] WSREP: gcomm: closing backend
    mysqld: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: 3.0 (dbserver08): State transfer from 2.0 (DB) complete.
    mysqld: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: Shifting JOINER -> JOINED (TO: 4666631)
    mysqld: 2018-10-24 15:03:42 139769010707200 [Note] WSREP: view(view_id(NON_PRIM,2b30f5eb,167) memb {
    mysqld:         7bffa222,0
    mysqld: } joined {
    mysqld: } left {
    mysqld: } partitioned {
    mysqld:         2b30f5eb,0
    mysqld:         30d0c86b,0
    mysqld:         5a7b95e7,0
    mysqld:         b0f6da74,0
    mysqld:         f7e81556,0
    mysqld: })
    mysqld: 2018-10-24 15:03:42 139769010707200 [Note] WSREP: view((empty))
    mysqld: 2018-10-24 15:03:42 139769010707200 [Note] WSREP: gcomm: closed
    mysqld: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
    mysqld: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: Flow-control interval: [16, 16]
    mysqld: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: Trying to continue unpaused monitor
    mysqld: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: Received NON-PRIMARY.
    mysqld: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: Shifting JOINED -> OPEN (TO: 4666631)
    mysqld: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: Received self-leave message.
    mysqld: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: Flow-control interval: [0, 0]
    mysqld: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: Trying to continue unpaused monitor
    mysqld: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: Received SELF-LEAVE. Closing connection.
    mysqld: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 4666631)
    mysqld: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: RECV thread exiting 0: Success
    mysqld: 2018-10-24 15:03:42 139769010707200 [Note] WSREP: recv_thread() joined.
    mysqld: 2018-10-24 15:03:42 139769010707200 [Note] WSREP: Closing replication queue.
    mysqld: 2018-10-24 15:03:42 139769010707200 [Note] WSREP: Closing slave action queue.
    mysqld: 2018-10-24 15:03:44 139769088502528 [Note] WSREP: rollbacker thread exiting
    mysqld: 2018-10-24 15:03:44 139769010707200 [Note] Event Scheduler: Purging the queue. 0 events
    mysqld: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: dtor state: JOINING
    mysqld: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: apply mon: entered 0
    mysqld: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: apply mon: entered 0
    mysqld: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: mon: entered 1 oooe fraction 0 oool fraction 0
    mysqld: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: cert index usage at exit 0
    mysqld: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: cert trx map usage at exit 0
    mysqld: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: deps set usage at exit 0
    mysqld: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: avg deps dist 0
    mysqld: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: avg cert interval 0
    mysqld: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: cert index size 0
    mysqld: 2018-10-24 15:03:44 139768863397632 [Note] WSREP: Service thread queue flushed.
    mysqld: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: wsdb trx map usage 0 conn query map usage 0
    mysqld: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: MemPool(LocalTrxHandle): hit ratio: 0, misses: 0, in use: 0, in pool: 0
    mysqld: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: MemPool(SlaveTrxHandle): hit ratio: 0, misses: 1, in use: 1, in pool: 0
    mysqld: 2018-10-24 15:03:44 139769010707200 [Warning] WSREP: Waiting for 5168 items to be fetched.

What does this WSREP error mean?

I tried googling this specific error message but nothing came up.

I also checked for MariaDB version differences between the nodes, as I recently had to reinstall a node, but I couldn't spot any.
                                

Thomas Wiersema (111 rep)

Oct 24, 2018, 03:03 PM • Last activity: Oct 25, 2018, 05:19 AM

0 votes

2 answers

1735 views

wsrep_sst_xtrabackup socat starts and stops right away

mysql replication innodb galera wsrep


What I have done:
I have restarted mysql on the JOINERNODE to apply some database settings including increasing the back_log and query_cache_size settings.

What I am seeing:
When I start mysql on the JOINERNODE, I see socat launch and listen on port 4444 then stop about 1-2 seconds later.

Joiner MYSQL logs:
    
    161107 19:16:43 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
    161107 19:16:43 mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.JKtTSU' --pid-file='/var/lib/mysql/JOINERNODE-recover.pid'
    161107 19:16:56 mysqld_safe WSREP: Recovered position 3bf24a64-e806-11e5-8238-ea129650fffe:4050608949
    161107 19:16:56 [Note] WSREP: wsrep_start_position var submitted: '3bf24a64-e806-11e5-8238-ea129650fffe:4050608949'
    161107 19:16:56 [Note] WSREP: Read nil XID from storage engines, skipping position init
    161107 19:16:56 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
    161107 19:16:56 [Note] WSREP: wsrep_load(): Galera 25.3.5(rXXXX) by Codership Oy  loaded successfully.
    161107 19:16:56 [Note] WSREP: CRC-32C: using hardware acceleration.
    161107 19:16:56 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
    161107 19:16:56 [Note] WSREP: Passing config to GCS: base_host = 170.71.77.88; base_port = 4567; cert.log_conflicts = no; debug = no; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 1; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 1G; gcache.size = 4G; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo = false; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = P30S; pc.weight = 1; protonet.ba
    161107 19:16:56 [Note] WSREP: Service thread queue flushed.
    161107 19:16:56 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
    161107 19:16:56 [Note] WSREP: wsrep_sst_grab()
    161107 19:16:56 [Note] WSREP: Start replication
    161107 19:16:56 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
    161107 19:16:56 [Note] WSREP: protonet asio version 0
    161107 19:16:56 [Note] WSREP: Using CRC-32C (optimized) for message checksums.
    161107 19:16:56 [Note] WSREP: backend: asio
    161107 19:16:56 [Note] WSREP: GMCast version 0
    161107 19:16:56 [Note] WSREP: (0a872720-a551-11e6-9bc2-eedf51200d4b, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
    161107 19:16:56 [Note] WSREP: (0a872720-a551-11e6-9bc2-eedf51200d4b, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
    161107 19:16:56 [Note] WSREP: EVS version 0
    161107 19:16:56 [Note] WSREP: PC version 0
    161107 19:16:56 [Note] WSREP: gcomm: connecting to group 'galera_cluster', peer 'DONORNODE:'
    161107 19:16:56 [Note] WSREP: declaring 4159cb50-6323-11e6-946f-f77a9bdd31e2 stable
    161107 19:16:56 [Note] WSREP: Node 4159cb50-6323-11e6-946f-f77a9bdd31e2 state prim
    161107 19:16:56 [Note] WSREP: view(view_id(PRIM,0a872720-a551-11e6-9bc2-eedf51200d4b,85) memb {
      0a872720-a551-11e6-9bc2-eedf51200d4b,0
      4159cb50-6323-11e6-946f-f77a9bdd31e2,0
    } joined {
    } left {
    } partitioned {
    })
    161107 19:16:57 [Note] WSREP: gcomm: connected
    161107 19:16:57 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
    161107 19:16:57 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
    161107 19:16:57 [Note] WSREP: Opened channel 'galera_cluster'
    161107 19:16:57 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 2
    161107 19:16:57 [Note] WSREP: Waiting for SST to complete.
    161107 19:16:57 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 0ad4f230-a551-11e6-9d07-8244e22ea7fb
    161107 19:16:57 [Note] WSREP: STATE EXCHANGE: sent state msg: 0ad4f230-a551-11e6-9d07-8244e22ea7fb
    161107 19:16:57 [Note] WSREP: STATE EXCHANGE: got state msg: 0ad4f230-a551-11e6-9d07-8244e22ea7fb from 0 (JOINERNODE)
    161107 19:16:57 [Note] WSREP: STATE EXCHANGE: got state msg: 0ad4f230-a551-11e6-9d07-8244e22ea7fb from 1 (DONORNODE)
    161107 19:16:57 [Note] WSREP: Quorum results:
      version    = 3,
      component  = PRIMARY,
      conf_id    = 72,
      members    = 1/2 (joined/total),
      act_id     = 4053449525,
      last_appl. = -1,
      protocols  = 0/5/2 (gcs/repl/appl),
      group UUID = 3bf24a64-e806-11e5-8238-ea129650fffe
    161107 19:16:57 [Note] WSREP: Flow-control interval: [23, 23]
    161107 19:16:57 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 4053449525)
    161107 19:16:57 [Note] WSREP: State transfer required:
      Group state: 3bf24a64-e806-11e5-8238-ea129650fffe:4053449525
      Local state: 00000000-0000-0000-0000-000000000000:-1
    161107 19:16:57 [Note] WSREP: New cluster view: global state: 3bf24a64-e806-11e5-8238-ea129650fffe:4053449525, view# 73: Primary, number of nodes: 2, my index: 0, protocol version 2
    161107 19:16:57 [Warning] WSREP: Gap in state sequence. Need state transfer.
    161107 19:16:59 [Note] WSREP: Running: 'wsrep_sst_xtrabackup --role 'joiner' --address '170.71.77.88' --auth 'replication:replication' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --parent '18657''
    WSREP_SST: [INFO] Streaming with xbstream (20161107 19:16:59.521)
    WSREP_SST: [INFO] Using socat as streamer (20161107 19:16:59.524)
    WSREP_SST: [INFO] Evaluating socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} ) (20161107 19:16:59.663)
    161107 19:17:02 [Note] WSREP: Prepared SST request: xtrabackup|170.71.77.88:4444/xtrabackup_sst
    161107 19:17:02 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
    161107 19:17:02 [Note] WSREP: REPL Protocols: 5 (3, 1)
    161107 19:17:02 [Note] WSREP: Service thread queue flushed.
    161107 19:17:02 [Note] WSREP: Assign initial position for certification: 4053449525, protocol version: 3
    161107 19:17:02 [Note] WSREP: Service thread queue flushed.
    161107 19:17:02 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (3bf24a64-e806-11e5-8238-ea129650fffe): 1 (Operation not permitted)
       at galera/src/replicator_str.cpp:prepare_for_IST():447. IST will be unavailable.
    161107 19:17:02 [Note] WSREP: Member 0.0 (JOINERNODE) requested state transfer from '*any*'. Selected 1.0 (DONORNODE)(SYNCED) as donor.
    161107 19:17:02 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 4053450570)
    161107 19:17:02 [Note] WSREP: Requesting state transfer: success, donor: 1
    xbstream: Can't create/write to file './ibdata1' (Errcode: 17 - File exists)
    xbstream: failed to create file.
    2016/11/07 19:17:38 socat E write(1, 0x84b3e0, 8192): Broken pipe
    WSREP_SST: [ERROR] Xbstream failed (20161107 19:17:38.164)
    WSREP_SST: [ERROR] Data directory /var/lib/mysql/ may not be empty: lp:1193240 Manual intervention required in that case (20161107 19:17:38.167)
    WSREP_SST: [ERROR] Cleanup after exit with status:32 (20161107 19:17:38.169)
    WSREP_SST: [INFO] Removing the sst_in_progress file (20161107 19:17:38.172)
    161107 19:17:38 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup --role 'joiner' --address '170.71.77.88' --auth 'replication:replication' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --parent '18657': 32 (Broken pipe)
    161107 19:17:38 [ERROR] WSREP: Failed to read uuid:seqno from joiner script.
    161107 19:17:38 [ERROR] WSREP: SST failed: 32 (Broken pipe)
    161107 19:17:38 [ERROR] Aborting

    161107 19:17:38 [Warning] WSREP: 1.0 (DONORNODE): State transfer to 0.0 (JOINERNODE) failed: -22 (Invalid argument)
    161107 19:17:38 [ERROR] WSREP: gcs/src/gcs_group.c:gcs_group_handle_join_msg():723: Will never receive state. Need to abort.
    161107 19:17:38 [Note] WSREP: gcomm: terminating thread
    161107 19:17:38 [Note] WSREP: gcomm: joining thread
    161107 19:17:38 [Note] WSREP: gcomm: closing backend
    161107 19:17:39 [Note] WSREP: view(view_id(NON_PRIM,0a872720-a551-11e6-9bc2-eedf51200d4b,85) memb {
      0a872720-a551-11e6-9bc2-eedf51200d4b,0
    } joined {
    } left {
    } partitioned {
      4159cb50-6323-11e6-946f-f77a9bdd31e2,0
    })
    161107 19:17:39 [Note] WSREP: view((empty))
    161107 19:17:39 [Note] WSREP: gcomm: closed
    161107 19:17:39 [Note] WSREP: /usr/sbin/mysqld: Terminated.
    161107 19:17:39 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
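
For anyone triaging a log like the joiner's above, a small sketch that lists the error lines in order can make the first real failure easier to spot. It is only an illustration: `first_errors` and `errcodes` are hypothetical helpers, and the matching rules (an `[ERROR]` tag or an `Errcode: N - text` marker) are assumptions based on the lines quoted in this log, not a general MariaDB log parser.

```python
import re

# Assumed markers, taken from the log lines quoted above.
ERROR_TAG = re.compile(r"\[ERROR\]")
ERRCODE = re.compile(r"\(Errcode: (\d+) - ([^)]+)\)")


def first_errors(log_text, limit=5):
    """Return up to `limit` (line_number, line) pairs that look like errors."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if ERROR_TAG.search(line) or ERRCODE.search(line):
            hits.append((number, line.strip()))
            if len(hits) == limit:
                break
    return hits


def errcodes(log_text):
    """Return (code, message) pairs for every 'Errcode: N - text' marker."""
    return [(int(code), msg) for code, msg in ERRCODE.findall(log_text)]


# A few lines copied from the joiner log above.
sample = """\
161107 19:17:02 [Note] WSREP: Requesting state transfer: success, donor: 1
xbstream: Can't create/write to file './ibdata1' (Errcode: 17 - File exists)
xbstream: failed to create file.
WSREP_SST: [ERROR] Xbstream failed (20161107 19:17:38.164)
161107 19:17:38 [ERROR] WSREP: SST failed: 32 (Broken pipe)
"""

for number, line in first_errors(sample):
    print(number, line)
```

On this sample the earliest hit is the xbstream `Errcode: 17 - File exists` line, which comes before the broken-pipe errors that follow it.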

Donor MySQL logs:


    161107 19:16:56 [Note] WSREP: declaring 0a872720-a551-11e6-9bc2-eedf51200d4b stable
    161107 19:16:56 [Note] WSREP: Node 4159cb50-6323-11e6-946f-f77a9bdd31e2 state prim
    161107 19:16:56 [Note] WSREP: view(view_id(PRIM,0a872720-a551-11e6-9bc2-eedf51200d4b,85) memb {
      0a872720-a551-11e6-9bc2-eedf51200d4b,0
      4159cb50-6323-11e6-946f-f77a9bdd31e2,0
    } joined {
    } left {
    } partitioned {
    })
    161107 19:16:56 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
    161107 19:16:56 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
    161107 19:16:57 [Note] WSREP: STATE EXCHANGE: sent state msg: 0ad4f230-a551-11e6-9d07-8244e22ea7fb
    161107 19:16:57 [Note] WSREP: STATE EXCHANGE: got state msg: 0ad4f230-a551-11e6-9d07-8244e22ea7fb from 0 (JOINERNODE)
    161107 19:16:57 [Note] WSREP: STATE EXCHANGE: got state msg: 0ad4f230-a551-11e6-9d07-8244e22ea7fb from 1 (DONORNODE)
    161107 19:16:57 [Note] WSREP: Quorum results:
      version    = 3,
      component  = PRIMARY,
      conf_id    = 72,
      members    = 1/2 (joined/total),
      act_id     = 4053449525,
      last_appl. = 4053449492,
      protocols  = 0/5/2 (gcs/repl/appl),
      group UUID = 3bf24a64-e806-11e5-8238-ea129650fffe
    161107 19:16:57 [Note] WSREP: Flow-control interval: [23, 23]
    161107 19:16:57 [Note] WSREP: New cluster view: global state: 3bf24a64-e806-11e5-8238-ea129650fffe:4053449525, view# 73: Primary, number of nodes: 2, my index: 1, protocol version 2
    161107 19:16:57 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
    161107 19:16:57 [Note] WSREP: REPL Protocols: 5 (3, 1)
    161107 19:16:57 [Note] WSREP: Service thread queue flushed.
    161107 19:16:57 [Note] WSREP: Assign initial position for certification: 4053449525, protocol version: 3
    161107 19:16:57 [Note] WSREP: Service thread queue flushed.
    161107 19:16:57 [Warning] WSREP: Releasing seqno 4053449525 before 4053449526 was assigned.
    161107 19:17:02 [Note] WSREP: Member 0.0 (JOINERNODE) requested state transfer from '*any*'. Selected 1.0 (DONORNODE)(SYNCED) as donor.
    161107 19:17:02 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 4053450570)
    161107 19:17:02 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
    161107 19:17:02 [Note] WSREP: Running: 'wsrep_sst_xtrabackup --role 'donor' --address '170.71.77.88:4444/xtrabackup_sst' --auth 'replication:replication' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid '3bf24a64-e806-11e5-8238-ea129650fffe:4053450570''
    161107 19:17:02 [Note] WSREP: sst_donor_thread signaled with 0
    WSREP_SST: [INFO] Streaming with xbstream (20161107 19:17:02.257)
    WSREP_SST: [INFO] Using socat as streamer (20161107 19:17:02.260)
    WSREP_SST: [INFO] Streaming the backup to joiner at 170.71.77.88 4444 (20161107 19:17:02.273)
    WSREP_SST: [INFO] Evaluating innobackupex --defaults-file=/etc/my.cnf $INNOEXTRA --galera-info --stream=$sfmt ${TMPDIR} 2>${DATA}/innobackup.backup.log | socat -u stdio TCP:170.71.77.88:4444; RC=( ${PIPESTATUS[@]} ) (20161107 19:17:02.277)
    2016/11/07 19:17:38 socat E write(3, 0x689200, 8192): Broken pipe
    WSREP_SST: [ERROR] innobackupex finished with error: 1.  Check /var/lib/mysql//innobackup.backup.log (20161107 19:17:38.181)
    WSREP_SST: [ERROR] Cleanup after exit with status:22 (20161107 19:17:38.185)
    161107 19:17:38 [ERROR] WSREP: Failed to read from: wsrep_sst_xtrabackup --role 'donor' --address '170.71.77.88:4444/xtrabackup_sst' --auth 'replication:replication' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid '3bf24a64-e806-11e5-8238-ea129650fffe:4053450570'
    161107 19:17:38 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup --role 'donor' --address '170.71.77.88:4444/xtrabackup_sst' --auth 'replication:replication' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid '3bf24a64-e806-11e5-8238-ea129650fffe:4053450570': 22 (Invalid argument)
    161107 19:17:38 [ERROR] WSREP: Command did not run: wsrep_sst_xtrabackup --role 'donor' --address '170.71.77.88:4444/xtrabackup_sst' --auth 'replication:replication' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid '3bf24a64-e806-11e5-8238-ea129650fffe:4053450570'
    161107 19:17:38 [Warning] WSREP: 1.0 (DONORNODE): State transfer to 0.0 (JOINERNODE) failed: -22 (Invalid argument)
    161107 19:17:38 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 4053457328)
    161107 19:17:39 [Note] WSREP: Member 1.0 (DONORNODE) synced with group.
    161107 19:17:39 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 4053457328)
    161107 19:17:39 [Note] WSREP: Node 4159cb50-6323-11e6-946f-f77a9bdd31e2 state prim
    161107 19:17:39 [Note] WSREP: Synchronized with group, ready for connections
    161107 19:17:39 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
    161107 19:17:39 [Note] WSREP: view(view_id(PRIM,4159cb50-6323-11e6-946f-f77a9bdd31e2,86) memb {
      4159cb50-6323-11e6-946f-f77a9bdd31e2,0
    } joined {
    } left {
    } partitioned {
      0a872720-a551-11e6-9bc2-eedf51200d4b,0
    })
    161107 19:17:39 [Note] WSREP: forgetting 0a872720-a551-11e6-9bc2-eedf51200d4b (tcp://170.71.77.88:4567)
    161107 19:17:39 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1
    161107 19:17:39 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 23bcece3-a551-11e6-aef5-b26eb62eab01
    161107 19:17:39 [Note] WSREP: STATE EXCHANGE: sent state msg: 23bcece3-a551-11e6-aef5-b26eb62eab01
    161107 19:17:39 [Note] WSREP: STATE EXCHANGE: got state msg: 23bcece3-a551-11e6-aef5-b26eb62eab01 from 0 (DONORNODE)
    161107 19:17:39 [Note] WSREP: Quorum results:
      version    = 3,
      component  = PRIMARY,
      conf_id    = 73,
      members    = 1/1 (joined/total),
      act_id     = 4053457328,
      last_appl. = 4053457319,
      protocols  = 0/5/2 (gcs/repl/appl),
      group UUID = 3bf24a64-e806-11e5-8238-ea129650fffe
    161107 19:17:39 [Note] WSREP: Flow-control interval: [16, 16]
    161107 19:17:39 [Note] WSREP: New cluster view: global state: 3bf24a64-e806-11e5-8238-ea129650fffe:4053457328, view# 74: Primary, number of nodes: 1, my index: 0, protocol version 2
    161107 19:17:39 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
    161107 19:17:39 [Note] WSREP: REPL Protocols: 5 (3, 1)
    161107 19:17:39 [Note] WSREP: Service thread queue flushed.
    161107 19:17:39 [Note] WSREP: Assign initial position for certification: 4053457328, protocol version: 3
    161107 19:17:39 [Note] WSREP: Service thread queue flushed.
    161107 19:17:39 [Warning] WSREP: Releasing seqno 4053457328 before 4053457329 was assigned.
    161107 19:17:44 [Note] WSREP:  cleaning up 0a872720-a551-11e6-9bc2-eedf51200d4b (tcp://170.71.77.88:4567)
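
The state strings in both logs have the form `uuid:seqno`. As a rough sketch of the comparison the joiner log reports (`State transfer required`, then `IST will be unavailable` because the local state UUID does not match the group state UUID), the snippet below parses two such strings and describes how they relate. `parse_state` and `compare_states` are hypothetical names; the sketch only mirrors the UUID check quoted in the log and does not model any other condition for IST, such as what the donor still holds in its gcache.

```python
from typing import NamedTuple


class State(NamedTuple):
    uuid: str
    seqno: int


def parse_state(text):
    """Split 'uuid:seqno' on the last colon; seqno may be -1."""
    uuid, _, seqno = text.strip().rpartition(":")
    return State(uuid, int(seqno))


def compare_states(local, group):
    """Describe how a local state relates to the group state."""
    if local.uuid != group.uuid:
        return "uuid mismatch: full state transfer (SST) needed"
    gap = group.seqno - local.seqno
    if gap == 0:
        return "in sync"
    return f"same uuid, {gap} seqnos behind"


# Values copied from the joiner log above.
group = parse_state("3bf24a64-e806-11e5-8238-ea129650fffe:4053449525")
saved = parse_state("00000000-0000-0000-0000-000000000000:-1")
recovered = parse_state("3bf24a64-e806-11e5-8238-ea129650fffe:4050608949")

print(compare_states(saved, group))
print(compare_states(recovered, group))
```

In the joiner log the position recovered by mysqld_safe carries the group UUID, while the saved state Galera reports finding is the undefined one; the two inputs show how differently those compare against the group state.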

innobackup.backup.log from the donor:

    161107 19:17:02 innobackupex: Starting the backup operation

    IMPORTANT: Please check that the backup run completes successfully.
               At the end of a successful backup run innobackupex
               prints "completed OK!".

    161107 19:17:02  version_check Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_group=xtrabackup;port=3306;mysql_socket=/var/lib/mysql/mysql.sock' as 'replication'  (using password: YES).
    161107 19:17:02  version_check Connected to MySQL server
    161107 19:17:02  version_check Executing a version check against the server...
    161107 19:17:03  version_check Done.
    161107 19:17:03 Connecting to MySQL server host: localhost, user: replication, password: set, port: 3306, socket: /var/lib/mysql/mysql.sock
    Using server version 5.5.38-MariaDB-wsrep-log
    innobackupex version 2.3.4 based on MySQL server 5.6.24 Linux (x86_64) (revision id: e80c779)
    xtrabackup: uses posix_fadvise().
    xtrabackup: cd to /var/lib/mysql
    xtrabackup: open files limit requested 0, set to 5005
    xtrabackup: using the following InnoDB configuration:
    xtrabackup:   innodb_data_home_dir = ./
    xtrabackup:   innodb_data_file_path = ibdata1:10M:autoextend
    xtrabackup:   innodb_log_group_home_dir = ./
    xtrabackup:   innodb_log_files_in_group = 2
    xtrabackup:   innodb_log_file_size = 2097152000
    xtrabackup: using O_DIRECT
    161107 19:17:35 >> log scanned up to (182177566706404)
    xtrabackup: Generating a list of tablespaces
    161107 19:17:35  Streaming ./ibdata1
    161107 19:17:36 >> log scanned up to (182177569423322)
    161107 19:17:37 >> log scanned up to (182177571865574)
    innobackupex: Error writing file 'UNOPENED' (Errcode: 32 - Broken pipe)
    xb_stream_write_data() failed.
    innobackupex: Error writing file 'UNOPENED' (Errcode: 32 - Broken pipe)
     xtrabackup: Error: xtrabackup_copy_datafile() failed.
     xtrabackup: Error: failed to copy datafile.

my.cnf settings:

    [client]
    port = 3306
    socket = /var/lib/mysql/mysql.sock

    [mysqld]
    back_log = 1000
    expire_logs_days = 1
    innodb_autoextend_increment = 64
    innodb_buffer_pool_instances = 64
    innodb_buffer_pool_size = 100G
    innodb_log_buffer_size = 128M
    innodb_thread_concurrency = 0
    thread_cache_size = 512
    server-id = 3
    port = 3306
    binlog_cache_size = 4M
    binlog-do-db = zabbix
    binlog_format = ROW
    binlog-row-event-max-size = 8192
    datadir = /var/lib/mysql
    innodb_concurrency_tickets = 5000
    innodb_flush_log_at_trx_commit = 0
    innodb_flush_method = O_DIRECT
    innodb_log_file_size = 2000M
    innodb_old_blocks_time = 1000
    innodb_stats_on_metadata = OFF
    ignore-db-dir = lost+found
    join_buffer_size = 1M
    log-bin = mysql-bin
    log-error = /var/log/mysql/mysqld.log
    max_allowed_packet = 64M
    max_connect_errors = 10000
    max_connections = 1000
    max_heap_table_size = 256M
    net_buffer_length = 8K
    pid-file = /var/run/mysqld/mysqld.pid
    query_cache_size = 64000000
    query_cache_type = 1
    read_buffer_size = 1M
    relay-log-recovery = 1
    relay-log-space-limit = 2G
    replicate-do-db = zabbix
    replicate-ignore-db = mysql, performance_schema, lost+found
    slave-skip-error = 1062
    socket = /var/lib/mysql/mysql.sock
    sort_buffer_size = 1M
    table_open_cache = 4096
    tmp_table_size = 1G
    wait_timeout = 28800
    key_buffer_size = 16M
    binlog-format = row
    innodb_flush_neighbor_pages = cont
    innodb_max_dirty_pages_pct = 30
    innodb_io_capacity = 6000
    log-slave-updates = true
    read_rnd_buffer_size = 16M
    relay-log-purge = 1
    thread_concurrency = 24
    tmpdir = /dev/shm

    innodb_file_per_table

    skip-slave-start

    [sst]
    streamfmt=xbstream

    [mysqldump]
    max_allowed_packet = 64M
    quick

    [mysql]
    no-auto-rehash

    [mysqlhotcopy]

    interactive-timeout

    [mysqld_safe]
    pid-file = /var/run/mysqld/mysqld.pid

    !includedir /etc/my.cnf.d/

    [mysqld]
    large-pages
    skip-external-locking

    [mysqld]
    large-pages
    skip-external-locking

    [mysqld]
    wsrep_provider = /usr/lib64/galera/libgalera_smm.so
    wsrep_provider_options = gcache.size=4G; gcache.page_size=1G
    wsrep_cluster_address = gcomm://cernzbxdb201.cernerasp.com
    wsrep_cluster_name = galera_cluster
    default_storage_engine = InnoDB
    innodb_autoinc_lock_mode = 2
    innodb_locks_unsafe_for_binlog = 1
    wsrep_sst_method = xtrabackup-v2
    wsrep_slave_threads = 64
    wsrep_sst_auth =replication:replication
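
Since the configuration above repeats some sections and options (for example `binlog_format = ROW` and `binlog-format = row`), a quick sketch for listing repeated options per section may help when reviewing it. It assumes the simplified rules coded below: `[section]` headers, `key = value` lines or bare flags, `#` and `;` comments, `!include` directives skipped, and `-` treated as equivalent to `_` in option names. `repeated_options` is a hypothetical helper, not part of any MySQL or MariaDB tooling.

```python
from collections import defaultdict


def repeated_options(cnf_text):
    """Map (section, option) -> list of values for options set more than once."""
    seen = defaultdict(list)
    section = None
    for raw in cnf_text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;!":
            continue  # blank line, comment, or !include/!includedir directive
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        name, sep, value = line.partition("=")
        key = name.strip().replace("-", "_")  # dashes and underscores are equivalent
        seen[(section, key)].append(value.strip() if sep else None)
    return {k: v for k, v in seen.items() if len(v) > 1}


# A reduced excerpt of the settings above.
sample = """\
[mysqld]
binlog_format = ROW
port = 3306
binlog-format = row

[mysqld]
large-pages

[mysqld]
large-pages
"""

for (section, option), values in repeated_options(sample).items():
    print(section, option, values)
```

The sketch only reports repetition; it does not decide which occurrence the server ends up using.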
                                

Som3guy (75 rep)

Nov 8, 2016, 01:42 AM • Last activity: Mar 9, 2018, 10:11 AM

1 votes

1 answers

8205 views

Should I increase my key_buffer_size?

mysql replication mariadb galera wsrep


I have a 4-node cluster and the database uses InnoDB tables. My key_buffer_size is 128M. Should I increase it on my system? My innodb_buffer_pool_size is 75G and innodb_log_buffer_size is 256M.

    Mem:         96688      92580       4107          0        116       8501
    -/+ buffers/cache:      83962      12725
    Swap:        10239       5104       5135

    MariaDB [mydata]> SHOW STATUS LIKE "key%";
    +------------------------+--------+
    | Variable_name          | Value  |
    +------------------------+--------+
    | Key_blocks_not_flushed | 0      |
    | Key_blocks_unused      | 107171 |
    | Key_blocks_used        | 4      |
    | Key_blocks_warm        | 0      |
    | Key_read_requests      | 25     |
    | Key_reads              | 4      |
    | Key_write_requests     | 14     |
    | Key_writes             | 11     |
    +------------------------+--------+
    8 rows in set (0.00 sec)
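
To read the `SHOW STATUS LIKE "key%"` output above, one can compute roughly how much of the key cache has been used and the read miss ratio. The sketch below does this with the counters shown; `key_cache_summary` is a hypothetical helper, and treating `Key_blocks_used + Key_blocks_unused` as the total block count is an assumption made for illustration (`Key_blocks_used` is a high-water mark, so the fraction is an upper estimate).

```python
def key_cache_summary(status):
    """Summarise key cache counters from SHOW STATUS LIKE 'key%'."""
    used = status["Key_blocks_used"]
    unused = status["Key_blocks_unused"]
    requests = status["Key_read_requests"]
    reads = status["Key_reads"]
    total = used + unused
    return {
        # Share of key cache blocks that have ever been in use.
        "used_fraction": used / total if total else 0.0,
        # Share of index read requests that had to go to disk.
        "read_miss_ratio": reads / requests if requests else 0.0,
    }


# Counters copied from the status output above.
status = {
    "Key_blocks_unused": 107171,
    "Key_blocks_used": 4,
    "Key_read_requests": 25,
    "Key_reads": 4,
}

summary = key_cache_summary(status)
print(f"used fraction:   {summary['used_fraction']:.6f}")
print(f"read miss ratio: {summary['read_miss_ratio']:.2f}")
```

With only 25 read requests recorded, the miss ratio says little on its own; the used fraction is probably the more telling number here.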

One of my nodes shut down with the error below, which is why I am asking. I am running 5.5.46-MariaDB-1~trusty wsrep_25.12.r4f8102. Thanks.

    RECORD LOCKS space id 513 page no 16 n bits 296 index GEN_CLUST_INDEX of table mydata.user_counter trx id 65CFC8177 lock_mode X locks rec but not gap
    161230 19:08:36 [ERROR] mysqld got signal 6 ;
    This could be because you hit a bug. It is also possible that this binary
    or one of the libraries it was linked against is corrupt, improperly built,
    or misconfigured. This error can also be caused by malfunctioning hardware.
    
    To report this bug, see http://kb.askmonty.org/en/reporting-bugs 
    
    We will try our best to scrape up some info that will hopefully help
    diagnose the problem, but since we have already crashed, 
    something is definitely wrong and this may fail.
    
    Server version: 5.5.46-MariaDB-1~trusty-wsrep-log
    key_buffer_size=134217728
    read_buffer_size=2097152
    max_used_connections=809
    max_threads=2002
    thread_count=224
    It is possible that mysqld could use up to 
    key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 12467846 K  bytes of memory
    Hope that's ok; if not, decrease some variables in the equation.
    
    Thread pointer: 0x0x7feb728c6000
    Attempting backtrace. You can use the following information to find out
    where mysqld died. If you see no messages after this, something went
    terribly wrong...
    stack_bottom = 0x7ff378886a00 thread_stack 0x48000
    (my_addr_resolve failure: fork)
    /usr/sbin/mysqld(my_print_stacktrace+0x2e) [0x7ff37c2db1ae]
    /usr/sbin/mysqld(handle_fatal_signal+0x457) [0x7ff37bebffc7]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340) [0x7ff37a90e340]
    /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x39) [0x7ff379f65cc9]
    /lib/x86_64-linux-gnu/libc.so.6(abort+0x148) [0x7ff379f690d8]
    /usr/sbin/mysqld(+0x2ded4e) [0x7ff37bcafd4e]
    /usr/sbin/mysqld(+0x835033) [0x7ff37c206033]
    /usr/sbin/mysqld(+0x83b811) [0x7ff37c20c811]
    /usr/sbin/mysqld(+0x83c35b) [0x7ff37c20d35b]
    /usr/sbin/mysqld(+0x75a2b2) [0x7ff37c12b2b2]
    /usr/sbin/mysqld(+0x75e7ad) [0x7ff37c12f7ad]
    /usr/sbin/mysqld(+0x729202) [0x7ff37c0fa202]
    /usr/sbin/mysqld(Rows_log_event::find_row(Relay_log_info const*)+0x665) [0x7ff37bfa0a45]
    /usr/sbin/mysqld(Update_rows_log_event::do_exec_row(Relay_log_info const*)+0x9c) [0x7ff37bfa0e8c]
    /usr/sbin/mysqld(Rows_log_event::do_apply_event(Relay_log_info const*)+0x25c) [0x7ff37bf944ac]
    /usr/sbin/mysqld(wsrep_apply_cb(void*, void const*, unsigned long, unsigned int, wsrep_trx_meta const*)+0x7ba) [0x7ff37be70afa]
    /usr/lib/galera/libgalera_smm.so(galera::TrxHandle::apply(void*, wsrep_cb_status (*)(void*, void const*, unsigned long, unsigned int, wsrep_trx_meta const*), wsrep_trx_meta const&) const+0xd8) [0x7ff377b188f8]
    /usr/lib/galera/libgalera_smm.so(+0x1df27d) [0x7ff377b4f27d]
    /usr/lib/galera/libgalera_smm.so(galera::ReplicatorSMM::apply_trx(void*, galera::TrxHandle*)+0xd2) [0x7ff377b51b32]
    /usr/lib/galera/libgalera_smm.so(galera::ReplicatorSMM::process_trx(void*, galera::TrxHandle*)+0x10e) [0x7ff377b5498e]
    /usr/lib/galera/libgalera_smm.so(galera::GcsActionSource::dispatch(void*, gcs_action const&, bool&)+0x1b8) [0x7ff377b33668]
    /usr/lib/galera/libgalera_smm.so(galera::GcsActionSource::process(void*, bool&)+0x58) [0x7ff377b33ef8]
    /usr/lib/galera/libgalera_smm.so(galera::ReplicatorSMM::async_recv(void*)+0x73) [0x7ff377b54ef3]
    /usr/lib/galera/libgalera_smm.so(galera_recv+0x18) [0x7ff377b634e8]
    /usr/sbin/mysqld(+0x4a0744) [0x7ff37be71744]
    /usr/sbin/mysqld(start_wsrep_THD+0x48e) [0x7ff37bcccc0e]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0x8182) [0x7ff37a906182]
    /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7ff37a02947d]
    
    Trying to get some variables.
    Some pointers may be invalid and cause the dump to abort.
    Query (0x0): is an invalid pointer
    Connection ID (thread ID): 10
    Status: NOT_KILLED
    
    Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=off
    
    The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html  contains
    information that should help you find out what is causing the crash.
    161230 19:08:40 mysqld_safe Number of processes running now: 0
    161230 19:08:40 mysqld_safe WSREP: not restarting wsrep node automatically
    161230 19:08:40 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended


                                

Melih (284 rep)

Jan 6, 2017, 08:03 AM • Last activity: Jan 6, 2017, 11:37 AM

1 votes

0 answers

440 views

Galera log has WSREP: failed to initialize io-cache

galera wsrep


I'm having trouble tracking down the cause of an error message that is appearing in my logs every 15 minutes or so.

The error first appeared whilst performance testing a 3-node Galera Cluster (running on MariaDB 10.1.13). Two of the read-only nodes fell too far behind the write master, causing them to request an SST sync and pulling the cluster offline whilst the write master fulfilled the sync.

Since that point though the master has regularly been posting the following message into the logs:

    2016-08-26 18:03:16 139975496911616 [ERROR] WSREP: failed to initialize io-cache
    2016-08-26 18:03:16 139975496911616 [Warning] WSREP: binlog trx cache not empty (0 bytes) @ connection close 6510
    2016-08-26 18:03:16 139975496911616 [Warning] WSREP: binlog stmt cache not empty (0 bytes) @ connection close 6510

Google is sparse on details, and the only good hit I've had is from the source code, which appears to suggest that it's unable to initialise a cache for reading or writing (hmmm, much like the error message suggests).

    if (reinit_io_cache(cache, READ_CACHE, 0, 0, 0))
    {
      WSREP_ERROR("failed to initialize io-cache");
      return ER_ERROR_ON_WRITE;
    }

This message continues despite a reboot and full SST sync from the other nodes, so I'm a little lost as to the cause. Does anyone know what this means and how to fix it? 
                                

Dan (111 rep)

Aug 26, 2016, 07:22 PM
