Postgres Replication - WAL sender throws "connection reset by peer"

1 vote
0 answers
1988 views
Good morning everyone,

I hope lockdown is going well for you.
For my part, I'm using this time to set up a Postgres 11 cluster on my servers, but I've run into some problems. Can you help me?

I keep getting this error in a loop on the Postgres master.

    PostgreSQL Database directory appears to contain a database; Skipping initialization
    
    2020-05-06 08:40:21.379 UTC  LOG:  listening on IPv4 address "0.0.0.0", port 5432
    2020-05-06 08:40:21.379 UTC  LOG:  listening on IPv6 address "::", port 5432
    2020-05-06 08:40:21.380 UTC  LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
    2020-05-06 08:40:21.390 UTC  LOG:  database system was shut down at 2020-05-06 08:40:19 UTC
    2020-05-06 08:40:21.390 UTC  LOG:  entering standby mode
    2020-05-06 08:40:21.391 UTC  LOG:  consistent recovery state reached at 0/165A760
    2020-05-06 08:40:21.391 UTC  LOG:  invalid record length at 0/165A760: wanted 24, got 0
    2020-05-06 08:40:21.391 UTC  LOG:  trigger file found: /var/lib/postgresql/data/promote_to_master.tmp
    2020-05-06 08:40:21.391 UTC  LOG:  redo is not required
    2020-05-06 08:40:21.392 UTC  LOG:  database system is ready to accept read only connections
    2020-05-06 08:40:21.394 UTC  LOG:  selected new timeline ID: 2
    2020-05-06 08:40:21.416 UTC  LOG:  archive recovery complete
    2020-05-06 08:40:21.421 UTC  LOG:  database system is ready to accept connections
    2020-05-06 08:40:29.133 UTC  ERROR:  cannot execute SQL commands in WAL sender for physical replication
    2020-05-06 08:40:29.134 UTC  LOG:  could not receive data from client: Connection reset by peer
    2020-05-06 08:40:34.275 UTC  ERROR:  cannot execute SQL commands in WAL sender for physical replication
    2020-05-06 08:40:34.275 UTC  LOG:  could not receive data from client: Connection reset by peer
    2020-05-06 08:40:53.105 UTC  ERROR:  cannot execute SQL commands in WAL sender for physical replication
    2020-05-06 08:40:53.106 UTC  LOG:  could not receive data from client: Connection reset by peer
    2020-05-06 08:41:24.186 UTC  ERROR:  cannot execute SQL commands in WAL sender for physical replication
    2020-05-06 08:41:24.186 UTC  LOG:  could not receive data from client: Connection reset by peer
    2020-05-06 08:42:19.287 UTC  ERROR:  cannot execute SQL commands in WAL sender for physical replication
    2020-05-06 08:42:19.287 UTC  LOG:  could not receive data from client: Connection reset by peer
    2020-05-06 08:43:49.475 UTC  ERROR:  cannot execute SQL commands in WAL sender for physical replication
    2020-05-06 08:43:49.475 UTC  LOG:  could not receive data from client: Connection reset by peer


The Postgres client displays this error during pg_basebackup:

    pg_basebackup: could not send replication command "SHOW data_directory_mode": FATAL:  Backend throw an error message
    DETAIL:  Exiting current session because of an error from backend
    HINT:  BACKEND Error: "cannot execute SQL commands in WAL sender for physical replication"
    server closed the connection unexpectedly
    	This probably means the server terminated abnormally
    	before or while processing the request.

My cluster is set up like this:

    postgres-0 -> 192.168.9.227
    postgres-1 -> 192.168.187.162

Virtual IP for the master:

    db -> 10.105.49.122

I use a pgpool instance on each postgres host that listens on port 5433 and is connected to a backend on port 5432.

I set up the configuration with these commands:

    su postgres -c initdb
    host_ip=$(getent ahostsv4 $HOSTNAME | cut -d ' ' -f1 | head -n 1)
    echo "host    replication     ${REPLICA_USER}     ${host_ip}/32  md5" >> /var/lib/postgresql/data/pg_hba.conf
    REPLICATE_FROM="$(getent ahostsv4 "${MASTER_HOSTNAME}" | cut -d ' ' -f1 | head -n 1)"
    echo "${postgres_conf}" | envsubst >> /var/lib/postgresql/data/postgresql.conf
    echo "${recovery_conf}" | envsubst > /var/lib/postgresql/data/recovery.conf
    
    su postgres -c postgres &
    echo "CREATE USER ${REPLICA_USER} WITH REPLICATION LOGIN ENCRYPTED PASSWORD '${REPLICA_PASSWORD}';" | psql -U postgres
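
For reference, the `getent ... | cut | head` pipeline above keeps only the first field of the first line. Here is a minimal sketch of that step, run on sample `getent ahostsv4` output rather than a live lookup. The `first_ipv4` helper name, the sample text and the `replica` user name are placeholders for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: same cut/head pipeline as in the setup script,
# but reading getent-style output from stdin so it can be tried offline.
first_ipv4() {
    cut -d ' ' -f1 | head -n 1
}

# Sample text shaped like `getent ahostsv4 postgres-0` output (illustrative).
sample_output='192.168.9.227   STREAM postgres-0
192.168.9.227   DGRAM
192.168.9.227   RAW'

host_ip=$(printf '%s\n' "$sample_output" | first_ipv4)

# The pg_hba.conf line the script would append;
# "replica" stands in for ${REPLICA_USER}.
hba_line="host    replication     replica     ${host_ip}/32  md5"
echo "$hba_line"
```

Note that `host_ip` is resolved from `$HOSTNAME`, so as written the appended line authorises replication connections coming from the address of the host that runs the script.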

The variable $postgres_conf contains:

    listen_addresses = '*'
    wal_level = hot_standby
    max_wal_senders = 8
    hot_standby = on
    archive_mode = on
    archive_command = 'cp /var/lib/postgresql/data/%p /var/lib/postgresql/data/archive/%f'

The variable $recovery_conf contains:

    standby_mode = on
    recovery_target_timeline = 'latest'
    primary_conninfo = 'host=${REPLICATE_FROM} port=5432 user=${REPLICA_USER} password=${REPLICA_PASSWORD}'
    trigger_file = '/var/lib/postgresql/data/promote_to_master.tmp'
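
To make the template concrete, here is a sketch of what the rendered recovery.conf would look like. It uses an unquoted heredoc instead of `envsubst`, which should expand these simple `${VAR}` references the same way. The values and the `render_recovery_conf` name are placeholders, not the real ones from my servers.

```shell
#!/bin/sh
# Placeholder values for illustration only.
REPLICATE_FROM="192.168.9.227"
REPLICA_USER="replica"
REPLICA_PASSWORD="secret"

# Hypothetical helper: an unquoted heredoc expands ${VAR} references,
# so this prints the recovery.conf contents after substitution.
render_recovery_conf() {
    cat <<EOF
standby_mode = on
recovery_target_timeline = 'latest'
primary_conninfo = 'host=${REPLICATE_FROM} port=5432 user=${REPLICA_USER} password=${REPLICA_PASSWORD}'
trigger_file = '/var/lib/postgresql/data/promote_to_master.tmp'
EOF
}

rendered=$(render_recovery_conf)
printf '%s\n' "$rendered"
```

One difference from the real script: `envsubst` only substitutes variables that are present in its environment, so there the variables have to be exported, whereas the heredoc also sees plain shell variables.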

On the master I run:

    touch /var/lib/postgresql/data/promote_to_master.tmp

And on the slave I run:

    PGPASSWORD=${REPLICA_PASSWORD} su postgres -c "pg_basebackup -h ${REPLICATE_FROM} -D /var/lib/postgresql/data -U ${REPLICA_USER} -v -P"
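
Since pgpool listens on 5433 and Postgres on 5432 on each host, it may be worth spelling out which port the pg_basebackup call above ends up using. As far as I know, libpq clients take `-p` first, then the `PGPORT` environment variable, then the default 5432. The sketch below mirrors that lookup order for a dry run; the `effective_port` helper and the values are hypothetical, and nothing is actually connected.

```shell
#!/bin/sh
# Hypothetical helper mirroring the port lookup order assumed above:
# explicit port (what -p would carry), else $PGPORT, else 5432.
effective_port() {
    if [ -n "${1:-}" ]; then
        echo "$1"
    else
        echo "${PGPORT:-5432}"
    fi
}

# Placeholder values for illustration only.
REPLICATE_FROM="192.168.9.227"
REPLICA_USER="replica"

# No explicit port given, as in my original command.
port=$(effective_port "")

# Dry run: print the command with the port made explicit instead of running it.
echo "pg_basebackup -h ${REPLICATE_FROM} -p ${port} -D /var/lib/postgresql/data -U ${REPLICA_USER} -v -P"
```

Passing `-p` explicitly in the real command would remove any doubt about whether the backup talks to Postgres directly or goes through pgpool.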

Does anyone have a solution?


                        
Asked by Shiishii (11 rep)
May 6, 2020, 09:20 AM
Last activity: Sep 24, 2021, 12:05 PM