Database Administrators
Q&A for database professionals who wish to improve their database skills
Latest Questions
0
votes
0
answers
50
views
Postgres pg_wal size increasing possibly from patroni outage
I've recently hit an issue of pg_wal files not being cleaned up on a replica postgres instance despite these being archived on the master instance. Patroni was down during that time due to an etcd outage; after correcting this and Patroni re-establishing connections, the WAL files began to be cleaned up on the read-only instance. I saw no replication lag during this time. The second replica was not affected in the same way. All 3 servers had etcd down.
System
* Postgres 16
* Patroni 4.0.4
* 3 nodes in the cluster: 1 leader, 2 read-only replicas
* cluster is managed by patroni
* physical replication is used
Question
From my understanding of Patroni, an outage of Patroni shouldn't prevent the WAL files on the replica instance from being removed. Am I missing something here in terms of the active responsibilities of Patroni?
I ensured that the read-only replica was still in standby mode and operating as such during that time, so it wasn't that it was promoted during the outage.
During that time Patroni was repeatedly failing with:
Jul 23 05:02:24 patroni: 2025-07-23 05:02:24,141 ERROR: watchprefix failed: ProtocolError("Connection broken: InvalidChunkLength(got length b'', 0 bytes read)", InvalidChunkLength(got length b'', 0 bytes rea
Jul 23 05:02:25 patroni: Traceback (most recent call last):
...
Jul 23 05:02:25 patroni: File "/usr/lib/python3/dist-packages/patroni/dcs/etcd.py", line 262, in _do_http_request Jul 23 05:02:25 patroni: raise etcd.EtcdConnectionFailed('No more machines in the cluster')
Jul 23 05:02:25 patroni: etcd.EtcdConnectionFailed: No more machines in the cluster
Jul 23 05:02:25 systemd: patroni.service: Main process exited, code=exited, status=1/FAILURE
Jul 23 05:02:25 systemd: patroni.service: Failed with result 'exit-code'.
Jul 23 05:02:25 systemd: patroni.service: Unit process 407680 (postgres) remains running after unit stopped.
Jul 23 05:02:25 systemd: patroni.service: Unit process 407683 (postgres) remains running after unit stopped.
...
From this, Patroni is crashing whilst the postgres process remains up (the PID of the postgres server process is 407680).
During this time I had also looked at pg_stat_activity and hadn't seen any long running queries that I'd expect to hold the WAL back.
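For anyone hitting the same thing, a minimal sketch of checks on the replica that should reveal what is pinning WAL there (assuming the standard PostgreSQL 16 catalogs):
-- Slots defined on this standby (e.g. for cascading replicas) that could pin WAL
SELECT slot_name, slot_type, active, restart_lsn,
       pg_size_pretty(pg_wal_lsn_diff(pg_last_wal_replay_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;
-- Settings that influence how much WAL a standby keeps around
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('wal_keep_size', 'max_slot_wal_keep_size', 'max_wal_size', 'checkpoint_timeout');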
In terms of postgres logs, the only thing of note was from when pg_wal started to increase; this is somewhat confirmed in the logs:
2025-07-22 06:46:02.235 UTC LOG: restartpoint complete: wrote 517224 buffers (6.2%); 0 WAL file(s) added, 246 removed, 23 recycled; write=66.028 s, sync=0.499 s, total=66.721 s; sync files=686, longest=0.109 s, average=0.001 s; distance=4407526 kB, estimate=4407526 kB; lsn=21204/6CAA5F0, redo lsn=21203/13048F80
2025-07-22 06:47:27.708 UTC LOG: restartpoint complete: wrote 523662 buffers (6.2%); 0 WAL file(s) added, 1 removed, 22 recycled; write=77.386 s, sync=0.263 s, total=77.676 s; sync files=676, longest=0.012 s, average=0.001 s; distance=4407428 kB, estimate=4407516 kB; lsn=21205/159E6E88, redo lsn=21204/2006A040
2025-07-22 06:49:06.802 UTC LOG: restartpoint complete: wrote 562491 buffers (6.7%); 1 WAL file(s) added, 0 removed, 0 recycled; write=89.970 s, sync=0.256 s, total=90.321 s; sync files=598, longest=0.019 s, average=0.001 s; distance=4407504 kB, estimate=4407515 kB; lsn=21206/2207D1D0, redo lsn=21205/2D09E288
where the number of WAL file(s) removed drops to 0,
and then on recovery postgres receives a SIGHUP, which causes it to reload its configuration:
2025-07-23 08:16:53.955 UTC LOG: received SIGHUP, reloading configuration files
Iamterribleatcoding
(1 rep)
Jul 23, 2025, 11:37 AM
• Last activity: Jul 23, 2025, 09:49 PM
0
votes
1
answers
224
views
On which machine should we install WAL-G?
We are using PostgreSQL version 12.1. Currently we have high availability and automatic failover using Patroni and it's working fine.
Now we are planning for disaster recovery and we are planning to use WAL-G for this.
We read some articles to find out on which machine we should install WAL-G (master, slave, or both) but couldn't get a clear idea.
Can you please help us understand on which machine we should set up WAL-G?
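For context, a rough sketch of how WAL-G is commonly wired into the Patroni-managed PostgreSQL parameters; the exact values (and any storage settings such as WALG_S3_PREFIX in the environment) are assumptions, not taken from the question:
postgresql:
  parameters:
    archive_mode: 'on'
    # fires only on the node that is currently the primary; replicas do not archive
    archive_command: 'wal-g wal-push %p'
Restores go the other way (wal-g wal-fetch %f %p as the restore_command, wal-g backup-fetch for base backups), and since any node can become the primary after a failover, the binary and its storage credentials would need to be available on every node.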
Tekchand Dagar
(101 rep)
Mar 5, 2020, 10:49 AM
• Last activity: Jun 7, 2025, 02:00 PM
1
votes
1
answers
98
views
Is this Patroni topology safe?
My organization is planning to implement a high-availability PostgreSQL cluster using Patroni and etcd. However, we only have two data center sites available, which makes deploying a standard 3-node etcd cluster across separate failure domains challenging.
We understand that running only a 2-node etcd cluster increases the risk of split-brain or unavailability if one site becomes unreachable, due to the lack of a quorum.
To address this, we came up with the following topology:
DC (Primary Site):
- 192.168.30.80: PostgreSQL node running Patroni (initial master)
- 192.168.30.83: etcd node
DRC (Disaster Recovery Site):
- 192.168.30.81: PostgreSQL node running Patroni (replica)
- 192.168.30.82: backup etcd node
Each site runs a single-node etcd cluster. We have tested that failover still works in this setup: we use the etcd mirror maker feature to continuously relay key creates and updates to the separate cluster in the DRC, and we use keepalived to manage a floating IP between the etcd clusters, which is used by Patroni on both nodes to access etcd.
My questions are:
- What risks are involved in running this kind of setup?
- Would it be better to add a lightweight third etcd node in a separate site (e.g., the cloud) to form a proper quorum?

Ilham Syamsuddin
(35 rep)
May 21, 2025, 09:53 AM
• Last activity: May 26, 2025, 01:33 PM
0
votes
0
answers
123
views
etcd is out of space, getting NOSPACE alarm
We have a two-node patroni postgres cluster in the preprod environment. Today I tried to do patronictl edit-config and saw that etcd was out of space. I looked at the /var/lib/etcd/member directory and it was around 2 GB. Because of this I can't edit the postgresql.conf file, nor manage the patroni cluster.
I tried the etcdctl compact and defrag commands but they did nothing; the etcd database size remained the same. I'd like to decrease the size of etcd with the correct methods, so I'd really appreciate your help.
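For reference, the sequence usually documented for clearing a NOSPACE alarm is compaction to a current revision, then defragmentation, then explicitly disarming the alarm; a sketch, assuming the etcd v3 API and default endpoints:
# current revision of the keyspace
rev=$(ETCDCTL_API=3 etcdctl endpoint status --write-out=json | grep -oE '"revision":[0-9]+' | grep -oE '[0-9]+')
# compact away old revisions, then defragment to actually give the space back
ETCDCTL_API=3 etcdctl compact "$rev"
ETCDCTL_API=3 etcdctl defrag
# the NOSPACE alarm stays raised (and writes stay blocked) until it is disarmed
ETCDCTL_API=3 etcdctl alarm disarm
ETCDCTL_API=3 etcdctl alarm list
Compacting without a recent revision, or skipping the alarm disarm step, can make it look as if the commands did nothing.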
Aras Etiya
(1 rep)
Dec 26, 2024, 12:59 PM
1
votes
0
answers
78
views
How to prevent PostgreSQL from logging Patroni health check connections?
I have PostgreSQL 15.2 from the Percona repo, along with Patroni 3.0.1 as the management tool, HAProxy 2.5.11 for proxying, PGBouncer 1.18.0 as a server-side pool, and keepalived 2.1 for the virtual IP.
I have 3 nodes, I will use these IPs for the sake of this post:
10.0.1.1 is the primary, node1
10.0.1.2 is node2
10.0.1.3 is node3
HAProxy is set up to do a health check every 3 seconds:
listen primary
bind *:9999
option httpchk /primary
http-check expect status 200
default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
server node1 10.1.0.1:6432 check port 8008
server node2 10.1.0.2:6432 check port 8008
server node3 10.1.0.3:6432 check port 8008
6432 is the PGBouncer port which will actually receive the SQL traffic, but health check is done over http on port 8008 with Patroni.
This causes Patroni each time to make a TCP ping to each PostgreSQL node, resulting in the following very ugly PostgreSQL log entries every 3 seconds on every node:
2024-11-26 20:45:04.200 CET - Process=, Application=[[unknown]], User=[[unknown]]@[10.0.1.1(36024)]/[[unknown]], SessionID=[67462540.4d387], Session start=[2024-11-26 20:45:04 CET], Command tag=[]: LOG: could not receive data from client: Connection reset by peer
2024-11-26 20:45:05.152 CET - Process=, Application=[[unknown]], User=[[unknown]]@[10.0.1.2(48138)]/[[unknown]], SessionID=[67462541.4d391], Session start=[2024-11-26 20:45:05 CET], Command tag=[]: LOG: could not receive data from client: Connection reset by peer
2024-11-26 20:45:06.202 CET - Process=, Application=[[unknown]], User=[[unknown]]@[10.0.1.3(51212)]/[[unknown]], SessionID=[67462542.4d39d], Session start=[2024-11-26 20:45:06 CET], Command tag=[]: LOG: could not receive data from client: Connection reset by peer
On a network level it is a simple 3-way TCP handshake, closed immediately.
I didn't find any way to hide this in the PostgreSQL log. It is an unnecessary and misleading log message. Any idea?
Gábor Major
(163 rep)
Nov 26, 2024, 09:01 PM
• Last activity: Nov 27, 2024, 05:29 PM
1
votes
2
answers
368
views
Postgres hot standby stuck starting, waiting for contrecord, despite WAL file being present
We are experimenting with [patroni Multi Datacenter HA](https://patroni.readthedocs.io/en/latest/ha_multi_dc.html) with postgres16 using replication slots. At a high level, to build the remote cluster from scratch we run a base backup while the active cluster is receiving writes.
We use a replication slot for the hot standby via patroni. Here is an example of the active cluster config:
postgresql:
parameters:
archive_command: /bin/true
archive_mode: 'on'
checkpoint_completion_target: '0.9'
checkpoint_timeout: 300
default_statistics_target: 100
effective_cache_size: 12GB
effective_io_concurrency: 2
fsync: 'on'
full_page_writes: 'on'
hot_standby: true
hot_standby_feedback: true
idle_in_transaction_session_timeout: 10min
log_min_duration_statement: 10000
log_min_error_statement: ERROR
log_min_messages: WARNING
log_temp_files: 4000
maintenance_work_mem: 3GB
max_connections: 512
max_parallel_maintenance_workers: 4
max_parallel_workers: 10
max_parallel_workers_per_gather: 2
max_prepared_transactions: 1024
max_replication_slots: 10
max_slot_wal_keep_size: 20GB
max_wal_senders: 10
max_wal_size: 2GB
max_worker_processes: 10
min_wal_size: 512MB
random_page_cost: 2
shared_buffers: 8GB
synchronous_commit: remote_write
track_functions: all
track_io_timing: 'on'
wal_buffers: 16MB
wal_compression: 'on'
wal_keep_segments: 128
wal_level: replica
work_mem: 16MB
use_pg_rewind: true
use_slots: true
retry_timeout: 14
What we see in experiments is that once PG is started post-basebackup via patroni, it moves to the patroni leader/running state; we then apply the standby cluster config:
curl -X PATCH http://localhost:8008/config \
-H "Content-Type: application/json" \
-d '{
"standby_cluster": {
"host": "standby.example.com",
"port": 5432,
"primary_slot_name": "standby_slot"
}
}' \
--data-urlencode "force=true"
This results in a pg_rewind, followed by the cluster moving to the Standby leader / streaming state most of the time. But at other times it moves to the Standby leader / starting state at the end of the rewind. What we see in the pg logs is something similar to this:
2024-08-19 20:02:00 UTC:::@::LOG: contrecord is requested by 93/4C000028
2024-08-19 20:02:00 UTC:::@::LOG: waiting for WAL to become available at 93/4C000040
Further analysis of the LSN in the replication slot indicates this record was generated after the base backup time and was archived by the leader just after the base backup. In the standby cluster, the WAL file associated with the LSN is present in the pg_wal dir, but the database is stuck starting up; it never goes to the streaming state, and the current state of the replication slot is not active. Why is the WAL file not being sent to the hot standby, despite being present on the active cluster, so that it can move to the streaming state?
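For anyone diagnosing the same state, a minimal sketch of the views worth checking (the slot name is the one from the PATCH above; catalogs as in PostgreSQL 16):
-- On the active cluster's primary: is the physical slot active, and where is its restart_lsn?
SELECT slot_name, slot_type, active, restart_lsn
FROM pg_replication_slots
WHERE slot_name = 'standby_slot';
-- On the standby leader: whether a WAL receiver is running at all, and what it is connected to
SELECT status, receive_start_lsn, flushed_lsn, slot_name, conninfo
FROM pg_stat_wal_receiver;
If pg_stat_wal_receiver is empty, the standby has not started a WAL receiver yet and is still trying to obtain that record from the archive or pg_wal.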
Ganesh
(121 rep)
Aug 19, 2024, 09:48 PM
• Last activity: Sep 7, 2024, 04:12 AM
0
votes
1
answers
4487
views
Patroni with etcd starting up
I have an existing PostgreSQL cluster v11.5 with streaming replication.
I want to set up high availability managed by Patroni.
I installed Patroni on both machines, and I set up an etcd cluster on three other virtual machines ['http://10.A.A.191:2379', 'http://10.B.B.99:2379', 'http://10.C.C.170:2379'], but Patroni fails to start with systemctl.
When I start Patroni with the patroni -c /etc/patroni.yml command, I have the errors below:
2019-12-26 16:51:38,207 DEBUG: New etcd client created for http://127.0.0.1:4001
2019-12-26 16:51:38,207 DEBUG: Selection next machine in cache. Available machines: ['http://10.A.A.191:2379 ', 'http://10.B.B.99:2379 ', 'http://10.C.C.170:2379 ']
2019-12-26 16:51:38,207 INFO: Selected new etcd server http://10.C.C.170:2379
2019-12-26 16:51:38,208 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
2019-12-26 16:51:38,208 DEBUG: Starting new HTTP connection (1): 10.C.C.170:2379
2019-12-26 16:51:38,210 DEBUG: http://10.C.C.170:2379 "GET /v2/machines HTTP/1.1" 404 19
2019-12-26 16:51:38,211 ERROR: Failed to get list of machines from http://10.C.C.170:2379/v2 : EtcdException('Bad response : 404 page not found\n',)
2019-12-26 16:51:38,211 INFO: Retrying on http://10.A.A.191:2379
2019-12-26 16:51:38,211 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
2019-12-26 16:51:38,211 DEBUG: Starting new HTTP connection (1): 10.237.132.191:2379
2019-12-26 16:51:38,214 DEBUG: http://10.A.A.191:2379 "GET /v2/machines HTTP/1.1" 404 19
2019-12-26 16:51:38,214 ERROR: Failed to get list of machines from http://10.A.A.191:2379/v2 : EtcdException('Bad response : 404 page not found\n',)
2019-12-26 16:51:38,214 INFO: Retrying on http://10.B.B.99:2379
2019-12-26 16:51:38,215 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
2019-12-26 16:51:38,215 DEBUG: Starting new HTTP connection (1): 10.B.B.99:2379
2019-12-26 16:51:38,217 DEBUG: http://10.B.B.99:2379 "GET /v2/machines HTTP/1.1" 404 19
2019-12-26 16:51:38,217 ERROR: Failed to get list of machines from http://10.B.B.99:2379/v2 : EtcdException('Bad response : 404 page not found\n',)
2019-12-26 16:51:38,217 INFO: waiting on etcd
The problem seems to be at the API level: DEBUG: http://10.B.B.99:2379 "GET /v2/machines HTTP/1.1" 404 19
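A quick way to confirm where the 404 comes from is to hit the endpoints directly (a sketch, using one of the hosts from above):
# reports the server version regardless of which key API is enabled
curl -s http://10.B.B.99:2379/version
# the URL patroni's etcd (v2) client is requesting; a 404 here means this
# endpoint is not serving the v2 keys API
curl -i http://10.B.B.99:2379/v2/machines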
etcd is reachable from the PostgreSQL host:
curl -s http://10.B.B.99:2380/members |jq -r
returns
[
{
"id": 730XXXXXXXXXXX000,
"peerURLs": [
"http://10.B.B.99:2380 "
],
"name": "etcd1",
"clientURLs": [
"http://0.0.0.0:2379 "
]
},
{
"id": 956XXXXXXXXXXX000,
"peerURLs": [
"http://10.C.C.170:2380 "
],
"name": "etcd3",
"clientURLs": [
"http://0.0.0.0:2379 "
]
},
{
"id": 1551XXXXXXXXXXX000,
"peerURLs": [
"http://10.A.A.191:2380 "
],
"name": "etcd2",
"clientURLs": [
"http://0.0.0.0:2379 "
]
}
]
***etcd1 start command:***
ExecStart=/usr/local/bin/etcd \
--name etcd1 \
--advertise-client-urls http://0.0.0.0:2379 \
--listen-client-urls http://0.0.0.0:2379 \
--initial-advertise-peer-urls http://10.B.B.99:2380 \
--listen-peer-urls http://0.0.0.0:2380 \
--initial-cluster etcd1=http://10.B.B.99:2380,etcd2=http://10.A.A.191:2380,etcd3=http://10.C.C.170:2380 \
--initial-cluster-token etcd-cluster-atom \
--initial-cluster-state new \
--logger zap \
--log-outputs stdout \
--log-level debug
Could someone help me please?
What is wrong?
Many thanks
**Versions:**
etcd Version: _3.4.0_
etcdctl version: _3.4.0_
API version: _3.4_
patroni _1.6.3_
python-etcd==_0.4.5_
Mika
(361 rep)
Dec 26, 2019, 04:29 PM
• Last activity: May 18, 2024, 04:52 PM
1
votes
1
answers
4224
views
Is it good practice to use both PGBouncer and PGPool together?
We have set up streaming replication on Postgres 13 and it all works fine. The current setup is as follows.
We have one primary Postgres and one secondary Postgres connected via streaming replication. The replication and cluster failover management is handled by Patroni. We have PgBouncer for connection pooling and it is currently connected to the PG master. This works fine without issues.
We wanted to use the secondary Postgres for reads to reduce the read load on the primary, so we wanted to introduce Pgpool into the ecosystem to get the benefit of read/write splitting. A couple of articles suggested that we could use both PgBouncer and Pgpool together to get better results from both tools (https://scalegrid.io/blog/postgresql-connection-pooling-part-4-pgbouncer-vs-pgpool/). So now our setup adds Pgpool-II between PgBouncer and PG, with connection pooling disabled.
This setup also works fine. However, I am not able to understand the connection pooling: I can see many connections in Pgpool when I run SHOW POOL_POOLS;
- Who is holding on to the connections for connection pooling: PgBouncer or Pgpool?
- Why do I see connections when I issue SHOW POOL_POOLS in Pgpool when connection pooling is disabled?
- Is this setup right and scalable? What configuration do I need to change to get the best out of both tools?


Pramod
(111 rep)
Oct 26, 2021, 10:19 AM
• Last activity: Apr 30, 2024, 09:16 AM
0
votes
0
answers
960
views
Patroni failed to get list of machines from etcd
I am running a patroni cluster (3.4) on Linux with an etcd cluster. Normally the cluster runs perfectly fine, but sometimes I get errors saying the request to the etcd server failed (ReadTimeoutError, NewConnectionError, ConnectTimeoutError).
ETCD:
10.100.10.4
10.100.11.3
10.100.11.5
Patroni/PostgreSQL Nodes
10.100.10.10
10.100.11.6
2024-04-21 04:45:42,868 DEBUG: Writing {"conn_url":"postgres://10.100.10.10:5432/postgres","api_url":"http://10.100.10.10:8008/patroni ","state":"running","role":"master","version":"3.3.0","xlog_location":1642347112,"timeline":5} to key /db/mycluster/members/postgresql0 ttl=30 dir=False append=False
2024-04-21 04:45:42,869 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
2024-04-21 04:45:42,871 DEBUG: http://10.100.11.3:2379 "PUT /v2/keys/db/mycluster/members/postgresql0 HTTP/1.1" 200 790
2024-04-21 04:45:42,871 INFO: no action. I am (postgresql0), the leader with the lock
2024-04-21 04:45:46,136 DEBUG: Issuing read for key /db/mycluster/ with args {'recursive': True, 'quorum': False, 'retry': }
2024-04-21 04:45:46,136 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
2024-04-21 04:45:46,138 DEBUG: http://10.100.11.3:2379 "GET /v2/keys/db/mycluster/?recursive=true&quorum=false HTTP/1.1" 200 None
2024-04-21 04:45:46,139 DEBUG: API thread: 10.100.11.6 - - "GET /cluster HTTP/1.1" 200 - latency: 3.354 ms
2024-04-21 04:45:51,981 DEBUG: Issuing read for key /db/mycluster/ with args {'recursive': True, 'quorum': False, 'retry': }
2024-04-21 04:45:51,983 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
2024-04-21 04:45:51,987 DEBUG: http://10.100.11.3:2379 "GET /v2/keys/db/mycluster/?recursive=true&quorum=false HTTP/1.1" 200 None
2024-04-21 04:45:51,989 DEBUG: API thread: 10.100.11.6 - - "GET /cluster HTTP/1.1" 200 - latency: 16.522 ms
2024-04-21 04:45:52,859 DEBUG: Issuing read for key /db/mycluster/ with args {'recursive': True, 'quorum': False, 'retry': }
2024-04-21 04:45:52,861 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
2024-04-21 04:45:56,198 ERROR: Request to server http://10.100.11.3:2379 failed: MaxRetryError('HTTPConnectionPool(host=\'10.100.11.3\', port=2379): Max retries exceeded with url: /v2/keys/db/mycluster/?recursive=true&quorum=false (Caused by ReadTimeoutError("HTTPConnectionPool(host=\'10.100.11.3\', port=2379): Read timed out. (read timeout=3.332937417338447)"))')
2024-04-21 04:45:56,198 INFO: Reconnection allowed, looking for another server.
2024-04-21 04:45:56,198 INFO: Retrying on http://10.100.10.4:2379
2024-04-21 04:45:56,199 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
2024-04-21 04:45:56,199 DEBUG: Starting new HTTP connection (1): 10.100.10.4:2379
2024-04-21 04:45:56,200 ERROR: Request to server http://10.100.10.4:2379 failed: MaxRetryError("HTTPConnectionPool(host='10.100.10.4', port=2379): Max retries exceeded with url: /v2/keys/db/mycluster/?recursive=true&quorum=false (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 111] Connection refused'))")
2024-04-21 04:45:56,200 INFO: Reconnection allowed, looking for another server.
2024-04-21 04:45:56,200 INFO: Retrying on http://10.100.11.5:2379
2024-04-21 04:45:56,200 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
2024-04-21 04:45:56,200 DEBUG: Starting new HTTP connection (1): 10.100.11.5:2379
2024-04-21 04:45:57,870 ERROR: Request to server http://10.100.11.5:2379 failed: MaxRetryError("HTTPConnectionPool(host='10.100.11.5', port=2379): Max retries exceeded with url: /v2/keys/db/mycluster/?recursive=true&quorum=false (Caused by ConnectTimeoutError(, 'Connection to 10.100.11.5 timed out. (connect timeout=1.6666666666666667)'))")
2024-04-21 04:45:57,870 INFO: Reconnection allowed, looking for another server.
2024-04-21 04:45:57,871 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
2024-04-21 04:45:57,871 DEBUG: Starting new HTTP connection (1): 10.100.11.5:2379
2024-04-21 04:45:59,540 ERROR: Failed to get list of machines from http://10.100.11.5:2379/v2 : MaxRetryError("HTTPConnectionPool(host='10.100.11.5', port=2379): Max retries exceeded with url: /v2/machines (Caused by ConnectTimeoutError(, 'Connection to 10.100.11.5 timed out. (connect timeout=1.6666666666666667)'))")
2024-04-21 04:45:59,541 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
2024-04-21 04:45:59,541 DEBUG: Starting new HTTP connection (1): 10.100.11.3:2379
2024-04-21 04:46:01,210 ERROR: Failed to get list of machines from http://10.100.11.3:2379/v2 : MaxRetryError("HTTPConnectionPool(host='10.100.11.3', port=2379): Max retries exceeded with url: /v2/machines (Caused by ConnectTimeoutError(, 'Connection to 10.100.11.3 timed out. (connect timeout=1.6666666666666667)'))")
2024-04-21 04:46:01,211 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
2024-04-21 04:46:01,211 DEBUG: Starting new HTTP connection (1): 10.100.10.4:2379
2024-04-21 04:46:01,212 ERROR: Failed to get list of machines from http://10.100.10.4:2379/v2 : MaxRetryError("HTTPConnectionPool(host='10.100.10.4', port=2379): Max retries exceeded with url: /v2/machines (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 111] Connection refused'))")
2024-04-21 04:46:01,212 DEBUG: Failed to update list of etcd nodes: EtcdException('Could not get the list of servers, maybe you provided the wrong host(s) to connect to?')
2024-04-21 04:46:01,484 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
2024-04-21 04:46:01,484 DEBUG: Starting new HTTP connection (1): 10.100.11.3:2379
2024-04-21 04:46:02,486 ERROR: Request to server http://10.100.11.3:2379 failed: MaxRetryError("HTTPConnectionPool(host='10.100.11.3', port=2379): Max retries exceeded with url: /v2/keys/db/mycluster/?recursive=true&quorum=false (Caused by ConnectTimeoutError(, 'Connection to 10.100.11.3 timed out. (connect timeout=1.0)'))")
2024-04-21 04:46:02,486 INFO: Reconnection allowed, looking for another server.
The firewall should not be a problem, but maybe timeouts?
This error only appears on one node (10.100.10.10).
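If timeouts are the suspicion, these are the Patroni settings in the dynamic (DCS) configuration that govern how long etcd requests may be retried before a node gives up; a sketch of where they live (the values are only illustrative, not a recommendation):
# patronictl edit-config (or bootstrap.dcs in patroni.yml)
ttl: 60
loop_wait: 10
retry_timeout: 20
Patroni derives its per-request etcd connect/read timeouts from retry_timeout (which matches the fractional timeouts visible in the log above), so that is the usual knob to try first.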
If you need more information, please let me know!
Thank you!
mymarcelsql
(21 rep)
Apr 21, 2024, 05:23 AM
• Last activity: Apr 22, 2024, 09:20 AM
0
votes
0
answers
211
views
Pgbackrest connection to Patroni cluster with HAProxy (VIP)
I have Patroni cluster with:
- 1 VIP
- 3 nodes HAProxy
- 3 nodes PostgreSQL Patroni
- 3 nodes etcd
And a dedicated pgbackrest server, from which I want to connect to the Patroni cluster and take backups. But I don't want to connect to the physical servers, because I've configured backup from standby, and so after switchovers or failovers I would have to change the pgbackrest.conf file each time.
Is it acceptable to connect through the HAProxy VIP to the Patroni cluster from pgbackrest.conf?
pgbackrest.conf:
[global]
#repo1-path=/var/lib/pgbackrest
repo1-host-user=postgres
repo1-host-type=ssh
log-level-console=debug
log-level-file=detail
compress=n
start-fast=y
stop-auto=y
[pg_test]
repo1-path=/var/lib/pgbackrest/pg_test
pg1-host=VIP
pg1-host-user=pgbackrest
pg1-path=/opt/postgres/data/
pg1-port=5000
pg2-host=VIP
pg2-host-user=pgbackrest
pg2-path=/opt/postgres/data/
pg2-port=5001
backup-standby=y
archive-async=y
process-max=2
repo1-retention-full=2
My patroni parameters:
parameters:
archive_command: pgbackrest --stanza=pg_test archive-push %p
archive_mode: true
autovacuum_analyze_scale_factor: 0.05
autovacuum_vacuum_scale_factor: 0.05
checkpoint_completion_target: 0.9
checkpoint_timeout: 15 min
effective_cache_size: 11 GB
effective_io_concurrency: 100
huge_pages: false
maintenance_work_mem: 320 MB
max_connections: 1000;
max_parallel_maintenance_workers: 4
max_parallel_workers: 8
max_parallel_workers_per_gather: 4
max_wal_senders: 10
max_wal_size: 1024 MB
max_worker_processes: 8
min_wal_size: 512 MB
parallel_leader_participation: true
random_page_cost: 1.25
shared_buffers: 4096 MB
work_mem: 32 MB
And this is the error I'm getting when creating the stanza:
-------------------PROCESS START-------------------
2024-03-18 08:41:49.679 P00 INFO: stanza-create command begin 2.50: --exec-id=1598083-c6b0e543 --log-level-console=info --log-level-file=detail --pg1-host=VIP_adress --pg2-host=VIP_adress --pg1-host-user=pgbackrest --pg2-host-user=pgbackrest --pg1-path=/opt/postgres/data/ --pg2-path=/opt/postgres/data/ --pg1-port=5000 --pg2-port=5001 --repo1-host-type=ssh --repo1-path=/var/lib/pgbackrest/pg_test --stanza=pg_test
2024-03-18 08:41:50.055 P00 WARN: unable to check pg1: [OptionRequiredError] remote-0 process on 'VIP_adress' terminated unexpectedly : ERROR: : stanza-create command requires option: pg2-path
HINT: does this stanza exist?
2024-03-18 08:41:50.292 P00 WARN: unable to check pg2: [OptionRequiredError] remote-0 process on 'VIP_adress' terminated unexpectedly : ERROR: : stanza-create command requires option: pg2-path
HINT: does this stanza exist?
2024-03-18 08:41:50.292 P00 ERROR: : unable to find primary cluster - cannot proceed
HINT: are all available clusters in recovery?
2024-03-18 08:41:50.292 P00 INFO: stanza-create command end: aborted with exception
Farid Jafarzadeh
(1 rep)
Mar 18, 2024, 10:51 AM
• Last activity: Mar 22, 2024, 06:58 AM
0
votes
1
answers
171
views
High Availability postgres db cluster with Patroni and Etcd
I am building an HA Postgres db cluster with one master and multiple slaves. Please look at the points below and correct me if I am wrong:
1. For each slave node, I need to use pg_basebackup to sync and set up streaming replication with the master node (a sketch of this follows the list).
2. Set up Patroni on each node so that the leader can be elected from any of the available nodes in case of master failure. (In this case, if the master fails, a slave node will be elected as master and write operations will be handled by it. Now, if the old master comes back online and becomes the master of the cluster again, will it sync with the previously promoted master, i.e. the updated slaves?)
3. Setup Etcd on a node and configure it in the Patroni files of each node.
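A minimal sketch of point 1 (host name, replication role and data directory are placeholders):
pg_basebackup -h master-host -U replicator -D /var/lib/pgsql/data -X stream -R -P
Note that when Patroni manages the cluster it normally runs pg_basebackup itself while bootstrapping a new replica, so this step is usually not done by hand.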
Abizer Susnerwala
(1 rep)
Nov 28, 2023, 06:47 AM
• Last activity: Nov 28, 2023, 10:53 AM
0
votes
0
answers
2888
views
TimeZone Setting On PostgreSQL Databases
I have a **Patroni cluster** with two **Ubuntu 22 db servers**. The timezone of my instances was set to **UTC**. I needed to change them to the local timezone, **Europe/Istanbul**. I made the timezone change at the OS level; when I checked the timezone setting with the timedatectl command, I saw the local time was correct.
And I applied the setting one by one to all the databases on my server:
ALTER DATABASE "db_name" SET timezone TO 'Europe/Istanbul';
Also applied the setting:
ALTER SYSTEM SET TimeZone TO 'Europe/Istanbul';
And I edited my patroni config file and entered the timezone setting like:
: Europe/Istanbul
After that, I restarted my patroni cluster. After restarting the cluster, I saw that some of the databases had applied the change but others had not. I checked the configuration of those databases; it was the same as on the ones where the change applied.
I still see the old timezone setting, which was Etc/UTC, when I execute the SQL command below:
SHOW timezone;
I am a little bit confused about it; it's a strange situation. What might cause this? Am I missing something?
user274715
Sep 6, 2023, 11:19 AM
1
votes
2
answers
787
views
What happens after the WAL LSN value of type pg_lsn reaches its limit?
I am trying to determine the optimal standby for automated failover based on the highest value of pg_last_wal_receive_lsn() across all slaves. But what if max_wal_size is reached: do pg_lsn values go to 0?
Please suggest if there's any better approach to determine the node with the best RPO out of all the secondaries.
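For comparing the candidates, a minimal sketch of what can be run on each standby (functions as in PostgreSQL 10 and later):
SELECT pg_last_wal_receive_lsn() AS received,
       pg_last_wal_replay_lsn()  AS replayed,
       pg_wal_lsn_diff(pg_last_wal_receive_lsn(),
                       pg_last_wal_replay_lsn()) AS replay_gap_bytes;
A pg_lsn value is a 64-bit position in the WAL stream, so it keeps increasing; max_wal_size only limits how much WAL is kept on disk between checkpoints and does not reset the LSN to 0.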
Lucky Shofar
(13 rep)
Oct 27, 2022, 08:33 AM
• Last activity: Jun 17, 2023, 06:33 AM
7
votes
1
answers
19011
views
Could not receive data from WAL stream: ERROR: requested WAL segment has already been removed
Configuration: Postgres 9.6 with a 3-node cluster. db1 is the master, db2 and db3 are replicas. WAL files are archived in AWS S3 using a custom pgrsync tool. The cluster is managed by patroni. The archive_command and restore_command are properly configured on all the nodes.
To simulate: On db1, do heavy writes (like vacuuming a large table) and then stop db1 (sudo systemctl stop patroni). db3 becomes the new leader. db2 requests more WAL files, which it gets via the proper restore command from AWS S3, and becomes a replica of db3.
Now, start db1 again (sudo systemctl start patroni). But db1 (the old leader and the new to-be-replica) never comes up as a replica and gives the error message:
>could not receive data from WAL stream: ERROR: requested WAL segment 0000002400053C55000000AE has already been removed.
This error message is reported by db3 (the leader); db1 just logs it.
So, let's look at the timeline. Initially db1 was on timeline 35 (0x23) and wrote the following files to the archive:
0000002300053C55000000AE (TL is 23 and archive at 53C55/AE)
..
0000002300053C5600000043 (TL is 23 and archive at 53C56/43)
db1 is stopped at this point.
db3's logs show this:
received promote request
redo done at 53C56/A3FFC8C0
selected new timeline ID: 36
database system is ready to accept connections
and db3 copies the following files to archives
00000024.history
0000002300053C56000000A3.partial
0000002400053C56000000A3
0000002200053C55000000AE
As db3 became leader, db2 starts the process to become replica to db3 (which it successfully becomes) and here is the summary of the logs:
restored log file "00000024.history" from archive
restored log file "0000002300053C55000000AE" from archive
restored log file "00000023.history" from archive
redo starts at 53C55/AE0001A8
restored log file "0000002300053C55000000AF" from archive
...
restored log file "0000002300053C5600000042" from archive
Retrieving 0000002300053C5600000043 to pg_xlog/RECOVERYXLOG FAILED (log by pgrsync)
...
Retrieving 0000002300053C56000000A2 to pg_xlog/RECOVERYXLOG FAILED
restored log file "0000002400053C56000000A3" from archive
consistent recovery state reached at 53C56/A3FFE900
db1 is started now and here are the logs:
LOG: database system was interrupted while in recovery at log time 2021-01-28 04:08:01 UTC
HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target
LOG: invalid record length at 53C55/AE0001E0: wanted 24, got 0
LOG: started streaming WAL from primary at 53C55/AE000000 on timeline 36
FATAL: could not receive data from WAL stream: ERROR: requested WAL segment 0000002400053C55000000AE has already been removed
... and repeats this over and over
Points to note:
1. 0000002400053C55000000AE was never written to the archives by any Postgres node. The old leader (db1) copied the archive 0000002300053C55000000AE (note: 0023, not 0024) before it was stopped.
2. The new leader (db3) copied 0000002200053C55000000AE (note: 0022, not 0024).
3. max_wal_size is set to 1024 on all nodes.
4. After db3 became the new leader, there was hardly any activity on the nodes. db3 only writes WAL files every 10 mins (archive_timeout=600s).
Questions:
1. Is there anything wrong in the configuration that makes the old leader ask for a WAL segment which the new leader does not have?
2. How can I restore the old leader (db1) in this state, without having to erase and start over?
Lots of disk space is available. The problem can be simulated at will. I had tried pg_rewind on the old primary, pointing to the new primary. It just said it is already on the same timeline (not the exact words). Note: it was not an error message. But even after that, it was showing the same error when starting Postgres.
We are on 9.6.19. Just a few days back 9.6.21 (and the last 9.5.x release) was announced, which points out a fix for exactly this specific problem ("Fix WAL-reading logic so that standbys can handle timeline switches correctly. This issue could have shown itself with errors like 'requested WAL segment has already been removed'."). However, even after upgrading to 9.6.21, the same problem exists for us.
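For reference, the shape of the pg_rewind invocation referred to above (the data directory and connection string are placeholders; PostgreSQL must be cleanly shut down on db1 first):
pg_rewind --target-pgdata=/var/lib/pgsql/9.6/data --source-server="host=db3 port=5432 user=postgres dbname=postgres" --progress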
rsmoorthy
(157 rep)
Jan 28, 2021, 05:51 AM
• Last activity: Feb 26, 2023, 09:04 AM
0
votes
1
answers
9526
views
Issue with Patroni - postgres not starting
I am learning Postgres and I have installed 3 VMs as follows:
node1 - Installed Postgres and etcd here
node2 - Installed Postgres and etcd here
node3 - Installed etcd here
I have set up streaming replication between node1 and node2 and it worked fine (even etcd health showed as fine for all 3 nodes).
Then I installed and configured Patroni on node1 and node2, and when I first started it, node2 was promoted to master and node1 was acting as a slave. At that time I could access psql from both nodes, but a change I made on the master (I inserted a row in a table) was not propagated to the slave.
I then stopped Patroni on node1 (the slave) and restarted it, but postgres on the slave is not starting and I cannot access psql on the slave either now. I don't know whether I should be able to access the DB from the slave or not when using Patroni (I could with streaming replication), i.e. should the postgres service remain switched off by design on the slave when you have Patroni? From my understanding, if it finds that there is a master holding a lock, postgres on the slave will fail to start.
I then stopped postgres on the master as I thought it would fail over to node1 (and node1 would become master), but no automatic failover occurred. Furthermore, the postgres service is not starting on either node now (not even on node2, but I can connect using psql on node2, which still seems to be the master). Should the postgres service show as running on the master, or does it also fail to start by design and one should only check the status of Patroni? Patroni is active and running on both nodes.
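For what it's worth, in a Patroni setup the member state is normally inspected through patronictl rather than through the postgres systemd unit (the config path below is an assumption):
patronictl -c /etc/patroni.yml list
This prints each member's role (Leader/Replica) and state, which tends to be more telling than systemctl status.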
When I try to start postgres on node2 (the node holding the lock):
Log on node1 (the slave):
etcd is working fine:
From my research I found that pg_rewind might help to get the slave in sync with my master, however, I'm not sure whether that is even my issue.
What do I need to do to fix this situation?
1. I mainly need to start postgres on the master (I have no idea how I can still use psql when the postgres service is failing to start on the master).
2. I need the slave to be able to promote itself to master (through Patroni) if the master fails.
My config files seem fine (especially since it was working fine prior to trying to switch off one instance for a failover to occur)





user1930901
(349 rep)
May 27, 2021, 06:05 PM
• Last activity: Jan 7, 2023, 01:01 PM
2
votes
1
answers
1318
views
High Available PostgreSQL with Patroni and Replication
I want to build a highly available PostgreSQL cluster using patroni with etcd. I want to have 3 servers (1 master and 2 slaves) in my cluster, so if something goes wrong one of the slaves automatically takes over the master role. Only the master server should accept write operations; the slaves are only for safety and read operations.
On the other hand, I also need approximately 10 servers with an up-to-date copy of the data in the cluster.
These servers are only there to read data and none of them can ever be promoted to master. They shouldn't even be part of the highly available cluster.
They are just supposed to get the data using streaming replication etc.
Is such a setup even possible? Can I use replication to other servers when they are part of a patroni cluster?
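If the extra read-only servers did end up inside the Patroni cluster, Patroni's per-node tags are the usual way to keep a member permanently non-promotable; a sketch of the relevant snippet in such a node's patroni.yml:
tags:
  nofailover: true
  noloadbalance: true
Plain streaming replicas fed from a cluster member can of course also stay completely outside Patroni, which is closer to the setup described above.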
Tobias
(21 rep)
Sep 1, 2020, 10:34 AM
• Last activity: Aug 4, 2022, 12:06 PM
1
votes
2
answers
1053
views
PostgreSQL-Patroni-Cluster: Installing oracle_fdw, dependency problem
For Oracle to PostgreSQL migrations we're currently looking into ora2pg. However, the performance using either writes to file or direct read from Oracle and write to PostgreSQL has been very underwhelming in both cases. There's not really good information about that, though some of what I found suggested it could be related to the Oracle Perl driver.
A new version of ora2pg introduced a new way of converting the data using the Foreign Data Wrapper oracle_fdw. It is written in C and performance benchmarks seem to suggest much higher performance, so we would like to use oracle_fdw. The target for our conversion is a Patroni cluster setup with 2 nodes and a 3rd witness node using etcd.
However, adding oracle_fdw to the Patroni cluster has been difficult and so far unsuccessful. After compiling oracle_fdw, trying to load it gives the following message:
> ERROR: could not load library "/usr/lib/postgresql14/lib64/oracle_fdw.so": libclntsh.so.19.1: cannot open shared object file: No such file or directory
That library exists on the system, but in the directory of the Oracle Instant Client (/dbdata/oracle/instantclient_19_12). For the library to be found, we think we would need the following standard environment variables added to the postgres process that Patroni starts:
export LD_LIBRARY_PATH=/dbdata/oracle/instantclient_19_12
export ORACLE_HOME=/dbdata/oracle/instantclient_19_12
export PATH=/dbdata/oracle/instantclient_19_12:$PATH
However, we have been wholly unsuccessful at injecting these variables into the postgres process started by Patroni. The process is started as the system user postgres, as seen here:
dbserver01 root 6 (/dbdata/oracle/instantclient_19_12): ps -ef|grep postgres
root 9236 9120 0 07:56 pts/0 00:00:00 su - postgres
postgres 9238 9236 0 07:56 pts/0 00:00:00 -bash
postgres 9289 9238 0 07:56 pts/0 00:00:00 psql
postgres 9290 27443 0 07:56 ? 00:00:00 postgres: pgcluster2: postgres postgres [local] idle
root 9307 9180 0 07:58 pts/1 00:00:00 grep --color=auto postgres
postgres 27423 1 0 Mar23 ? 00:01:41 /opt/patroni/bin/python3 /usr/bin/patroni /etc/patroni/postgres.yml
postgres 27443 1 0 Mar23 ? 00:00:04 postgres -D /dbdata/pgcluster --config-file=/dbdata/pgcluster/postgresql.conf --listen_addresses=0.0.0.0 --port=5432 --cluster_name=pgcluster2 --wal_level=replica --hot_standby=on --max_connections=100 --max_wal_senders=10 --max_prepared_transactions=0 --max_locks_per_transaction=64 --track_commit_timestamp=off --max_replication_slots=10 --max_worker_processes=8 --wal_log_hints=on
The following methods have been tried:
* Adding the export statements to the .bashrc of the postgres user
* Adding the export statements to the .bash_profile of the postgres user
* Adding the export statements to the global /etc/profile of the server
* Adding the /dbdata/oracle/instantclient_19_12/ path to /etc/ld.so.conf
Each time Patroni was restarted, and so was Postgres (checked by the ever-changing PID), yet the environment always looks like this:
dbserver01 root 7 (/dbdata/oracle/instantclient_19_12): cat /proc/27443/environ | xargs -0 -n 1
LANG=en_US.UTF-8
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOME=/var/lib/pgsql
LOGNAME=postgres
USER=postgres
SHELL=/bin/bash
INVOCATION_ID=3806a21dfa52455baad431cc2bbfa533
JOURNAL_STREAM=9:5426402
dbserver01 root 8 (/dbdata/oracle/instantclient_19_12):
The Patroni documentation has loads of information about Patroni specific environment variables, yet I have so far not found any documentation that helps pass variables to the Postgres process.
Copying the dependencies around might work around the problem; however, adding the variables would be the preferred way of fixing it. Is there a right or supported way to add environment variables to the postgres process started by Patroni?
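Since the environ dump above contains INVOCATION_ID and JOURNAL_STREAM, patroni is launched by systemd and never sources .bashrc/.bash_profile, so one candidate (a sketch, not a confirmed fix) is a systemd drop-in for the patroni unit, whose environment is inherited by the postgres child processes:
# /etc/systemd/system/patroni.service.d/oracle.conf (e.g. created with "systemctl edit patroni")
[Service]
Environment=ORACLE_HOME=/dbdata/oracle/instantclient_19_12
Environment=LD_LIBRARY_PATH=/dbdata/oracle/instantclient_19_12
After systemctl daemon-reload and a restart of patroni, the variables should show up in /proc/<postgres pid>/environ.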
Edit 1: The installation was done from a checkout of the repository as the root user, after setting the above-mentioned exports, then compiled using make and make install. Here's the output of both commands currently:
dbserver01 root 25 (/tmp/oracle_fdw): make
make: Nothing to be done for 'all'.
goeccdb11l root 26 (/tmp/oracle_fdw): make install
/usr/bin/mkdir -p '/usr/lib/postgresql14/lib64'
/usr/bin/mkdir -p '/usr/share/postgresql14/extension'
/usr/bin/mkdir -p '/usr/share/postgresql14/extension'
/usr/bin/mkdir -p '/usr/share/doc/packages/postgresql14/extension'
/usr/bin/install -c -m 755 oracle_fdw.so '/usr/lib/postgresql14/lib64/oracle_fdw.so'
/usr/bin/install -c -m 644 .//oracle_fdw.control '/usr/share/postgresql14/extension/'
/usr/bin/install -c -m 644 .//oracle_fdw--1.2.sql .//oracle_fdw--1.0--1.1.sql .//oracle_fdw--1.1--1.2.sql '/usr/share/postgresql14/extension/'
/usr/bin/install -c -m 644 .//README.oracle_fdw '/usr/share/doc/packages/postgresql14/extension/'
dbserver01 root 27 (/tmp/oracle_fdw): ls -la /usr/lib/postgresql14/lib64
Edit 2: I've tried the installation again, running make clean and then the make and make install process again:
dbserver01 root 11 (/tmp/oracle_fdw): make clean
rm -f oracle_fdw.so liboracle_fdw.a liboracle_fdw.pc
rm -f oracle_fdw.o oracle_utils.o oracle_gis.o oracle_fdw.bc oracle_utils.bc oracle_gis.bc
rm -rf results/ regression.diffs regression.out tmp_check/ tmp_check_iso/ log/ output_iso/
dbserver01 root 12 (/tmp/oracle_fdw): export LD_LIBRARY_PATH=/dbdata/oracle/instantclient_19_12
dbserver01 root 13 (/tmp/oracle_fdw): export ORACLE_HOME=/dbdata/oracle/instantclient_19_12
dbserver01 root 14 (/tmp/oracle_fdw): export PATH=/dbdata/oracle/instantclient_19_12:$PATH
dbserver01 root 15 (/tmp/oracle_fdw): make
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -fmessage-length=0 -grecord-gcc-switches -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -g -fPIC -I"/dbdata/oracle/instantclient_19_12/sdk/include" -I"/dbdata/oracle/instantclient_19_12/oci/include" -I"/dbdata/oracle/instantclient_19_12/rdbms/public" -I"/dbdata/oracle/instantclient_19_12/" -I/usr/include/oracle/21/client64 -I/usr/include/oracle/19.14/client64 -I/usr/include/oracle/19.12/client64 -I/usr/include/oracle/19.12/client -I/usr/include/oracle/19.11/client64 -I/usr/include/oracle/19.11/client -I/usr/include/oracle/19.10/client64 -I/usr/include/oracle/19.10/client -I/usr/include/oracle/19.9/client -I/usr/include/oracle/19.9/client64 -I/usr/include/oracle/19.8/client -I/usr/include/oracle/19.8/client64 -I/usr/include/oracle/19.6/client -I/usr/include/oracle/19.6/client64 -I/usr/include/oracle/19.3/client -I/usr/include/oracle/19.3/client64 -I/usr/include/oracle/18.5/client -I/usr/include/oracle/18.5/client64 -I/usr/include/oracle/18.3/client -I/usr/include/oracle/18.3/client64 -I/usr/include/oracle/12.2/client -I/usr/include/oracle/12.2/client64 -I/usr/include/oracle/12.1/client -I/usr/include/oracle/12.1/client64 -I/usr/include/oracle/11.2/client -I/usr/include/oracle/11.2/client64 -I. -I./ -I/usr/include/pgsql/server -I/usr/include/pgsql/internal -D_GNU_SOURCE -I/usr/include/libxml2 -c -o oracle_fdw.o oracle_fdw.c
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -fmessage-length=0 -grecord-gcc-switches -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -g -fPIC -I"/dbdata/oracle/instantclient_19_12/sdk/include" -I"/dbdata/oracle/instantclient_19_12/oci/include" -I"/dbdata/oracle/instantclient_19_12/rdbms/public" -I"/dbdata/oracle/instantclient_19_12/" -I/usr/include/oracle/21/client64 -I/usr/include/oracle/19.14/client64 -I/usr/include/oracle/19.12/client64 -I/usr/include/oracle/19.12/client -I/usr/include/oracle/19.11/client64 -I/usr/include/oracle/19.11/client -I/usr/include/oracle/19.10/client64 -I/usr/include/oracle/19.10/client -I/usr/include/oracle/19.9/client -I/usr/include/oracle/19.9/client64 -I/usr/include/oracle/19.8/client -I/usr/include/oracle/19.8/client64 -I/usr/include/oracle/19.6/client -I/usr/include/oracle/19.6/client64 -I/usr/include/oracle/19.3/client -I/usr/include/oracle/19.3/client64 -I/usr/include/oracle/18.5/client -I/usr/include/oracle/18.5/client64 -I/usr/include/oracle/18.3/client -I/usr/include/oracle/18.3/client64 -I/usr/include/oracle/12.2/client -I/usr/include/oracle/12.2/client64 -I/usr/include/oracle/12.1/client -I/usr/include/oracle/12.1/client64 -I/usr/include/oracle/11.2/client -I/usr/include/oracle/11.2/client64 -I. -I./ -I/usr/include/pgsql/server -I/usr/include/pgsql/internal -D_GNU_SOURCE -I/usr/include/libxml2 -c -o oracle_utils.o oracle_utils.c
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -fmessage-length=0 -grecord-gcc-switches -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -g -fPIC -I"/dbdata/oracle/instantclient_19_12/sdk/include" -I"/dbdata/oracle/instantclient_19_12/oci/include" -I"/dbdata/oracle/instantclient_19_12/rdbms/public" -I"/dbdata/oracle/instantclient_19_12/" -I/usr/include/oracle/21/client64 -I/usr/include/oracle/19.14/client64 -I/usr/include/oracle/19.12/client64 -I/usr/include/oracle/19.12/client -I/usr/include/oracle/19.11/client64 -I/usr/include/oracle/19.11/client -I/usr/include/oracle/19.10/client64 -I/usr/include/oracle/19.10/client -I/usr/include/oracle/19.9/client -I/usr/include/oracle/19.9/client64 -I/usr/include/oracle/19.8/client -I/usr/include/oracle/19.8/client64 -I/usr/include/oracle/19.6/client -I/usr/include/oracle/19.6/client64 -I/usr/include/oracle/19.3/client -I/usr/include/oracle/19.3/client64 -I/usr/include/oracle/18.5/client -I/usr/include/oracle/18.5/client64 -I/usr/include/oracle/18.3/client -I/usr/include/oracle/18.3/client64 -I/usr/include/oracle/12.2/client -I/usr/include/oracle/12.2/client64 -I/usr/include/oracle/12.1/client -I/usr/include/oracle/12.1/client64 -I/usr/include/oracle/11.2/client -I/usr/include/oracle/11.2/client64 -I. -I./ -I/usr/include/pgsql/server -I/usr/include/pgsql/internal -D_GNU_SOURCE -I/usr/include/libxml2 -c -o oracle_gis.o oracle_gis.c
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -fmessage-length=0 -grecord-gcc-switches -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -g -fPIC -shared -o oracle_fdw.so oracle_fdw.o oracle_utils.o oracle_gis.o -L/usr/lib/postgresql14/lib64 -L/usr/lib64 -Wl,--as-needed -L"/dbdata/oracle/instantclient_19_12/" -L"/dbdata/oracle/instantclient_19_12/bin" -L"/dbdata/oracle/instantclient_19_12/lib" -L"/dbdata/oracle/instantclient_19_12/lib/amd64" -lclntsh -L/usr/lib/oracle/21/client64/lib -L/usr/lib/oracle/19.14/client64/lib -L/usr/lib/oracle/19.12/client64/lib -L/usr/lib/oracle/19.12/client/lib -L/usr/lib/oracle/19.11/client64/lib -L/usr/lib/oracle/19.11/client/lib -L/usr/lib/oracle/19.10/client64/lib -L/usr/lib/oracle/19.10/client/lib -L/usr/lib/oracle/19.9/client/lib -L/usr/lib/oracle/19.9/client64/lib -L/usr/lib/oracle/19.8/client/lib -L/usr/lib/oracle/19.8/client64/lib -L/usr/lib/oracle/19.6/client/lib -L/usr/lib/oracle/19.6/client64/lib -L/usr/lib/oracle/19.3/client/lib -L/usr/lib/oracle/19.3/client64/lib -L/usr/lib/oracle/18.5/client/lib -L/usr/lib/oracle/18.5/client64/lib -L/usr/lib/oracle/18.3/client/lib -L/usr/lib/oracle/18.3/client64/lib -L/usr/lib/oracle/12.2/client/lib -L/usr/lib/oracle/12.2/client64/lib -L/usr/lib/oracle/12.1/client/lib -L/usr/lib/oracle/12.1/client64/lib -L/usr/lib/oracle/11.2/client/lib -L/usr/lib/oracle/11.2/client64/lib
dbserver01 root 16 (/tmp/oracle_fdw): make install
/usr/bin/mkdir -p '/usr/lib/postgresql14/lib64'
/usr/bin/mkdir -p '/usr/share/postgresql14/extension'
/usr/bin/mkdir -p '/usr/share/postgresql14/extension'
/usr/bin/mkdir -p '/usr/share/doc/packages/postgresql14/extension'
/usr/bin/install -c -m 755 oracle_fdw.so '/usr/lib/postgresql14/lib64/oracle_fdw.so'
/usr/bin/install -c -m 644 .//oracle_fdw.control '/usr/share/postgresql14/extension/'
/usr/bin/install -c -m 644 .//oracle_fdw--1.2.sql .//oracle_fdw--1.0--1.1.sql .//oracle_fdw--1.1--1.2.sql '/usr/share/postgresql14/extension/'
/usr/bin/install -c -m 644 .//README.oracle_fdw '/usr/share/doc/packages/postgresql14/extension/'
dbserver01 root 17 (/tmp/oracle_fdw):
Edit 3: Added a 4th method to get the path into the postgres user process.
0xCAFEBABE
(217 rep)
Mar 25, 2022, 07:24 AM
• Last activity: Mar 28, 2022, 08:07 AM
1
votes
1
answers
118
views
Many PG Clusters on more than one machine
I want to run ~10k PostgreSQL clusters.
Each cluster will be quite small with moderate traffic.
I will have roughly 100 virtual machines. Each VM should run roughly 100 clusters. Each cluster should run in one container.
100 VMs. Each running 100 containers. One cluster per container --> 10k clusters
Features which are not available from PG out-of-the-box:
* If I create a new cluster I need to check: Which VM has room for a new cluster?
* I want to be able to move a cluster to a different VM
Before I re-invent the wheel: Is there already a PostgreSQL-way to handle such a scenario?
Does it make sense to use a [Kubernetes Postgres-Operator](https://blog.flant.com/comparing-kubernetes-operators-for-postgresql/) or Patroni?
Background: It will be some kind of open source learning/teaching environment. Each user should get full database superuser permissions. That's why it is not enough to give each user a database. Each user needs a cluster. At least that's my current impression. Please leave a comment if you have an idea for a different architecture.
guettli
(1591 rep)
Jan 6, 2022, 07:30 AM
• Last activity: Jan 6, 2022, 07:15 PM
1
votes
1
answers
557
views
Could Barman backup only one database from PostgreSQL?
In my system, I have PostgreSQL 12 and Barman v2.13. I have a few backup disks of around 1 TB each, and I have recently added a new DB which is 5 TB. Previously my PostgreSQL cluster was no more than 400 GB.
I think Barman cannot compress the backed-up DB, right? So could I back up only specific DBs? Is pgBackRest similar?
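(For context on the single-database part: Barman and pgBackRest take physical, cluster-level base backups, so backing up just one database typically means a logical dump instead. A hedged example, where the database name bigdb is made up:)
# Hypothetical illustration -- database name and connection details are assumptions.
# A compressed, custom-format logical dump of a single database:
pg_dump --format=custom --compress=9 --file=bigdb.dump bigdb
(A dump in custom format is restored with pg_restore rather than through the physical recovery path.)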
Melih
(284 rep)
Aug 1, 2021, 10:32 PM
• Last activity: Nov 22, 2021, 05:28 PM
1
votes
1
answers
712
views
PostgreSQL process stuck in a "forked new backend" loop when used with Patroni
I have a 3-node Patroni 2.1.1 + PostgreSQL 13.4 cluster, and one of the nodes is acting weird once it finishes the basebackup bootstrap. I have another, identical node that works fine when put through the same process. Basically, once Patroni tries to connect to localhost:5432, the server seems to die and refork (fr...
I have a 3-node Patroni 2.1.1 + PostgreSQL 13.4 cluster, and one of the nodes is acting weird once it finishes the basebackup bootstrap. I have another, identical node that works fine when put through the same process.
Basically, once Patroni tries to connect to localhost:5432, the server seems to die and refork (from my understanding). You'll see the "loop" below, starting at "connection received: host=::1 port=XXXXX".
2021-11-01 12:01:43.395 EDT LOG: starting PostgreSQL 13.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.4.1 20200928 (Red Hat 8.4.1-1), 64-bit
2021-11-01 12:01:43.396 EDT LOG: listening on IPv6 address "::1", port 5432
2021-11-01 12:01:43.396 EDT LOG: listening on IPv4 address "127.0.0.1", port 5432
2021-11-01 12:01:43.397 EDT LOG: listening on IPv4 address "192.168.23.13", port 5432
2021-11-01 12:01:43.398 EDT LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2021-11-01 12:01:43.401 EDT LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2021-11-01 12:01:43.405 EDT LOG: database system was interrupted; last known up at 2021-11-01 12:01:25 EDT
2021-11-01 12:01:43.405 EDT DEBUG: removing all temporary WAL segments
2021-11-01 12:01:43.436 EDT DEBUG: executing restore command "cp /var/lib/pgsql/13/archive/00000077.history pg_wal/RECOVERYHISTORY"
cp: cannot stat '/var/lib/pgsql/13/archive/00000077.history': No such file or directory
2021-11-01 12:01:43.440 EDT DEBUG: could not restore file "00000077.history" from archive: child process exited with exit code 1
2021-11-01 12:01:43.441 EDT LOG: entering standby mode
2021-11-01 12:01:43.441 EDT DEBUG: backup time 2021-11-01 12:01:25 EDT in file "backup_label"
2021-11-01 12:01:43.441 EDT DEBUG: backup label pg_basebackup base backup in file "backup_label"
2021-11-01 12:01:43.441 EDT DEBUG: backup timeline 118 in file "backup_label"
2021-11-01 12:01:43.441 EDT DEBUG: executing restore command "cp /var/lib/pgsql/13/archive/00000076.history pg_wal/RECOVERYHISTORY"
2021-11-01 12:01:43.538 EDT LOG: restored log file "00000076.history" from archive
2021-11-01 12:01:43.540 EDT DEBUG: executing restore command "cp /var/lib/pgsql/13/archive/000000760000000100000000 pg_wal/RECOVERYXLOG"
2021-11-01 12:01:43.689 EDT LOG: restored log file "000000760000000100000000" from archive
2021-11-01 12:01:43.692 EDT DEBUG: got WAL segment from archive
2021-11-01 12:01:43.693 EDT DEBUG: checkpoint record is at 1/60
2021-11-01 12:01:43.693 EDT DEBUG: redo record is at 1/28; shutdown false
2021-11-01 12:01:43.693 EDT DEBUG: next transaction ID: 531; next OID: 24606
2021-11-01 12:01:43.693 EDT DEBUG: next MultiXactId: 1; next MultiXactOffset: 0
2021-11-01 12:01:43.693 EDT DEBUG: oldest unfrozen transaction ID: 478, in database 1
2021-11-01 12:01:43.693 EDT DEBUG: oldest MultiXactId: 1, in database 1
2021-11-01 12:01:43.693 EDT DEBUG: commit timestamp Xid oldest/newest: 0/0
2021-11-01 12:01:43.693 EDT DEBUG: transaction ID wrap limit is 2147484125, limited by database with OID 1
2021-11-01 12:01:43.693 EDT DEBUG: MultiXactId wrap limit is 2147483648, limited by database with OID 1
2021-11-01 12:01:43.693 EDT DEBUG: starting up replication slots
2021-11-01 12:01:43.693 EDT DEBUG: starting up replication origin progress state
2021-11-01 12:01:43.697 EDT DEBUG: resetting unlogged relations: cleanup 1 init 0
2021-11-01 12:01:43.698 EDT DEBUG: initializing for hot standby
2021-11-01 12:01:43.698 EDT DEBUG: my backend ID is 1
2021-11-01 12:01:43.698 EDT LOG: redo starts at 1/28
2021-11-01 12:01:43.698 EDT DEBUG: prune KnownAssignedXids to 531
2021-11-01 12:01:43.698 EDT CONTEXT: WAL redo at 1/28 for Standby/RUNNING_XACTS: nextXid 531 latestCompletedXid 530 oldestRunningXid 531
2021-11-01 12:01:43.698 EDT DEBUG: 0 KnownAssignedXids (num=0 tail=0 head=0)
2021-11-01 12:01:43.698 EDT CONTEXT: WAL redo at 1/28 for Standby/RUNNING_XACTS: nextXid 531 latestCompletedXid 530 oldestRunningXid 531
2021-11-01 12:01:43.698 EDT DEBUG: recovery snapshots are now enabled
2021-11-01 12:01:43.698 EDT CONTEXT: WAL redo at 1/28 for Standby/RUNNING_XACTS: nextXid 531 latestCompletedXid 530 oldestRunningXid 531
2021-11-01 12:01:43.698 EDT DEBUG: prune KnownAssignedXids to 531
2021-11-01 12:01:43.698 EDT CONTEXT: WAL redo at 1/D8 for Standby/RUNNING_XACTS: nextXid 531 latestCompletedXid 530 oldestRunningXid 531
2021-11-01 12:01:43.698 EDT DEBUG: end of backup reached
2021-11-01 12:01:43.698 EDT CONTEXT: WAL redo at 1/110 for XLOG/BACKUP_END: 1/28
2021-11-01 12:01:43.699 EDT DEBUG: checkpointer updated shared memory configuration values
2021-11-01 12:01:43.700 EDT LOG: consistent recovery state reached at 1/138
2021-11-01 12:01:43.700 EDT DEBUG: executing restore command "cp /var/lib/pgsql/13/archive/000000760000000100000001 pg_wal/RECOVERYXLOG"
2021-11-01 12:01:43.700 EDT LOG: database system is ready to accept read only connections
cp: cannot stat '/var/lib/pgsql/13/archive/000000760000000100000001': No such file or directory
2021-11-01 12:01:43.704 EDT DEBUG: could not restore file "000000760000000100000001" from archive: child process exited with exit code 1
2021-11-01 12:01:43.704 EDT DEBUG: could not open file "pg_wal/000000760000000100000001": No such file or directory
2021-11-01 12:01:43.704 EDT DEBUG: switched WAL source from archive to stream after failure
2021-11-01 12:01:43.705 EDT DEBUG: find_in_dynamic_libpath: trying "/usr/pgsql-13/lib/libpqwalreceiver"
2021-11-01 12:01:43.706 EDT DEBUG: find_in_dynamic_libpath: trying "/usr/pgsql-13/lib/libpqwalreceiver.so"
2021-11-01 12:01:44.370 EDT DEBUG: forked new backend, pid=242938 socket=12
2021-11-01 12:01:44.371 EDT LOG: connection received: host=::1 port=36714
2021-11-01 12:01:51.235 EDT DEBUG: shmem_exit(0): 0 before_shmem_exit callbacks to make
2021-11-01 12:01:51.235 EDT DEBUG: shmem_exit(0): 0 on_shmem_exit callbacks to make
2021-11-01 12:01:51.235 EDT DEBUG: proc_exit(0): 1 callbacks to make
2021-11-01 12:01:51.235 EDT DEBUG: exit(0)
2021-11-01 12:01:51.235 EDT DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make
2021-11-01 12:01:51.235 EDT DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make
2021-11-01 12:01:51.235 EDT DEBUG: proc_exit(-1): 0 callbacks to make
2021-11-01 12:01:51.236 EDT DEBUG: forked new backend, pid=242941 socket=12
2021-11-01 12:01:51.237 EDT LOG: connection received: host=127.0.0.1 port=43428
2021-11-01 12:01:51.237 EDT DEBUG: reaping dead processes
2021-11-01 12:01:51.237 EDT DEBUG: server process (PID 242938) exited with exit code 0
2021-11-01 12:01:58.284 EDT DEBUG: shmem_exit(0): 0 before_shmem_exit callbacks to make
2021-11-01 12:01:58.284 EDT DEBUG: shmem_exit(0): 0 on_shmem_exit callbacks to make
2021-11-01 12:01:58.284 EDT DEBUG: proc_exit(0): 1 callbacks to make
2021-11-01 12:01:58.284 EDT DEBUG: exit(0)
2021-11-01 12:01:58.284 EDT DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make
2021-11-01 12:01:58.284 EDT DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make
2021-11-01 12:01:58.284 EDT DEBUG: proc_exit(-1): 0 callbacks to make
2021-11-01 12:01:58.285 EDT DEBUG: reaping dead processes
2021-11-01 12:01:58.285 EDT DEBUG: server process (PID 242941) exited with exit code 0
2021-11-01 12:01:59.295 EDT DEBUG: forked new backend, pid=242944 socket=12
2021-11-01 12:01:59.295 EDT LOG: connection received: host=::1 port=36758
2021-11-01 12:02:06.301 EDT DEBUG: shmem_exit(0): 0 before_shmem_exit callbacks to make
2021-11-01 12:02:06.301 EDT DEBUG: shmem_exit(0): 0 on_shmem_exit callbacks to make
2021-11-01 12:02:06.301 EDT DEBUG: proc_exit(0): 1 callbacks to make
2021-11-01 12:02:06.301 EDT DEBUG: exit(0)
2021-11-01 12:02:06.301 EDT DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make
2021-11-01 12:02:06.301 EDT DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make
2021-11-01 12:02:06.301 EDT DEBUG: proc_exit(-1): 0 callbacks to make
2021-11-01 12:02:06.301 EDT DEBUG: forked new backend, pid=242948 socket=12
2021-11-01 12:02:06.302 EDT DEBUG: reaping dead processes
2021-11-01 12:02:06.302 EDT DEBUG: server process (PID 242944) exited with exit code 0
2021-11-01 12:02:06.302 EDT LOG: connection received: host=127.0.0.1 port=43470
2021-11-01 12:02:13.440 EDT DEBUG: shmem_exit(0): 0 before_shmem_exit callbacks to make
2021-11-01 12:02:13.440 EDT DEBUG: shmem_exit(0): 0 on_shmem_exit callbacks to make
2021-11-01 12:02:13.440 EDT DEBUG: proc_exit(0): 1 callbacks to make
2021-11-01 12:02:13.440 EDT DEBUG: exit(0)
2021-11-01 12:02:13.440 EDT DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make
2021-11-01 12:02:13.440 EDT DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make
2021-11-01 12:02:13.440 EDT DEBUG: proc_exit(-1): 0 callbacks to make
2021-11-01 12:02:13.441 EDT DEBUG: reaping dead processes
2021-11-01 12:02:13.441 EDT DEBUG: server process (PID 242948) exited with exit code 0
2021-11-01 12:02:14.148 EDT DEBUG: postmaster received signal 2
2021-11-01 12:02:14.148 EDT LOG: received fast shutdown request
2021-11-01 12:02:14.150 EDT LOG: aborting any active transactions
2021-11-01 12:02:14.150 EDT DEBUG: shmem_exit(0): 1 before_shmem_exit callbacks to make
2021-11-01 12:02:14.150 EDT DEBUG: shmem_exit(0): 5 on_shmem_exit callbacks to make
2021-11-01 12:02:14.150 EDT DEBUG: proc_exit(0): 2 callbacks to make
2021-11-01 12:02:14.150 EDT DEBUG: exit(0)
2021-11-01 12:02:14.150 EDT DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make
2021-11-01 12:02:14.150 EDT DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make
2021-11-01 12:02:14.150 EDT DEBUG: proc_exit(-1): 0 callbacks to make
2021-11-01 12:02:14.150 EDT DEBUG: shmem_exit(1): 1 before_shmem_exit callbacks to make
2021-11-01 12:02:14.150 EDT DEBUG: shmem_exit(1): 7 on_shmem_exit callbacks to make
2021-11-01 12:02:14.150 EDT DEBUG: removing all KnownAssignedXids
2021-11-01 12:02:14.150 EDT DEBUG: release all standby locks
2021-11-01 12:02:14.150 EDT DEBUG: proc_exit(1): 2 callbacks to make
2021-11-01 12:02:14.150 EDT DEBUG: exit(1)
2021-11-01 12:02:14.150 EDT DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make
2021-11-01 12:02:14.150 EDT DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make
2021-11-01 12:02:14.150 EDT DEBUG: proc_exit(-1): 0 callbacks to make
2021-11-01 12:02:14.151 EDT DEBUG: reaping dead processes
2021-11-01 12:02:14.151 EDT DEBUG: reaping dead processes
2021-11-01 12:02:14.449 EDT DEBUG: forked new backend, pid=242954 socket=12
2021-11-01 12:02:14.450 EDT LOG: connection received: host=::1 port=36798
2021-11-01 12:02:15.915 EDT FATAL: terminating walreceiver process due to administrator command
2021-11-01 12:02:15.915 EDT DEBUG: shmem_exit(1): 1 before_shmem_exit callbacks to make
2021-11-01 12:02:15.915 EDT DEBUG: shmem_exit(1): 6 on_shmem_exit callbacks to make
2021-11-01 12:02:15.915 EDT DEBUG: proc_exit(1): 2 callbacks to make
2021-11-01 12:02:15.915 EDT DEBUG: exit(1)
2021-11-01 12:02:15.916 EDT DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make
2021-11-01 12:02:15.916 EDT DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make
2021-11-01 12:02:15.916 EDT DEBUG: proc_exit(-1): 0 callbacks to make
2021-11-01 12:02:15.917 EDT DEBUG: reaping dead processes
2021-11-01 12:02:15.917 EDT LOG: shutting down
2021-11-01 12:02:15.917 EDT DEBUG: skipping restartpoint, already performed at 1/28
2021-11-01 12:02:15.919 EDT DEBUG: updated min recovery point to 1/1000000 on timeline 118
2021-11-01 12:02:15.920 EDT DEBUG: shmem_exit(0): 1 before_shmem_exit callbacks to make
2021-11-01 12:02:15.920 EDT DEBUG: shmem_exit(0): 5 on_shmem_exit callbacks to make
2021-11-01 12:02:15.920 EDT DEBUG: proc_exit(0): 2 callbacks to make
2021-11-01 12:02:15.920 EDT DEBUG: exit(0)
2021-11-01 12:02:15.920 EDT DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make
2021-11-01 12:02:15.920 EDT DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make
2021-11-01 12:02:15.920 EDT DEBUG: proc_exit(-1): 0 callbacks to make
2021-11-01 12:02:15.921 EDT DEBUG: reaping dead processes
2021-11-01 12:02:15.921 EDT DEBUG: writing stats file "pg_stat/global.stat"
2021-11-01 12:02:15.921 EDT DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make
2021-11-01 12:02:15.921 EDT DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make
2021-11-01 12:02:15.921 EDT DEBUG: proc_exit(-1): 0 callbacks to make
2021-11-01 12:02:15.922 EDT DEBUG: reaping dead processes
2021-11-01 12:02:17.890 EDT DEBUG: postmaster received signal 2
2021-11-01 12:02:18.617 EDT DEBUG: shmem_exit(0): 0 before_shmem_exit callbacks to make
2021-11-01 12:02:18.617 EDT DEBUG: shmem_exit(0): 0 on_shmem_exit callbacks to make
2021-11-01 12:02:18.617 EDT DEBUG: proc_exit(0): 1 callbacks to make
2021-11-01 12:02:18.617 EDT DEBUG: exit(0)
2021-11-01 12:02:18.617 EDT DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make
2021-11-01 12:02:18.617 EDT DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make
2021-11-01 12:02:18.617 EDT DEBUG: proc_exit(-1): 0 callbacks to make
2021-11-01 12:02:18.618 EDT DEBUG: reaping dead processes
2021-11-01 12:02:18.618 EDT DEBUG: server process (PID 242954) exited with exit code 0
2021-11-01 12:02:18.618 EDT DEBUG: shmem_exit(0): 0 before_shmem_exit callbacks to make
2021-11-01 12:02:18.618 EDT DEBUG: shmem_exit(0): 5 on_shmem_exit callbacks to make
2021-11-01 12:02:18.618 EDT DEBUG: cleaning up dynamic shared memory control segment with ID 2104592565
2021-11-01 12:02:18.631 EDT DEBUG: proc_exit(0): 2 callbacks to make
2021-11-01 12:02:18.631 EDT LOG: database system is shut down
2021-11-01 12:02:18.631 EDT DEBUG: exit(0)
2021-11-01 12:02:18.631 EDT DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make
2021-11-01 12:02:18.631 EDT DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make
2021-11-01 12:02:18.631 EDT DEBUG: proc_exit(-1): 0 callbacks to make
2021-11-01 12:02:18.633 EDT DEBUG: logger shutting down
2021-11-01 12:02:18.633 EDT DEBUG: shmem_exit(0): 0 before_shmem_exit callbacks to make
2021-11-01 12:02:18.633 EDT DEBUG: shmem_exit(0): 0 on_shmem_exit callbacks to make
2021-11-01 12:02:18.633 EDT DEBUG: proc_exit(0): 0 callbacks to make
2021-11-01 12:02:18.633 EDT DEBUG: exit(0)
2021-11-01 12:02:18.633 EDT DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make
2021-11-01 12:02:18.633 EDT DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make
2021-11-01 12:02:18.633 EDT DEBUG: proc_exit(-1): 0 callbacks to make
If I stop Patroni, the server runs fine and joins the replication as expected. The postgres user can connect just fine, locally or remotely, using the psql utility or even pgAdmin.
Could someone explain what's going on here? I'm still learning PostgreSQL and I'm not quite comfortable with those logs yet... Thank you!
For more context, you can also look at Patroni issue #2104, which I posted.
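(The repeated short-lived connections from ::1 and 127.0.0.1 in the log look like Patroni's periodic local checks. Purely as a hedged way to poke at this by hand, with connection parameters assumed rather than taken from the post, you could open the same kind of local connection yourself and see whether a plain connection is enough to trigger the restart:)
# Assumption: default port and a "postgres" superuser reachable locally.
psql "host=localhost port=5432 user=postgres dbname=postgres connect_timeout=5" \
     -c "SELECT pg_is_in_recovery();"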
JulioQc
(143 rep)
Nov 1, 2021, 04:20 PM
• Last activity: Nov 3, 2021, 05:14 PM