
Acquiring a connection from pgxpool takes longer with PGBouncer

1 vote
0 answers
67 views
My web server is deployed on Kubernetes with horizontal pod autoscaling, alongside a separate, non-auto-scaling PostgreSQL service that runs both a master and a read-only replica node, with high availability on failover. The web server is written in Go and uses pgx v5 for all interaction with PostgreSQL. This includes creating a connection pool (pgxpool) on pod startup with N connections. N is chosen so that N multiplied by the maximum number of pods stays below the maximum connections allowed by the PostgreSQL database; usually N is between 10 and 20.

While trying to optimize performance under load, I noticed some contention over connection acquisition on the pods: pool.Acquire inside pgx takes longer under high load. My assumption is that this is due to the limit of N concurrent open transactions from each pod to the database. Each concurrent transaction (each request to the web server uses one transaction while it is handled) 'hogs' one connection, so whenever a new request arrives at the server, it has to wait for one of the open transactions to finish if all N connections are in use. This scenario makes sense, especially when the HPA is scaled down, since there are fewer pods but the same maximum number of database connections (N) per pod.

I'm experimenting with PGBouncer to address this, but I'm seeing some unexpected results. My hypothesis was that enabling PGBouncer in 'transaction' mode would allow me to increase N, since the actual management of connections is abstracted away to PGBouncer. I configured PGBouncer with about 80% of the maximum connections the database can take (680 out of 800) and increased N by a factor of 5 (from 10 to 50). Looking at database metrics, I do observe fewer open connections to the database, which is expected since PGBouncer uses connections more efficiently. However, I also observe two unexpected behaviors:

1. Increased CPU on the database. I imagine this is due to the PGBouncer process using resources, but in some scenarios that is not something I can accept. Perhaps reducing the share of connections dedicated to PGBouncer from 80% to a lower figure would help.

2. The average duration of pool.Acquire on the web server increased rather than decreased. This is despite increasing N to be sufficiently high, i.e. higher than the number of concurrent transactions going from each pod to the database.

I have a strong feeling I'm doing something wrong. Should I stop using pgxpool in tandem with PGBouncer? Where can I look to pinpoint the reason for the increased duration of connection acquisition from pgxpool?
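For reference, this is roughly how the pool is created on pod startup and how I watch acquire behavior, simplified; DATABASE_URL, MaxConns = 20, and the 30-second logging interval are placeholders, not my exact settings:

package main

import (
	"context"
	"log"
	"os"
	"time"

	"github.com/jackc/pgx/v5/pgxpool"
)

func main() {
	ctx := context.Background()

	cfg, err := pgxpool.ParseConfig(os.Getenv("DATABASE_URL"))
	if err != nil {
		log.Fatal(err)
	}
	// N per pod; N * max pod count stays below the server's max_connections.
	cfg.MaxConns = 20

	pool, err := pgxpool.NewWithConfig(ctx, cfg)
	if err != nil {
		log.Fatal(err)
	}
	defer pool.Close()

	// Periodically dump pool statistics so contention on Acquire is visible.
	// EmptyAcquireCount and AcquireDuration are cumulative counters.
	go func() {
		t := time.NewTicker(30 * time.Second)
		defer t.Stop()
		for range t.C {
			s := pool.Stat()
			log.Printf("acquired=%d idle=%d total=%d empty_acquires=%d acquire_wait_total=%s",
				s.AcquiredConns(), s.IdleConns(), s.TotalConns(),
				s.EmptyAcquireCount(), s.AcquireDuration())
		}
	}()

	// ... HTTP handlers each run their work inside one transaction taken from the pool ...
	select {}
}

The Stat() numbers above are what I've been using to compare the direct-to-PostgreSQL setup against the PGBouncer setup.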
Asked by Alechko (229 rep)
Apr 15, 2025, 06:09 PM