Can network errors with client cause Postgres server to log long query
0
votes
1
answer
36
views
We have a query that intermittently runs very poorly for no apparent reason.
We have Postgres 14.13 configured with
pg.log_min_duration_statement
set to 30000
.
We have a query that runs as part of our core hosting service that is basically
SELECT * FROM table
LEFT OUTER JOIN table2 ON table.table2_id = table2_id
WHERE "table"."url" = ? LIMIT 1;
This runs thousands of times every hour, and is fine, but then occasionally we'll get a log saying it took **minutes**.
Here's an example log
pid=864204,user=service_account,db=app_db,app=pg-node,client=XX.XX.XX.XX LOG: duration: 156414.095 ms statement: SELECT * FROM table LEFT OUTER JOIN table2 ON table.table2_id = table2_id WHERE "table"."url" = ? LIMIT 1;
Here's an example log for the auto_explain
pid=2377906,user=service_account,db=app_database,app=pg-node,client=XX.XX.XX.XX LOG: duration: 156414.095 ms plan:
Query Text: SELECT * ...
Limit (cost=5.29..8.33 rows=1 width=741) (actual time=0.153..0.157 rows=1 loops=1
-> Nested Loop Left Join (cost=5.29..8.33 rows=1 width=741) (actual time=0.152..0.154 rows=1 loops=1
-> Bitmap Heap Scan on XXXX (cost=5.01..6.02 rows=1 width=546) (actual time=0.112..0.113 rows=1 loops=1)
Recheck Cond: (XXXXXX)
Filter: ((XXXXX) AND (XXXXX))
Heap Blocks: exact=1
Bitmap Index Scan on XXXXX (cost=0.00..5.01 rows=1 width=0) (actual time=0.097..0.097 rows=1 loops=1)
Index Cond: (XXXXX)
-> Index Scan using XXXX on table (cost=0.29..2.30 rows=1 width=422) (actual time=0.037..0.037 rows=1 loops=1)
Index Cond: (XXXXX)
Filter: (XXXXX)
Metrics show no extra load on the server (CPU is below 20%, memory is unchanged, there's no spike in other queries).
We turned on auto_explain
to try and track this down, but the analysis shows that all
parts of the query took <1ms
The database where this is occurring is a read replica.
We've contacted support for the hosting service providing the database. They've suggested that network errors with the connection to the client could be causing the issue.
Is this possible? If not, what other explanations could there be?
Asked by ChrisJ
(621 rep)
Sep 24, 2024, 11:47 AM
Last activity: Sep 25, 2024, 07:33 PM
Last activity: Sep 25, 2024, 07:33 PM