Can you use PREPARE statement or plpgsql function to make first query on new connection fast?
1
vote
1
answer
553
views
Currently, I am standing up an RShiny (web) application that queries data from an AWS RDS PostgresSQL database (PostgresSQL 12) which is pushed forward to my application. When a new user logs into the website, they establish a new connection to our database (important). I have spent a lot of time optimizing our database structure underneath the website to improve query performance ranging from indexing, reading and optimizing query plans, re-organizing table structures etc.. Unfortunately, I am not achieving the success I need. What I have found is when a query is performed, it is often slow because there is no cached information about the optimal plan. So, when the query is run the first time the data retrieval is slow. However, the second time it is **much quicker** / runs as I have optimized it.
Recently, I have been exploring using
PREPARE
statements with much better success. However, I am running into issues where:
1. The cached PREPARE
statement is not accessible to another user
2. The cached PREPARE
statement doesn't persist over multiple connections
My understanding is PREPARE
statements only exist over the lifetime of the connection, so these findings aren't surprising, but leave me back at my original problem. I could have a series of PREPARE
statements executed when the user logs in for the first time on the website such that these are available, but that doesn't seem sustainable to me.
Is there a way to have query plans (ideally a PREPARE
statement of plpgsql
function) cached over any connection such that the first time a user logs into the website and queries data using the optimized plan on the first attempt? I keep running into this issue where the first query on a new connection is always (frustratingly) slow.
Second, from a higher level is there a better approach here to optimizing website querying performance where the data sits on top of a PostgresSQL database?
**EDIT / UPDATE**
This is the current performance I am receiving between the first and second run. First, using a plpgsql
function:
Run 1: QUERY PLAN
Planning Time: 0.084 ms
Execution Time: 1411.143 ms
Run 2: QUERY PLAN
Planning Time: 0.028 ms
Execution Time: 116.966 ms
When I use a PREPARE
statement:
Run 1: QUERY PLAN
Planning Time: 26.579 ms
Execution Time: 232.347 ms
Run 2: QUERY PLAN
Planning Time: 16.594 ms
Execution Time: 77.716 ms
The difference between the two approach is likely due to some of the tables already being in memory as J.D. mentioned, which can be extended to the behavior between the 1st and 2nd query run.
Ideally, this is run at the high end speed I am seeing on the 2nd runs < 100ms in total time between execution and planning.
Asked by ctpickard157
(11 rep)
Dec 7, 2021, 12:55 AM
Last activity: Apr 25, 2025, 06:03 PM
Last activity: Apr 25, 2025, 06:03 PM