pg_cron is dramatically faster than running the same query manually
0
votes
0
answers
565
views
I have a simple table with 75 million rows (and roughly 47 million unique ref_ids)
CREATE TABLE tbl (
id serial NOT NULL PRIMARY KEY,
ref_id text NOT NULL,
ts timestamp NOT NULL
);
And I am trying to run the following query on the table (essentially, delete rows where there is an existing row with the same ref_id
and a newer ts
:
DELETE FROM tbl t
WHERE EXISTS (
SELECT 1 FROM tbl t2 WHERE t.id < t2.id AND t.ref_id = t2.ref_id AND t.ts <= t2.ts
)
If I try to run this query manually (i.e. via PGAdmin), it will run for several hours before I give up on it. However, if I put this query in a procedure and schedule it to run 5 minutes later via pg_cron, it will complete in ~70 seconds. There is no traffic to this table during either the manual or cron runs.
To confirm, a count of the rows after the pg_cron
run shows 47 million rows, where before the run it had 75.
I have also experienced similar behavior when running other queries, and I am trying to figure out what could be causing such a massive performance difference between pg_cron and manually running queries.
I am running postgres 14.4.
Asked by perennial_
(101 rep)
Jun 6, 2023, 03:32 AM