
Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

0 votes
1 answers
67 views
Cannot import a database dump on Postgres 13.14+ while it loads fine in Postgres 13.13
I'm experiencing a problem with loading a PostgreSQL backup file (SQL format). The SQL file has a function that is defined after another function where it's used. PostgreSQL 13.13 can handle such a backup file, while PostgreSQL 13.14 fails to load it:
ERROR:  function public.label_id_constant() does not exist
LINE 1:  SELECT public.uuid_increment($1, public.label_id_constant()...
                                          ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
QUERY:   SELECT public.uuid_increment($1, public.label_id_constant()) 
CONTEXT:  SQL function "label_id" during inlining
I've double-checked if there is SET check_function_bodies = false; in the dump file. I also searched if I could disable the inlining during the dump load, but still no success. I've distilled the dump file into a minimal reproducible example and attached it as a script to this ticket. If anybody experienced anything similar, please help.
#!/usr/bin/env bash

DUMP_FILE=$(mktemp)
trap "rm -f $DUMP_FILE" EXIT

# The minimal reproducible SQL dump (elided in this post) is fed in via stdin:
cat - > "$DUMP_FILE"

echo "Testing with postgres 13.13" >&2

docker run -d \
       --name postgres-13.13 \
       -e POSTGRES_HOST_AUTH_METHOD=trust \
       -p 5432:5432 \
       postgres:13.13


echo "Waiting for postgres to start" >&2
while ! docker exec postgres-13.13 pg_isready -h localhost -U postgres -q; do
    sleep 1
done

cat "$DUMP_FILE" | psql -h localhost -U postgres -v ON_ERROR_STOP=1 --port 5432 -e -1 && echo "******** Success ********" || echo "******** Failure ********"


docker stop postgres-13.13
docker rm postgres-13.13

echo "Testing with postgres 13.14" >&2

docker run -d \
       --name postgres-13.14 \
       -e POSTGRES_HOST_AUTH_METHOD=trust \
       -p 5432:5432 \
       postgres:13.14

echo "Waiting for postgres to start" >&2
while ! docker exec postgres-13.14 pg_isready -h localhost -U postgres -q; do
    sleep 1
done

cat "$DUMP_FILE" | psql -h localhost -U postgres -v ON_ERROR_STOP=1 --port 5432 -e -1 && echo "******** Success ********" || echo "******** Failure ********"

docker stop postgres-13.14
docker rm postgres-13.14
--------

**UPD:** What I've already tried: setting SET jit = off; doesn't fix the problem.

**UPD2:**

1. I tried exporting our database using pg_dump, instead of the CloudSQL export API. It gave me the same error.
2. I tried to export the database, load it into 13.13, then export it from 13.13 and load it into 13.14, but the error was the same again.

---

**UPD:** I successfully migrated the DB with the following script: https://paste.ubuntu.com/p/kgGGQzNcgp/ After migrating to PostgreSQL 17.5, the issue persists. If I dump the DB with pg_dump, I cannot load it; the error is the same.
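For readers who cannot run the Docker script, the failing pattern can be sketched in plain SQL. This is an assumed reconstruction from the error message, not the asker's actual dump; uuid_increment is presumed to be defined earlier in the dump, and the constant's body here is a placeholder:

```sql
-- With SET check_function_bodies = false (which pg_dump emits), this
-- CREATE succeeds even though label_id_constant() does not exist yet:
CREATE FUNCTION public.label_id(u uuid) RETURNS uuid
    LANGUAGE sql IMMUTABLE
    AS $$ SELECT public.uuid_increment($1, public.label_id_constant()) $$;

-- Any invocation of label_id() at this point (e.g. an index expression
-- or generated column evaluated while the dump's data loads) fails with:
--   ERROR: function public.label_id_constant() does not exist
--   CONTEXT: SQL function "label_id" during inlining

-- The dump defines the dependency only later:
CREATE FUNCTION public.label_id_constant() RETURNS integer
    LANGUAGE sql IMMUTABLE
    AS $$ SELECT 1 $$;  -- placeholder body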
Renat (101 rep)
Jan 21, 2025, 08:12 AM • Last activity: Aug 5, 2025, 01:46 PM
1 votes
1 answers
2188 views
Got message: FATAL: expected SASL response, got message type 88
I have a strange situation. A small program had been working fine for about half a year. The program (NodeJS) creates local connections to the DB: 1st, it creates a new connection to the local db; 2nd, when a client connects to this program by websocket, the program creates a new connection for that client. In NodeJS both connections are created the same way, like:

1st: const client = new PgClient(connection string)
2nd: const networkClient = new PgClient(connection string)

Now the 2nd connection produces this error in the PostgreSQL logs - FATAL: expected SASL response, got message type 88 - while the 1st connection works fine. Everything worked before. What could the reason be? I've googled, but strangely there are no results for exactly this error message.

**UPD:** It seems I found the reason; it's in the code. The connect method requires await.
noszone (111 rep)
Aug 2, 2022, 04:20 AM • Last activity: Jul 30, 2025, 01:04 PM
6 votes
1 answers
1035 views
PostgreSQL predicate not pushed down (through join conditions)
Consider the following data model in a PostgreSQL v13 system:

(figure: parent-child data model)

Here, parent table dim contains a small set of reference data, and child table fact contains a much higher volume of records. A typical use case for these data sets would be to query all fact::value data belonging to a dim::name. Note that dim::name holds a UNIQUE constraint.

While I think this is a very common scenario, I was somewhat taken aback that the style of queries I've been using for years on other RDBMSes (Oracle, MSSQL) didn't perform _at all_ on PostgreSQL the way I imagined they would. That is, when querying a dataset (fact) using a highly selective, but implicit, predicate (fact::dim_id eq X) through a join condition, I expect the index on fact::dim_id to be used (in a nested loop). Instead, a hash join is used, requiring a full table scan of fact.

**Question:** is there some way I can nudge the query planner into considering any predicate I issue on a joined relation to not need a full table scan? (without impacting other DB loads)

To illustrate the problem with an example, these tables are populated with some random data;
CREATE TABLE dim(
  id       SERIAL NOT NULL
, name     TEXT   NOT NULL
, CONSTRAINT pk_dim PRIMARY KEY (id)
, CONSTRAINT uq_dim UNIQUE (name)
);

CREATE TABLE fact(
  id        SERIAL  NOT NULL
, dim_id    INTEGER NOT NULL
, value     TEXT
, CONSTRAINT pk_fact PRIMARY KEY (id)
, CONSTRAINT fk_facts_dim FOREIGN KEY (dim_id) REFERENCES dim (id)
);

CREATE INDEX idx_fact_dim ON fact(dim_id);

INSERT INTO dim(name)
SELECT SUBSTRING(md5(random()::TEXT) FOR 5)
FROM   generate_series(1,50)
UNION
SELECT 'key';

INSERT INTO fact(dim_id, value)
SELECT (SELECT id FROM dim ORDER BY random() LIMIT 1)
,      md5(random()::TEXT)
FROM   generate_series(1,1000000);

ANALYZE dim;
ANALYZE fact;
EXPLAIN ANALYZE
SELECT f.*
FROM   fact AS f
JOIN   dim  AS d
       ON (d.id = f.dim_id)
WHERE  d.name = 'key';       -- Note: UNIQUE

                                                              QUERY PLAN                                                              
--------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1001.65..18493.29 rows=20588 width=41) (actual time=319.331..322.582 rows=0 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Hash Join  (cost=1.65..15434.49 rows=8578 width=41) (actual time=306.193..306.195 rows=0 loops=3)
         Hash Cond: (f.dim_id = d.id)
         ->  Parallel Seq Scan on fact f  (cost=0.00..14188.98 rows=437498 width=41) (actual time=0.144..131.050 rows=350000 loops=3)
         ->  Hash  (cost=1.64..1.64 rows=1 width=4) (actual time=0.138..0.139 rows=1 loops=3)
               Buckets: 1024  Batches: 1  Memory Usage: 9kB
               ->  Seq Scan on dim d  (cost=0.00..1.64 rows=1 width=4) (actual time=0.099..0.109 rows=1 loops=3)
                     Filter: (name = 'key'::text)
                     Rows Removed by Filter: 50
 Planning Time: 1.059 ms
 Execution Time: 322.662 ms
Now, we execute the same query, but instead of filtering using an inner join, we filter using a scalar subquery;
EXPLAIN ANALYZE
SELECT *
FROM   fact
WHERE  dim_id = (SELECT id FROM dim WHERE name = 'key');

                                                         QUERY PLAN                                                          
-----------------------------------------------------------------------------------------------------------------------------
 Index Scan using idx_fact_dim on fact  (cost=2.07..15759.53 rows=524998 width=41) (actual time=0.096..0.097 rows=0 loops=1)
   Index Cond: (dim_id = $0)
   InitPlan 1 (returns $0)
     ->  Seq Scan on dim  (cost=0.00..1.64 rows=1 width=4) (actual time=0.046..0.054 rows=1 loops=1)
           Filter: (name = 'key'::text)
           Rows Removed by Filter: 50
 Planning Time: 0.313 ms
 Execution Time: 0.156 ms
As shown, the performance difference is huge. Somehow, the query planner did not consider the predicate on the unique dim::name attribute to be equivalent to a predicate on fact::dim_id in the first query.
Michiel T (161 rep)
Jan 15, 2021, 10:55 PM • Last activity: Jul 25, 2025, 03:10 AM
1 votes
1 answers
158 views
Postgres 13 Sort Performance
Sort performance on one of our queries is so bad that it takes up to 14 seconds to run. Here is the query:
SELECT "stock_quant".id 
FROM "stock_quant" 
WHERE ((((("stock_quant"."qty" > 0.0)  
      AND "stock_quant"."reservation_id" IS NULL )  
      AND ("stock_quant"."location_id" in (34)))
      AND ("stock_quant"."product_id" = 203330))  
      AND ("stock_quant"."company_id" = 5)) 
ORDER BY "stock_quant"."in_date" ,"stock_quant"."id"   
limit 10;
When I run EXPLAIN, this is what Postgres says:

explain (analyze,buffers) SELECT "stock_quant".id FROM "stock_quant" WHERE ((((("stock_quant"."qty" > 0.0) AND "stock_quant"."reservation_id" IS NULL ) AND ("stock_quant"."location_id" in (34))) AND ("stock_quant"."product_id" = 203330)) AND ("stock_quant"."company_id" = 5)) ORDER BY "stock_quant"."in_date" ,"stock_quant"."id" limit 10;
Limit  (cost=0.56..4723.78 rows=10 width=12) (actual time=15754.259..15754.260 rows=0 loops=1)
    Buffers: shared hit=9988201 read=94226
    ->  Index Scan using stock_quant_multisort_idx on stock_quant  (cost=0.56..1923768.25 rows=4073 width=12) (actual time=15754.257..15754.257 rows=0 loops=1)
                 Filter: ((reservation_id IS NULL) AND (qty > '0'::double precision) AND (location_id = 34) AND (product_id = 203330) AND (company_id = 5))
                 Rows Removed by Filter: 24052667
                 Buffers: shared hit=9988201 read=94226  
Planning Time: 0.291 ms  
Execution Time: 15754.288 ms 
(8 rows)
explain SELECT "stock_quant".id FROM "stock_quant" WHERE ((((("stock_quant"."qty" > 0.0) AND "stock_quant"."reservation_id" IS NULL ) AND ("stock_quant"."location_id" in (34))) AND ("stock_quant"."product_id" = 203330)) AND ("stock_quant"."company_id" = 5)) ORDER BY "stock_quant"."in_date" ,"stock_quant"."id" limit 10;
Limit  (cost=0.56..4723.82 rows=10 width=12)
    ->  Index Scan using stock_quant_multisort_idx on stock_quant  (cost=0.56..1923781.40 rows=4073 width=12)
                 Filter: ((reservation_id IS NULL) AND (qty > '0'::double precision) AND (location_id = 34) AND (product_id = 203330) AND (company_id = 5)) (3 rows)
And here are the indexes on the table:

    "stock_quant_pkey" PRIMARY KEY, btree (id)
    "stock_quant_company_id_index" btree (company_id)
    "stock_quant_location_id_index" btree (location_id)
    "stock_quant_lot_id_index" btree (lot_id)
    "stock_quant_multisort_idx" btree (in_date, id)
    "stock_quant_owner_id_index" btree (owner_id)
    "stock_quant_package_id_index" btree (package_id)
    "stock_quant_product_id_index" btree (product_id)
    "stock_quant_product_location_index" btree (product_id, location_id, company_id, qty, in_date, reservation_id)
    "stock_quant_propagated_from_id_index" btree (propagated_from_id)
    "stock_quant_qty_index" btree (qty)
    "stock_quant_reservation_id_index" btree (reservation_id)

work_mem is set to **512MB**.

Any idea what needs to be changed? Without the sort, the same query executes in less than 200ms.

Update: EXPLAIN ANALYZE without ORDER BY:

explain (analyze,buffers) SELECT "stock_quant".id FROM "stock_quant" WHERE ((((("stock_quant"."qty" > 0.0) AND "stock_quant"."reservation_id" IS NULL ) AND ("stock_quant"."location_id" in (34))) AND ("stock_quant"."product_id" = 203330)) AND ("stock_quant"."company_id" = 5)) limit 10;

                                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.56..33.76 rows=10 width=4) (actual time=0.661..0.662 rows=0 loops=1)
   Buffers: shared hit=2 read=2
   ->  Index Scan using stock_quant_product_location_index on stock_quant  (cost=0.56..13524.04 rows=4074 width=4) (actual time=0.660..0.660 rows=0 loops=1)
         Index Cond: ((product_id = 203330) AND (location_id = 34) AND (company_id = 5) AND (qty > '0'::double precision) AND (reservation_id IS NULL))
         Buffers: shared hit=2 read=2
 Planning:
   Buffers: shared hit=248 read=16
 Planning Time: 7.005 ms
 Execution Time: 0.691 ms
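A remedy worth trying (my suggestion, not from the thread): the chosen index stock_quant_multisort_idx on (in_date, id) matches only the ORDER BY, so the scan has to filter ~24M rows hoping to find 10 matches. An index whose leading columns cover the equality filters and whose trailing columns match the ORDER BY lets the scan stop after the first 10 matching rows:

```sql
-- Hypothetical index; the partial predicate folds in the IS NULL filter.
-- qty > 0.0 would still be applied as a filter on top of it.
CREATE INDEX stock_quant_filter_sort_idx
    ON stock_quant (product_id, location_id, company_id, in_date, id)
    WHERE reservation_id IS NULL;
```

With this in place the planner can satisfy both the WHERE clause and the ORDER BY ... LIMIT 10 from one index scan, instead of choosing between the filter-friendly and the sort-friendly index.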
Abul Hassan (11 rep)
Oct 27, 2021, 03:56 AM • Last activity: Jul 12, 2025, 06:04 AM
0 votes
1 answers
157 views
Is there any possibility to delete all data directory of PostgreSQL by PgPool?
Recently I found that suddenly all data directories were deleted, both primary and standby. I am assuming that it happened due to failover. Example: suppose there are 4 nodes where 1 is primary and the other 3 are standby. Suppose that, due to failover, nodes 1 and 2 end up as primary at the same time, both somehow in active mode with the other two on standby. Is it possible for Pgpool to delete the data directories of all 4 PostgreSQL nodes, primary and standby? I need an expert opinion.
Sheikh Wasiu Al Hasib (283 rep)
Sep 14, 2023, 06:24 PM • Last activity: Jul 11, 2025, 01:03 PM
0 votes
2 answers
88 views
What will be the root cause for the incident?
I’m facing a strange issue with PostgreSQL query performance. The same query runs in under 1 second during certain periods, but takes around 20 seconds at other times. However, the data volume and other parameters remain the same across both time periods. If a vacuum has been performed on the table, I would expect consistent performance throughout the day. But in this case, the query runs in 1 second during one half of the day and 20 seconds during the other half. How can this be handled? What could be the root cause of this behavior?
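One way to get at the root cause (a suggested diagnostic, not a known fix): log the actual execution plans of the slow runs with the auto_explain module and compare them with a manual EXPLAIN ANALYZE of a fast run. A plan flip, e.g. after autoanalyze refreshes statistics, is a common cause of this exact "fast half the day, slow the other half" pattern.

```sql
-- auto_explain ships with PostgreSQL; these settings log the full
-- EXPLAIN ANALYZE of any statement slower than 5 seconds in this session.
-- (Set them in postgresql.conf with session_preload_libraries to cover
-- all sessions.)
LOAD 'auto_explain';
SET auto_explain.log_min_duration = '5s';
SET auto_explain.log_analyze = on;
SET auto_explain.log_buffers = on;
```

If the logged slow plan differs from the fast plan, the statistics or parameter-sensitive plan choice is the culprit; if the plans are identical, look instead at contention (locks, I/O, checkpoints) during the slow window.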
Suruthi Sundararajan (9 rep)
Jun 20, 2025, 09:53 AM • Last activity: Jul 3, 2025, 08:22 AM
0 votes
1 answers
1140 views
How to conditionally recreate indexes created by `CREATE INDEX ON <table_name> (<column_name>)`?
I am squashing migrations and rewriting all constructs like:
CREATE TABLE entities (
  id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  title text NOT NULL,
  description text
);

CREATE INDEX ON entities (title);
CREATE INDEX ON entities (description);
Into:
CREATE TABLE IF NOT EXISTS entities (
  id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  title text NOT NULL,
  description text
);

CREATE INDEX IF NOT EXISTS ON entities (title);
CREATE INDEX IF NOT EXISTS ON entities (description);
When I run the squash I get a syntax error on the CREATE INDEX IF NOT EXISTS ON entities (title); line. After squinting harder at the [docs reference](https://www.postgresql.org/docs/13/sql-createindex.html) I noticed IF NOT EXISTS does in fact require a name after it.
So what is the right way to conditionally recreate "default" indexes like this?
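One answer consistent with the docs page cited above: since IF NOT EXISTS requires an explicit index name, spell out the names PostgreSQL would otherwise have generated (by default tablename_column_idx):

```sql
CREATE INDEX IF NOT EXISTS entities_title_idx ON entities (title);
CREATE INDEX IF NOT EXISTS entities_description_idx ON entities (description);
```

The names must match what the original anonymous CREATE INDEX statements produced in existing databases; if an index was ever renamed or the auto-generated name was deduplicated (e.g. entities_title_idx1), IF NOT EXISTS will not see it and will create a duplicate index.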
Biller Builder (288 rep)
Aug 10, 2023, 02:23 PM • Last activity: Jun 24, 2025, 06:08 AM
1 votes
1 answers
216 views
Selecting related data without selecting foreign key from subquery
cities

| id | name  | county_id |
|----|:-----:|----------:|
| 1  | City1 | 1         |
| 2  | City2 | 1         |
| 3  | City3 | 2         |
| 4  | City4 | 2         |

counties

| id | name    |
|----|:-------:|
| 1  | County1 |
| 2  | County2 |

Hello,
I would like to select all city names from the related county by entering a city name.
I can do it via a subquery, but is there any other way to achieve this?
SELECT name
FROM cities
WHERE cities.county_id =
      (SELECT county_id
       FROM cities
       WHERE name = 'City1'
       LIMIT 1);

result => "City1, City2"
Thank you.
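For comparison, the same result can be written as a self-join instead of a subquery (a sketch against the sample tables above):

```sql
SELECT c2.name
FROM cities AS c1
JOIN cities AS c2 ON c2.county_id = c1.county_id
WHERE c1.name = 'City1';
-- returns City1 and City2
```

Note the behavior differs slightly from the LIMIT 1 subquery if city names are not unique: the join would match every county containing a city of that name.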
rjeremy (13 rep)
Sep 9, 2021, 12:08 PM • Last activity: Jun 20, 2025, 10:51 AM
1 votes
1 answers
67 views
PostgreSQL 13 on Windows – “invalid page in block 0 of relation pg_index” – How to recover?
I'm facing a serious issue with my PostgreSQL 13 instance on Windows. I'm unable to open a database in pgAdmin and receive the following error:

ERROR: **invalid page in block 0 of relation base/16394/2610**

After investigation, I found that this OID (2610) corresponds to the critical system catalog **pg_index**. From what I understand, this catalog is essential for managing all indexes, so corruption here is catastrophic.

My Setup
- PostgreSQL Version: 13
- OS: Windows 10

Symptoms
- pgAdmin fails to open the database
- psql throws an error or hangs when trying to list relations
- Any attempt to REINDEX crashes due to pg_index corruption
- Running pg_dump on the entire database fails early

What I've Tried
- Identified the corrupted relation using oid2name
- Confirmed the corrupted catalog is pg_index
- Considered rebuilding the DB from scratch using salvaged schema and table data

I'd like to know:
- Is there any safe way to reconstruct or replace pg_index?
- Can I recover more data using pg_dirtyread or other low-level tools?
- Should I avoid trying REINDEX SYSTEM given that pg_index is corrupted?
- Any known method to surgically patch or recreate system catalogs?

I do not have a recent physical or logical backup (lesson learned 😔), so any suggestions to extract usable data or rebuild the DB from working fragments would be hugely appreciated. Thanks so much for any guidance you can provide.
Dinesh Kumar (11 rep)
Jun 17, 2025, 01:44 AM • Last activity: Jun 17, 2025, 06:27 AM
1 votes
1 answers
46 views
Postgres Database Free Storage Space Downward
I have a situation on my Postgres 13 db on AWS.

* 8TB of storage
* 60GB of memory that it isn't really using
* I regularly check to see if any query is running and if so, kill it.
* Transaction Log Disk Usage in blue, replication slot disk usage in orange. (chart: Transaction Log Disk Usage)
* Replication lag is over 500GB behind
* FreeStorageSpace is sawtoothing (chart: FreeStorageSpace sawtooth)
* I combed the logs looking for "timeout" and "error" on the replication publisher and subscriber, in hopes that wal_sender_timeout and wal_receiver_timeout needed adjustment; however, I see nothing.
* I see the weird LSN behavior where it advances, the machine runs out of space, and then it rewinds as if to start all over again.
-- 3851F/6305C2D0 7:46
-- 38521/3829A280 7:50
-- 38535/9FC44768 8:38
-- 38544/8E82F3D8 9:12
-- 3854F/8BD52F00 9:39pm
-- 38504/9EABBE48 6:29am <- rollover
-- 3851D/9413C9D0 6:11pm
* The amount of data sent between the publisher and subscriber is actually very low when looking at the AWS network sent/received charts.
* I actually had 2 subscribers at the beginning. I took the smaller one offline since it's easily repairable. That helped with the storage pressure, but only a tiny bit.

Questions:

* I don't know for certain that it's the replication that's causing the free space to drop. Is there a way I can verify for certain?
* I don't think Postgres 13 has parallel replication, from what I saw in some documentation. Is there a way to fake it?
* Is dropping the subscription, killing the replication slot on the publisher, truncating the tables on the subscriber, and creating a new subscription my only option here?
* Since I have at least 2TB of free storage when the drop begins, and the replication is ~500GB behind, I would think that whatever the publisher has to send would fit in storage. Should I increase storage in hopes that it can overcome the problem? I don't have a sense of how high it should be increased. Can it be scaled back down after the fact?
* Are there any suggestions as to keywords I should look for in the log files?

Any responses are greatly appreciated.
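On the first question (verifying that replication retention is what eats the space): the publisher itself can report how much WAL each replication slot is pinning. A diagnostic sketch, run on the publisher:

```sql
-- WAL retained on behalf of each replication slot; a large and growing
-- retained_wal for an inactive or lagging slot explains shrinking free space.
SELECT slot_name,
       active,
       restart_lsn,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;
```

If retained_wal tracks the FreeStorageSpace sawtooth, the slot is the cause; if it stays small while free space still drops, look elsewhere (temp files, bloat, logs).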
mj_ (347 rep)
Jun 7, 2025, 11:46 PM • Last activity: Jun 9, 2025, 02:39 PM
0 votes
0 answers
39 views
Why can a SELECT query produce a "permission denied for schema X" error when permission IS granted for that schema?
I have spent a couple hours digging into documentation, online blogs, and the particular permission settings for this DB. My best lead is https://stackoverflow.com/a/28503453/5419599 (more on that later).

First, the error message from the Postgres error log, lightly redacted:

> my_app_user@my_db::ERROR: permission denied for schema organization at character 417

\dn organization shows, among other things, that the owner is postgres and the permission string my_app_user=U/postgres.

The query being run (which does appear in the log) is extremely complicated because it's generated under the hood by some TypeScript library and has a lot of __local_0__ and __local_3__ aliases, etc., and a lot of calls to to_json and json_build_object. But fundamentally it's just a SELECT query. Here is a simplified version that ignores some of the complexity but keeps the references that may be relevant:
select
  ... ,
  to_json(
    select
      json_build_object(
        ...
      ) as object
    from "organization"."organization" as __local_1__
    where ...
  ) as "@redactedAlias1",
  to_json(
    with __local_2__ as (
      select ...
      from "redacted_schema_name"."custom_function_2"(__local_0__) as __local_3__
      where (TRUE) and (TRUE)
    ), ...
    select
      ...    
  ) as "@redactedAlias2",
  to_json(
    with __local_5__ as (
      select ...
      from "redacted_schema_name"."custom_function_3"(__local_0__) as __local_6__
      where ...
    ), ...
    select ...
  ) as "@redactedAlias3"
from "redacted_schema_name"."custom_function_1"() as __local_0__
where (not (__local_0__ is null)) and (TRUE) and (TRUE)
(Character 417 seems to align with where "organization"."organization" appears.)

The custom functions are all SQL functions with "invoker" security (the default). The only possibility I currently know of for where this error might be coming from is the Stack Overflow answer I linked, which states:

> There is a foreign key in a table referring to a table in the schema in question, to which the table owner role does not have access granted. Foreign key checks are done with the permissions of the role that owns the table, not the role performing the query.
>
> The query is actually doing the internal foreign key check.

The organization.organization table in my case does have an awful lot of foreign keys which reference it. However, the table and the schema are both owned by the postgres user, and the postgres user also owns all the other schemas and tables which have references to organization.organization. (Note, the postgres user is NOT the DB superuser but for most purposes might as well be.)

I was trying to dig up documentation to confirm the above linked answer, i.e. to confirm what permissions are used when checking foreign key constraints and when this check is done, and I was unable to find any.

What could be causing this schema permission error? And, if foreign key constraints could indeed be relevant here, where is the documentation about how they are checked or how table owner permissions relate to foreign key constraint checking?
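To rule the linked answer's foreign-key theory in or out systematically, the owners of every referencing table and their schema privilege can be listed in one catalog query (a diagnostic sketch, not a known fix):

```sql
-- For each foreign key referencing organization.organization, show the
-- referencing table's owner and whether that owner has USAGE on the schema.
SELECT c.relname  AS referencing_table,
       pg_get_userbyid(c.relowner) AS owner,
       has_schema_privilege(c.relowner, 'organization', 'USAGE') AS owner_has_usage
FROM pg_constraint con
JOIN pg_class c ON c.oid = con.conrelid
WHERE con.contype = 'f'
  AND con.confrelid = 'organization.organization'::regclass;
```

Any row with owner_has_usage = false would match the Stack Overflow answer's scenario; if every row is true, the foreign-key theory can likely be discarded and attention should shift to the functions' search paths and the role actually executing them.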
Wildcard (587 rep)
Jun 4, 2025, 03:23 AM
3 votes
1 answers
276 views
How to "merge" rows along with their foreign many-to-many relations without violating unique constraints?
Fiddle: https://dbfiddle.uk/-JLFuIrN ## Table
CREATE TABLE files (
  id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  name text
);

CREATE TABLE folders (
  id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  name text
);

CREATE TABLE file_folders (
  id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  file_id bigint NOT NULL REFERENCES files,
  folder_id bigint NOT NULL REFERENCES folders,
  UNIQUE (file_id, folder_id)
);
## Query
/*
  Merges
*/

WITH targets AS (
  SELECT 
    ARRAY (
      SELECT
        id
      FROM
        folders TABLESAMPLE BERNOULLI (50)
      LIMIT 3
    ) AS folders
),
-- basically a setup to ensure unique target/folder pairs
-- and no targets in the merges
input_folders AS (
  SELECT
    folders.id AS folder_id,
    random_array_element(targets.folders) AS target_id
  FROM
    folders
    CROSS JOIN
    targets
  WHERE
    NOT ( 
      folders.id = ANY (targets.folders)
    ) 
),
input_files AS (
  SELECT
    file_folders.id,
    file_folders.folder_id,
    file_folders.file_id,
    input_folders.target_id
  FROM
    input_folders
    INNER JOIN
    file_folders
    ON
      input_folders.folder_id = file_folders.folder_id
      OR
      input_folders.target_id = file_folders.folder_id
),
deleted_files AS (
    WITH deletions AS (
    SELECT
      inputs.id
    FROM
      input_files AS inputs
      INNER JOIN
      input_files AS targets
      ON
        NOT (inputs.folder_id = targets.target_id)
        AND
        inputs.file_id = targets.file_id
  )
  DELETE
  FROM
    file_folders
  WHERE
    id IN (
      SELECT
        id
      FROM
        deletions
    )
),
merged_files AS (
  WITH merges AS (
    SELECT
      inputs.id,
      inputs.folder_id,
      inputs.target_id
    FROM
      input_files AS inputs
      INNER JOIN
      input_files AS targets
      ON
        NOT (inputs.folder_id = targets.target_id)
        AND
        NOT (inputs.file_id = targets.file_id)
  )
  UPDATE file_folders
  SET
    folder_id = merges.target_id
  FROM
    merges
  WHERE
    merges.id = file_folders.id
),
deleted_folders AS (
  DELETE
  FROM
    folders
  WHERE
    id IN (
      SELECT DISTINCT
        folder_id
      FROM
        input_folders
    )
)
SELECT
  folders AS targets
FROM
  targets
;
## Inputs The array-transforming setup is me trying to replicate the JSON input of the application in pure SQL. The input looks like this:
interface IQueryInput extends Array<IMergeInput> {};

interface IMergeInput {
  target: IEntityID;
  inputs: IEntityID[];
};

// postgresql bigints are treated as strings in the application
type IEntityID = string;
So the prepping query from above can be replaced with:
WITH inputs AS (
  SELECT
    input.*
  FROM
    -- the application interpolates JSON there
    json_to_recordset($inputs$$inputs$) AS input(
      target bigint,
      inputs bigint[]
    )
),
input_folders AS (
  SELECT
    inputs.target AS target_id,
    merge.folder_id
  FROM
    inputs
    CROSS JOIN
    UNNEST(inputs.inputs) AS merge(
      folder_id
    )
)
It must run as a batch operation, so the application provides these guarantees for the query input:
- all target values are unique.
- all inputs concatenated result in unique values.
- target values do not intersect with concatenated inputs.

Therefore input_folders always ends up as unique target_id-folder_id pairs. The query is run as a background task, so speed and memory are of secondary importance. The main requirement is that of a typical transaction: either it goes through completely on success or rolls back completely on any error.

## The problem

I want to "merge" several folders into a single folder. So given a target_id and an array of folder_ids, replace all foreign references to the folder_ids with target_id and remove the non-target folders afterwards.
This however becomes an issue in relations table with unique constraints, since after updating the references there are duplicates.
So I went down this path:

1. Select all relation rows related to the query, so all file_folders with target_ids and folder_ids in them.
2. Separate them into two categories:
   - Deletes - the rows which would result in dupes when updated.
   - Merges - the rows which would not.
3. Delete the delete candidates.
4. Update the merge candidates.
5. Repeat the previous 4 steps for all relations.
6. Delete the rows in folders with folder_ids.

However, I still stumble upon a unique key violation error.
"Merge" is in quotes because it doesn't look like what I am trying to do can be accomplished by [merge in docs](https://www.postgresql.org/docs/15/sql-merge.html) and it requires a newer version of postgresql anyway.
Biller Builder (288 rep)
Jan 13, 2023, 02:47 PM • Last activity: May 17, 2025, 05:08 PM
0 votes
1 answers
384 views
Does long running query cause replication delay in PostgreSQL which is in 'idle in transaction' state?
In my system I found the same query appearing more than 30 times in the '**idle in transaction**' state. Is there any way to identify the reason for 'idle in transaction'? Should I terminate those sessions manually using pg_terminate_backend(), or set **idle_in_transaction_session_timeout to 5min** in the case of a high-TPS DB? I think there is no reason for any query to stay idle for more than 24hr, but here I found some idle for more than 7 days.

Query below:

> SELECT current_setting('transaction_isolation');
> Query Output: read committed

How do I debug this type of issue to identify the actual cause of 'idle in transaction'?
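To see who is idling and what they last ran (pg_stat_activity keeps the most recent statement of an idle-in-transaction session in its query column), a starting point could be:

```sql
-- Sessions holding a transaction open without running anything;
-- idle_for shows how long, and last_statement is what they ran last.
SELECT pid, usename, xact_start, state_change,
       now() - state_change AS idle_for,
       query AS last_statement
FROM pg_stat_activity
WHERE state = 'idle in transaction'
ORDER BY xact_start;
```

The last statement usually identifies the application code path that opened the transaction and never committed; fixing that is preferable to relying solely on the timeout, though a generous idle_in_transaction_session_timeout is a sensible safety net.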
Sheikh Wasiu Al Hasib (283 rep)
Apr 17, 2024, 09:37 AM • Last activity: May 2, 2025, 08:02 PM
2 votes
2 answers
127 views
PostgreSQL: Finding last executed statement from an "idle in transaction timeout"
Is there any way to log or find the last executed statement from a session that terminated because of `idle in transaction timeout`? We only have `slow statement logging`, and that did not capture it; we don't want to enable `all statement logging` either, as this would have a bad impact on performance.
Is there any way to log or find the last executed statement from a session that terminated because of idle in transaction timeout? We only have slow statement logging, and that did not capture it; we don't want to enable all statement logging either, as this would have a bad impact on performance.
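The server log records the termination itself but not the preceding statement; however, right up until the timeout fires, the session's last statement is visible in pg_stat_activity, so one low-overhead option is to sample that view periodically (e.g. from cron) with a threshold just below the timeout. A sketch:

```sql
-- sample periodically and log the output; any session caught here is one the
-- idle-in-transaction timeout is about to terminate
SELECT now() AS sampled_at,
       pid,
       usename,
       application_name,
       now() - state_change AS idle_for,
       query AS last_statement
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND now() - state_change > interval '1 minute';  -- pick a threshold below the timeout
```

The sampling interval bounds what you can catch: anything that goes idle and gets terminated between two samples is still lost, so set the interval comfortably below the timeout.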
goodfella (595 rep)
Apr 30, 2025, 03:45 AM • Last activity: May 2, 2025, 04:12 AM
2 votes
1 answers
66 views
PG 13 write performance deteriorates after OS Upgrade Win Server 2012r2 to 2019 with LWLock WALWrite statuses
I've upgraded the OS on PG database server from Win Server 2012r2 Standard to 2019 Standard. Nothing else was changed. Now I see write throughput deteriorates greatly during peak update times with multiple connections in "LWLock WALWrite" status. Config: Postgresql 13.18 Hardware: Dell R740xd 768GB...
I've upgraded the OS on a PG database server from Win Server 2012r2 Standard to 2019 Standard. Nothing else was changed. Now I see write throughput deteriorate greatly during peak update times, with multiple connections in "LWLock WALWrite" status. Config: PostgreSQL 13.18. Hardware: Dell R740xd, 768GB RAM, 2 x 28-core CPUs (so 56 cores, 112 threads), SSD RAID 10 storage. Prior to the upgrade the server ran well with no problems; I don't believe I ever saw the connection status "LWLock WALWrite" under Win 2012r2. During the problem episodes, CPU usage halves suddenly for a few seconds, then returns, then repeats - it almost looks like a square wave. Removing writers clears things up. In general, CPU usage does not exceed 35%. Reviewing the changelog for lwlock.c, it appears that performance enhancements were made in 13.14 by Andres Freund, and also this change by Michael Paquier, which does not appear to have made it into the PG 13 branch. Questions: 1. Given that my ability to understand the PG source code is limited: does lwlock.c depend on an operating system primitive that could have been affected by the OS upgrade? In other words, is there an explanation for this issue that can be tied to the OS upgrade? 2. Is there anything I can try to mitigate this issue?
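On the mitigation side, a few WAL-related settings are known to affect WALWrite lock pressure and are worth A/B testing under the peak-update workload. The values below are illustrative starting points, not recommendations:

```
# postgresql.conf -- illustrative values; measure before and after each change
wal_buffers = 64MB        # more room between backends and the WAL writer
wal_compression = on      # smaller full-page images, fewer bytes to flush
commit_delay = 1000       # microseconds; encourages group commit under concurrency
commit_siblings = 5       # commit_delay only kicks in with this many active transactions
# synchronous_commit = off  # large win, but risks losing the last few commits on a crash
```

Each of these trades something (memory, CPU, latency, or durability) for fewer or cheaper WAL flushes, so test them one at a time against the observed WALWrite wait counts.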
sevzas (373 rep)
Apr 30, 2025, 12:31 PM • Last activity: May 1, 2025, 07:14 AM
2 votes
0 answers
37 views
PostgreSQL: How much memory should I allocate as ramdisk for "pg_stat_tmp"?
We have been facing a problem on an `Azure VM` running `PostgreSQL 13` where a recurring `disk availability` issue causes the `statistics collector process` to stall for a few seconds, followed by slow queries. After searching a few threads I found that moving `pg_stat_tmp` to a `ramdisk` could resol...
We have been facing a problem on an Azure VM running PostgreSQL 13 where a recurring disk-availability issue causes the statistics collector process to stall for a few seconds, followed by slow queries. After searching a few threads I found that moving pg_stat_tmp to a ramdisk could resolve this and maybe improve performance as well. What would be the best size to allocate for the ramdisk? Is there any way to calculate it? What could happen if the collector runs out of this allocated memory?
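For sizing, the current on-disk footprint of the directory is a reasonable lower bound (it grows roughly with the number of databases, tables and indexes, since the collector keeps one stats file per database). In PostgreSQL 13 the location is controlled by stats_temp_directory (this setting, and the collector itself, were removed in v15). The mount point below is illustrative:

```
# lower bound for the ramdisk: current usage, checked as the postgres OS user
#   du -sh "$PGDATA/pg_stat_tmp"

# postgresql.conf -- point the collector at a ramdisk
stats_temp_directory = '/var/run/postgresql/pg_stat_tmp'
```

If the ramdisk fills up, the collector logs write failures and statistics go stale, but data integrity is unaffected; allocating a few times the current du output is a common safety margin.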
goodfella (595 rep)
Apr 28, 2025, 04:45 AM
1 votes
1 answers
2451 views
How to recover Postgresq data from $PGDATA/base files after system crash
I was working on my Ubuntu machine and left it on to grab food. Upon returning, I found it had shut itself down. It was not a power failure, because I have a home backup system and the lights never go off in my house. When I turned it on, it went straight to Busybox because apparently the `fsck` tool had...
I was working on my Ubuntu machine and left it on to grab food. Upon returning, I found it had shut itself down. It was not a power failure, because I have a home backup system and the lights never go off in my house. When I turned it on, it went straight to Busybox because apparently the fsck tool had somehow moved the entire contents of the root partition, and everything else, to **lost+found** on my laptop's primary drive, which is an SSD. So I backed up the lost+found directory to an external HDD and installed Ubuntu 22, coming from Ubuntu 20. I did not lose personal data because I had my /home directory on my secondary HDD. However, everything else was on the same partition on the SSD. After perusing the lost+found files, I was able to extract all the files from /lib/postgresql/. Now, because PostgreSQL uses OIDs, unlike MySQL which uses names, I had to figure out the databases based on the information on this website: https://www.postgresql.fastware.com/blog/how-postgresql-maps-your-tables-into-physical-files For reference, I was able to recover the MySQL tables since they simply use table names. With PostgreSQL, however, I'm not sure whether just copying the **$PGDATA/base/** files will work, given the information mentioned on that website. Is it possible to extract the data (through some tool) from the base files, or to re-import them back into an active cluster/instance?
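For reference, the OID-to-file mapping can be read from any live cluster of the same major version, which helps label the recovered directories before attempting anything (the catalog and function below are built-in; only the table name is a placeholder):

```sql
-- one subdirectory of $PGDATA/base per database, named after this oid
SELECT oid, datname FROM pg_database;

-- file path of a table's main fork, relative to $PGDATA
SELECT pg_relation_filepath('some_table');
```

Copying base/ files into another cluster on their own generally does not work, because the system catalogs, pg_control and WAL must all match; the usual route is to reassemble the entire original $PGDATA under the same PostgreSQL major version, start that cluster, and dump the data out, or to fall back on a low-level inspection tool such as pg_filedump.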
TheRealChx101 (121 rep)
Jul 31, 2022, 12:20 AM • Last activity: Mar 22, 2025, 01:00 PM
1 votes
1 answers
117 views
Replay standby WAL to point in time without creating new timeline
We have a PostgreSQL 13 replication cluster where one of the standbys is setup with a `recovery_min_apply_delay`="x hr", which could be useful in the case of data corruption. For a scenario where an accidental delete happened on the *master server*, but not yet applied to the *standby server*. In th...
We have a PostgreSQL 13 replication cluster where one of the standbys is set up with recovery_min_apply_delay="x hr", which can be useful in the case of data corruption - for a scenario where an accidental delete happened on the *master server* but has not yet been applied to the *standby server*. In this case, I am trying to extract the data from the standby server by removing the *WAL replay delay* and adding recovery_target_time set to a point before the delete happened. Up to this point I am successful, but once I am done pulling data from the standby server to the master (I am thinking of postgres_fdw), I need to resume replication. However, when I run pg_wal_replay_resume(), it creates a new timeline and the standby is no longer in the replication cluster. Is there any way to replay WAL other than using recovery_target_time? To give an example of what I am trying to achieve: *let's say recovery_min_apply_delay=2hr and an accidental delete happened at 03:50 AM; at that time the transactions committed on the standby are those that happened on or before 01:50 AM. If I wait another 2hr and at 05:49 AM I am able to pause standby WAL replay, I will see a snapshot of the data from "right before" the DELETE. Then I will use postgres_fdw to pull the data from the standby.* How do I achieve "right before" in a precise, systematic way, while still being able to resume WAL replay afterwards?
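For comparison, pausing replay in place avoids setting a recovery target at all, so the standby never ends recovery and never switches timelines (function names as in PostgreSQL 13):

```sql
-- on the delayed standby: stop WAL replay where it currently is
SELECT pg_wal_replay_pause();

-- verify, and check how close the replayed snapshot is to the DELETE time
SELECT pg_is_wal_replay_paused(),
       pg_last_wal_replay_lsn(),
       pg_last_xact_replay_timestamp();

-- ... pull the needed rows out (e.g. via postgres_fdw from the primary) ...

-- continue replaying on the same timeline, still a standby
SELECT pg_wal_replay_resume();
```

Getting "right before" precisely then means pausing while pg_last_xact_replay_timestamp() is still earlier than the delete time, e.g. by polling it in a loop; the poll interval bounds the precision, since replay advances between samples.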
goodfella (595 rep)
Feb 17, 2025, 04:47 AM • Last activity: Feb 18, 2025, 09:56 AM
0 votes
1 answers
999 views
How to organize postgresql databases in pgadmin4?
I'm working with postgresql on my local machine with Windows 10, in pgadmin4 I can create new Server Groups and new Servers, if I create 2 new server groups and select localhost for the host, I get all the databases I've created on my computer. Is there any way to split all the databases I create on...
I'm working with PostgreSQL on my local machine running Windows 10. In pgAdmin 4 I can create new server groups and new servers, but if I create 2 new server groups and select localhost as the host, I get all the databases I've created on my computer in each. Is there any way to split the databases I create on my local machine into separate groups?
DumbMathBoy (1 rep)
Apr 30, 2022, 08:35 PM • Last activity: Jan 19, 2025, 06:02 PM
1 votes
1 answers
568 views
Parallel index-only scan with two sub-queries is slow
I'm trying to analyze a slow query that uses a parallel index-only scan with two filters (using sub-queries). Specifically, I'm counting the number of 'open_questions' returned by either subplan1 OR subplan2. The query itself is generated by the Django ORM, so I can definitely see that there are places I can opt...
I'm trying to analyze a slow query that uses a parallel index-only scan with two filters (using sub-queries). Specifically, I'm counting the number of 'open_questions' returned by either subplan1 OR subplan2. The query itself is generated by the Django ORM, so I can definitely see places I can optimize it (e.g. the duplicated sub-sub-query with U0."final_answer" IN ('1st Option') and U0."final_answer" IS NULL). However, running explain (analyze, buffers) showed me that the culprit of the slowness is the Parallel Index Only Scan using the primary key of open_question. It over-estimated the returned ids at 19M+ records (the open_question table is roughly 19.5M rows).
SELECT
    COUNT(*) AS "__count"
FROM
    "open_question"
WHERE
    (
        "open_question"."id" IN (
            SELECT
                DISTINCT U0."id"
            FROM
                "open_question" U0
                INNER JOIN "book" U1 ON (U0."book_id" = U1."id")
                INNER JOIN "picture" U3 ON (U0."picture_id" = U3."id")
            WHERE
                (
                    NOT (
                        U0."final_answer" IN (
                            'DELETED',
                            'DELETED_NOT_ANSWERED',
                            'NO_ONE_ANSWER'
                        )
                        AND U0."final_answer" IS NOT NULL
                    )
                    AND U1."project_id" = '107e827e-346a-4178-bb53-7cbd2ff0d66c'::uuid
                    AND U3."picture_group_id" IN ('ff3d7383-f086-499d-b59c-09d49733e327'::uuid)
                    AND U3."color_schema_id" IN ('2ee66308-1b4b-4473-b2b8-717974ccbfdf'::uuid)
                    AND U0."book_id" IN (
                        'ef4ad3c8-8ff5-411b-8577-202212202103'::uuid,
                        '671d56b3-41f1-450b-afbf-202212191825'::uuid,
                        'c986ce5e-36ad-44d0-a2ab-202212132131'::uuid,
                        ... 40 more records
                    )
                    AND (U0."final_answer" IN ('1st Option') 
                )
        )
        OR "open_question"."id" IN (
            SELECT
                V0."open_question_id"
            FROM
                "scoped_question" V0
            WHERE
                (
                    V0."date_submitted" IS NULL
                    AND V0."date_answered" IS NULL
                    AND V0."open_question_id" IN (
                        SELECT
                            DISTINCT U0."id"
                        FROM
                            "open_question" U0
                            INNER JOIN "book" U1 ON (U0."book_id" = U1."id")
                            INNER JOIN "picture" U3 ON (U0."picture_id" = U3."id")
                        WHERE
                            (
                                NOT (
                                    U0."final_answer" IN (
                                        'DELETED',
                                        'DELETED_NOT_ANSWERED',
                                        'NO_ONE_ANSWER'
                                    )
                                    AND U0."final_answer" IS NOT NULL
                                )
                                AND U1."project_id" = '107e827e-346a-4178-bb53-7cbd2ff0d66c'::uuid
                                AND U3."picture_group_id" IN ('ff3d7383-f086-499d-b59c-09d49733e327'::uuid)
                                AND U3."color_schema_id" IN ('2ee66308-1b4b-4473-b2b8-717974ccbfdf'::uuid)
                                AND U0."book_id" IN (
                                    'ef4ad3c8-8ff5-411b-8577-202212202103'::uuid,
                                    '671d56b3-41f1-450b-afbf-202212191825'::uuid,
                                    'c986ce5e-36ad-44d0-a2ab-202212132131'::uuid,
                                    ... 40 more records as above

                                )
                                AND U0."final_answer" IS NULL
                            )
                    )
                    AND V0."type" IN ('REVALIDATE_ANSWER')
                )
        )
    );
Running each of the sub-plans individually is very fast, and the final result of this whole long query is 2 (only 2 open_questions).
QUERY PLAN                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=479208.85..479208.86 rows=1 width=8) (actual time=10159.397..10164.554 rows=1 loops=1)
   Buffers: shared hit=18316732 read=77554
   I/O Timings: read=150757.132
   ->  Gather  (cost=479208.63..479208.84 rows=2 width=8) (actual time=10159.041..10164.548 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=18316732 read=77554
         I/O Timings: read=150757.132
         ->  Partial Aggregate  (cost=478208.63..478208.64 rows=1 width=8) (actual time=10154.771..10154.782 rows=1 loops=3)
               Buffers: shared hit=18316732 read=77554
               I/O Timings: read=150757.132
               ->  Parallel Index Only Scan using open_question_pkey on open_question  (cost=155.73..463070.55 rows=6055232 width=0) (actual time=5538.715..10154.776 rows=1 loops=3)
                     Filter: ((hashed SubPlan 1) OR (hashed SubPlan 2))
                     Rows Removed by Filter: 6471867
                     Heap Fetches: 4463629
                     Buffers: shared hit=18316732 read=77554
                     I/O Timings: read=150757.132
                     SubPlan 1
                       ->  Unique  (cost=74.48..74.49 rows=1 width=16) (actual time=330.591..330.599 rows=2 loops=3)
                             Buffers: shared hit=4993 read=595
                             I/O Timings: read=568.918
                             ->  Sort  (cost=74.48..74.49 rows=1 width=16) (actual time=330.590..330.595 rows=2 loops=3)
                                   Sort Key: u0.id
                                   Sort Method: quicksort  Memory: 25kB
                                   Buffers: shared hit=4993 read=595
                                   I/O Timings: read=568.918
                                   Worker 0:  Sort Method: quicksort  Memory: 25kB
                                   Worker 1:  Sort Method: quicksort  Memory: 25kB
                                   ->  Nested Loop  (cost=34.74..74.47 rows=1 width=16) (actual time=207.845..329.926 rows=2 loops=3)
                                         Buffers: shared hit=4976 read=595
                                         I/O Timings: read=565.144
                                         ->  Nested Loop  (cost=34.45..71.40 rows=1 width=32) (actual time=207.823..329.890 rows=2 loops=3)
                                               Buffers: shared hit=4956 read=595
                                               I/O Timings: read=565.144
                                               ->  Bitmap Heap Scan on picture u3  (cost=33.89..35.37 rows=1 width=16) (actual time=2.851..3.235 rows=118 loops=3)
                                                     Recheck Cond: ((picture_group_id = 'ff3d7383-f086-499d-b59c-09d49733e327'::uuid) AND (color_schema_id = '2ee66308-1b4b-4473-b2b8-717974ccbfdf'::uuid))
                                                     Heap Blocks: exact=118
                                                     Buffers: shared hit=411 read=7
                                                     I/O Timings: read=6.323
                                                     ->  BitmapAnd  (cost=33.89..33.89 rows=1 width=0) (actual time=2.830..2.831 rows=0 loops=3)
                                                           Buffers: shared hit=57 read=7
                                                           I/O Timings: read=6.323
                                                           ->  Bitmap Index Scan on picture_picture_group_id_ea80b794  (cost=0.00..12.02 rows=549 width=0) (actual time=2.221..2.221 rows=792 loops=3)
                                                                 Index Cond: (picture_group_id = 'ff3d7383-f086-499d-b59c-09d49733e327'::uuid)
                                                                 Buffers: shared hit=28 read=7
                                                                 I/O Timings: read=6.323
                                                           ->  Bitmap Index Scan on color_schema_id_c798104e  (cost=0.00..21.62 rows=2042 width=0) (actual time=0.567..0.568 rows=4973 loops=3)
                                                                 Index Cond: (color_schema_id = '2ee66308-1b4b-4473-b2b8-717974ccbfdf'::uuid)
                                                                 Buffers: shared hit=29
                                               ->  Index Scan using open_question_id_ff531421 on open_question u0  (cost=0.56..36.02 rows=1 width=48) (actual time=2.736..2.765 rows=0 loops=354)
                                                     Index Cond: (picture_id = u3.id)
                                                      Filter: ((final_answer = '1st Option'::text) AND ((final_answer <> ALL ('{DELETED,DELETED_NOT_ANSWERED,NO_ONE_ANSWER}'::text[])) OR (final_answer IS NULL)) AND (book_id = ANY ('{ef4ad3c8-8ff5-411b-8577-202212202103,671d56b3-41f1-450b-afbf-202212191825,c986ce5e-36ad-44d0-a2ab-202212132131, ...}'::uuid[])))
                                                     Rows Removed by Filter: 27
                                                     Buffers: shared hit=4545 read=588
                                                     I/O Timings: read=558.821
                                         ->  Index Scan using book_pkey on book u1  (cost=0.29..3.06 rows=1 width=16) (actual time=0.013..0.013 rows=1 loops=6)
                                               Index Cond: (id = u0.book_id)
                                               Filter: (project_id = '107e827e-346a-4178-bb53-7cbd2ff0d66c'::uuid)
                                               Buffers: shared hit=20
                     SubPlan 2
                       ->  Nested Loop  (cost=74.74..80.67 rows=1 width=16) (actual time=6.960..6.965 rows=0 loops=3)
                             Buffers: shared hit=4624
                             ->  Unique  (cost=74.32..74.33 rows=1 width=16) (actual time=6.959..6.964 rows=0 loops=3)
                                   Buffers: shared hit=4624
                                   ->  Sort  (cost=74.32..74.33 rows=1 width=16) (actual time=6.959..6.963 rows=0 loops=3)
                                         Sort Key: u0_1.id
                                         Sort Method: quicksort  Memory: 25kB
                                         Buffers: shared hit=4624
                                         Worker 0:  Sort Method: quicksort  Memory: 25kB
                                         Worker 1:  Sort Method: quicksort  Memory: 25kB
                                         ->  Nested Loop  (cost=34.74..74.31 rows=1 width=16) (actual time=6.950..6.953 rows=0 loops=3)
                                               Buffers: shared hit=4624
                                               ->  Nested Loop  (cost=34.45..71.34 rows=1 width=32) (actual time=6.949..6.952 rows=0 loops=3)
                                                     Buffers: shared hit=4624
                                                     ->  Bitmap Heap Scan on picture u3_1  (cost=33.89..35.37 rows=1 width=16) (actual time=0.686..0.927 rows=118 loops=3)
                                                           Recheck Cond: ((picture_group_id = 'ff3d7383-f086-499d-b59c-09d49733e327'::uuid) AND (color_schema_id = '2ee66308-1b4b-4473-b2b8-717974ccbfdf'::uuid))
                                                           Heap Blocks: exact=118
                                                           Buffers: shared hit=414
                                                           ->  BitmapAnd  (cost=33.89..33.89 rows=1 width=0) (actual time=0.667..0.668 rows=0 loops=3)
                                                                 Buffers: shared hit=60
                                                                 ->  Bitmap Index Scan on picture_picture_group_id_ea80b794  (cost=0.00..12.02 rows=549 width=0) (actual time=0.088..0.089 rows=792 loops=3)
                                                                       Index Cond: (picture_group_id = 'ff3d7383-f086-499d-b59c-09d49733e327'::uuid)
                                                                       Buffers: shared hit=33
                                                                 ->  Bitmap Index Scan on color_schema_id_c798104e  (cost=0.00..21.62 rows=2042 width=0) (actual time=0.538..0.539 rows=4973 loops=3)
                                                                       Index Cond: (color_schema_id = '2ee66308-1b4b-4473-b2b8-717974ccbfdf'::uuid)
                                                                       Buffers: shared hit=27
                                                     ->  Index Scan using open_question_id_ff531421 on open_question u0_1  (cost=0.56..35.97 rows=1 width=48) (actual time=0.050..0.050 rows=0 loops=354)
                                                           Index Cond: (picture_id = u3_1.id)
                                                            Filter: ((final_answer IS NULL) AND ((final_answer <> ALL ('{DELETED,DELETED_NOT_ANSWERED,NO_ONE_ANSWER}'::text[])) OR (final_answer IS NULL)) AND (book_id = ANY ('{ef4ad3c8-8ff5-411b-8577-202212202103,671d56b3-41f1-450b-afbf-202212191825,c986ce5e-36ad-44d0-a2ab-202212132131, ...}'::uuid[])))
                                                           Rows Removed by Filter: 27
                                                           Buffers: shared hit=4210
                                               ->  Index Scan using book_pkey on book u1_1  (cost=0.29..2.96 rows=1 width=16) (never executed)
                                                     Index Cond: (id = u0_1.book_id)
                                                     Filter: (project_id = '107e827e-346a-4178-bb53-7cbd2ff0d66c'::uuid)
                             ->  Index Scan using single_scoped_question_per_eta on scoped_question v0  (cost=0.42..3.37 rows=1 width=16) (never executed)
                                   Index Cond: (open_question_id = u0_1.id)
                                   Filter: ((date_submitted IS NULL) AND (type = 'RECALLED_ANSWER'::text))
 Planning:
   Buffers: shared hit=1491
 Planning Time: 4.680 ms
 Execution Time: 10165.867 ms
(101 rows)
What bothers me specifically is this part (node):
Parallel Index Only Scan using open_question_pkey on open_question  (cost=155.73..463070.55 rows=6055232 width=0) (actual time=5538.715..10154.776 rows=1 loops=3)
                     Filter: ((hashed SubPlan 1) OR (hashed SubPlan 2))
                     Rows Removed by Filter: 6471867
                     Heap Fetches: 4463629
                     Buffers: shared hit=18316732 read=77554
                     I/O Timings: read=150757.132
Why is Postgres so far off here? If I reduce the query to just counting the specific open questions (as returned from the 2 subqueries), it is very fast and uses the same index. I also suspect that, since Postgres over-estimates the number of rows returned by the filter (6,471,867 times 3 due to parallelism), it effectively does a kind of sequential scan over 19M records. I would like a better explanation of what's going on inside Postgres.
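One common workaround for a badly estimated `OR` of two `IN` subqueries is to rewrite it as a `UNION` of the two id sets and count the result, so the planner can drive each branch from its own subquery instead of filtering the entire primary key. A sketch against the question's tables (the two subqueries are abbreviated, not reproduced):

```sql
SELECT COUNT(*) AS "__count"
FROM (
    SELECT id FROM open_question
    WHERE id IN (/* sub-query 1 from the original query */)
    UNION   -- plain UNION (not UNION ALL) deduplicates ids matched by both branches
    SELECT id FROM open_question
    WHERE id IN (/* sub-query 2 from the original query */)
) AS matched;
```

Since each branch is expected to return only a handful of rows, the UNION node stays tiny and the 19M-row scan with its hashed-subplan filter disappears from the plan.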
Cowabunga (145 rep)
Jan 4, 2023, 10:23 PM • Last activity: Dec 30, 2024, 03:02 PM
Showing page 1 of 20 total questions