
Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

0 votes
1 answer
2658 views
Why doesn't the pg_wal directory get cleaned up during logical replication?
My main question: why is the pg_wal directory growing so much? I have two PostgreSQL instances. The first was set up as the master:
wal_level: "logical"
max_wal_senders: 5
max_replication_slots: 10
I also enabled archiving for pg_wal:
archive_mode: on
archive_command: 'gzip  /.../data/pg_wal_archive/%f'
After that, a PUBLICATION was created for some of my tables (not ALL TABLES). For the replica instance I use the default configuration and only created a SUBSCRIPTION to the master. I have two questions:
1. I ran a script on the master that loads ~10 GB of data into a table that is **not included** in the PUBLICATION, and I can see that the size of my pg_wal directory increases anyway. Why?
2. Why isn't the pg_wal directory being cleaned up? Is this normal? It looks like I'm keeping data to restore a table that is not used in the replication - how can I delete this data?
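For reference, a minimal diagnostic sketch (assuming PostgreSQL 10 and that the goal is to see what is pinning WAL): WAL records every change in the cluster regardless of what the publication filters, and segments are only removed once every replication slot has confirmed them and archive_command has succeeded, so the slots and the archiver status are the first things to check.

SELECT slot_name,
       active,
       restart_lsn,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_wal_bytes
FROM pg_replication_slots;

-- if archiving fails, segments also pile up waiting to be archived
SELECT archived_count, failed_count, last_failed_wal, last_failed_time
FROM pg_stat_archiver;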
vintik (1 rep)
Aug 12, 2019, 11:46 AM • Last activity: Aug 2, 2025, 11:04 AM
2 votes
2 answers
3240 views
Optimizing a recursive CTE or replacing it with a temporary table
I have a recursive query like this:
with recursive PCte(id) as 
(
    select p1.id from Product p1 where p1.parent_id is null and p1.serial_number in
    ('123', '124','125', 'n')
    union all
    select p2.id from Product p2 inner join PCte cte on p2.parent_id = cte.id
) 
select * 
from Product p 
left join Product psub on psub.parent_id = p.id 
where p.id in (select c.id from PCte c)
This query performs poorly when there is a large number of children. Are there any possible ways to optimize it? If it is possible to replace the recursive CTE with a temporary table, I would like to see an example. Thanks a lot.
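A rough sketch of the temporary-table variant, under the assumption that the table and column names above are the real ones: the recursion is unrolled into a loop that keeps adding the next level of children until nothing new is found.

CREATE TEMP TABLE pcte AS
SELECT p1.id
FROM Product p1
WHERE p1.parent_id IS NULL
  AND p1.serial_number IN ('123', '124', '125', 'n');

ALTER TABLE pcte ADD PRIMARY KEY (id);

DO $$
DECLARE
    inserted bigint;
BEGIN
    LOOP
        -- add children of everything collected so far; duplicates are skipped
        INSERT INTO pcte (id)
        SELECT p2.id
        FROM Product p2
        JOIN pcte cte ON p2.parent_id = cte.id
        ON CONFLICT (id) DO NOTHING;

        GET DIAGNOSTICS inserted = ROW_COUNT;
        EXIT WHEN inserted = 0;
    END LOOP;
END;
$$;

ANALYZE pcte;

SELECT *
FROM Product p
LEFT JOIN Product psub ON psub.parent_id = p.id
WHERE p.id IN (SELECT c.id FROM pcte c);

Whether this beats the recursive CTE depends mostly on whether the planner gets better row estimates for the temp table; the ANALYZE after the loop is there for that reason.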
Itan Reimbergh (21 rep)
Sep 28, 2021, 01:10 PM • Last activity: Aug 1, 2025, 02:06 AM
1 vote
1 answer
8731 views
postgresql: Invalid data directory | Can't open PID file /var/run/postgresql/10-main.pid (yet?) after start: No such file or directory
This error shows up after running sudo systemctl status postgresql@10-main.service:

postgresql@10-main.service - PostgreSQL Cluster 10-main
   Loaded: loaded (/lib/systemd/system/postgresql@.service; indirect; vendor preset: enabled)
   Active: failed (Result: protocol) since Tue 2020-12-01 14:21:40 UTC; 2s ago
  Process: 1603 ExecStart=/usr/bin/pg_ctlcluster --skip-systemctl-redirect 10-main start (code=exited, sta
Dec 01 14:21:40 ubuntu-s-1vcpu-2gb-sgp1-01 systemd: Starting PostgreSQL Cluster 10-main...
Dec 01 14:21:40 ubuntu-s-1vcpu-2gb-sgp1-01 postgresql@10-main: Error: Invalid data directory
Dec 01 14:21:40 ubuntu-s-1vcpu-2gb-sgp1-01 systemd: postgresql@10-main.service: Can't open PID file /va
Dec 01 14:21:40 ubuntu-s-1vcpu-2gb-sgp1-01 systemd: postgresql@10-main.service: Failed with result 'pro
Dec 01 14:21:40 ubuntu-s-1vcpu-2gb-sgp1-01 systemd: Failed to start PostgreSQL Cluster 10-main.

I have tried all the solutions out there. I can't even take a backup with pg_dump. I get the following error when running pg_dump database_name > database_backup.bak:

pg_dump: [archiver (db)] connection to database "database_name" failed: could not connect to server: No such file or directory
        Is the server running locally and accepting
        connections on Unix domain socket "/tmp/.s.PGSQL.5432"?

Here is the listing of permissions for /var/lib/postgresql/10/main:

-rwx------ 1 postgres postgres 3 Apr 28 2020 PG_VERSION
drwx------ 7 postgres postgres 4096 Nov 9 23:36 base
drwx------ 2 postgres postgres 4096 Nov 29 19:28 global
drwx------ 2 postgres postgres 4096 Apr 28 2020 pg_commit_ts
drwx------ 2 postgres postgres 4096 Apr 28 2020 pg_dynshmem
drwx------ 4 postgres postgres 4096 Nov 29 20:18 pg_logical
drwx------ 4 postgres postgres 4096 Apr 28 2020 pg_multixact
drwx------ 2 postgres postgres 4096 Nov 29 19:27 pg_notify
drwx------ 2 postgres postgres 4096 Apr 28 2020 pg_replslot
drwx------ 2 postgres postgres 4096 Apr 28 2020 pg_serial
drwx------ 2 postgres postgres 4096 Apr 28 2020 pg_snapshots
drwx------ 2 postgres postgres 4096 Nov 29 20:18 pg_stat
drwx------ 2 postgres postgres 4096 Apr 28 2020 pg_stat_tmp
drwx------ 2 postgres postgres 4096 Sep 2 00:07 pg_subtrans
drwx------ 2 postgres postgres 4096 Apr 28 2020 pg_tblspc
drwx------ 2 postgres postgres 4096 Apr 28 2020 pg_twophase
drwx------ 3 postgres postgres 4096 Nov 28 01:33 pg_wal
drwx------ 2 postgres postgres 4096 Apr 28 2020 pg_xact
-rwx------ 1 postgres postgres 88 Apr 28 2020 postgresql.auto.conf
-rwx------ 1 postgres postgres 95 Nov 29 19:27 postmaster.opts

And here is the output after running sudo -u postgres psql:

psql: could not connect to server: No such file or directory
        Is the server running locally and accepting
        connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?

Please assist me in fixing this. Any kind of help will be appreciated. Thanks in advance.
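A hedged first-pass checklist, assuming the stock Debian/Ubuntu packaging where the cluster configuration lives under /etc/postgresql/10/main: "Invalid data directory" from pg_ctlcluster usually means the configured path is missing, not owned by postgres, or not mode 700, so confirming those basics narrows things down.

# which data directory does the cluster expect? (Debian/Ubuntu default config path assumed)
grep data_directory /etc/postgresql/10/main/postgresql.conf

# the directory must exist, be owned by postgres:postgres and be mode 700
sudo ls -ld /var/lib/postgresql/10/main
sudo chown -R postgres:postgres /var/lib/postgresql/10/main
sudo chmod 700 /var/lib/postgresql/10/main

# try starting the cluster again and read the startup log for the real error
sudo pg_ctlcluster 10 main start
sudo tail -n 50 /var/log/postgresql/postgresql-10-main.log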
Farid Chowdhury (119 rep)
Dec 1, 2020, 04:10 PM • Last activity: Jul 27, 2025, 02:04 AM
0 votes
1 answer
2529 views
Getting "unknown variable postgresqlini" when updating PostgreSQL
I am getting an error when attempting to upgrade a PostgreSQL installation. The full error is:
>A data directory is registered in ***unknown variable postgresqlini*** but is neither empty or a recognized data directory

I've already had a look at [this issue](https://dba.stackexchange.com/questions/224062/install-postgresql-in-unattended-error), which seems to be the only other related one I can find, but it did not help me. As you can see, my paths are surrounded by quotation marks. Here is the full command that is run (split across multiple lines for readability):
"C:\ProgramData\Package Cache\44103BBE02C9C99CB868F4767B0CC3B5D50B6C09\postgresql-10.11-1-windows-x64.exe"
--unattendedmodeui minimal
--mode unattended
--servicename RM_Postgresql
--servicepassword xyz
--superpassword xyz
--prefix "C:\Program Files\\company\postgres"
--datadir "C:\Program Files\\company\data"
--disable-components stackbuilder
This command is part of an installer which launches the PostgreSQL installer. I've been running the same command for nearly 2 years without any issues (using different versions of PostgreSQL on different computers). For some reason it now seems to fail on one of our machines, and I am unsure why. The currently installed version of PostgreSQL on the computer where it is failing is 10.6-1, and I am attempting to update to 10.11-1, if that makes any difference. This now seems to cause issues with the previous installation as well:
initializeDatabase fail on opening postgres msg=could not connect to server: Connection refused (0x0000274D/10061)
	Is the server running on host "localhost" (::1) and accepting
	TCP/IP connections on port 5432?
could not connect to server: Connection refused (0x0000274D/10061)
	Is the server running on host "localhost" (127.0.0.1) and accepting
	TCP/IP connections on port 5432?
QPSQL: Unable to connect
ChrisMM (119 rep)
Feb 18, 2020, 12:30 PM • Last activity: Jul 21, 2025, 05:06 PM
1 vote
1 answer
157 views
SELECT operation with four joins results in bad performance
I'm having problems with a slow query. The purpose of the query is to get the IDs of doers for job X who haven't exceeded the specified limits on realizations and who are also within range of the possible job places.

**Resources**

- Query:

SELECT DISTINCT doers.id
FROM doers
JOIN doer_locations dl ON dl.doer_id = doers.id
JOIN job_places jp ON (jp.lat - 0.3147625620715557) < dl.lat
                  AND (jp.lng - 0.5001626620527362) < dl.lng
LEFT JOIN job_realizations jr ON jr.job_place_id = jp.id AND jr.status IN (1, 2, 3, 4)
LEFT JOIN job_realizations jrpd ON jrpd.job_place_id = jp.id AND jrpd.doer_id = doers.id AND jrpd.status IN (1, 2, 3, 4)
WHERE (jp.job_id = 1 AND doers.id IS NOT NULL)
GROUP BY doers.id, jp.id
HAVING COUNT(DISTINCT jr.id) < jp.realizations_per_place
   AND COUNT(DISTINCT jrpd.id) < jp.realizations_per_place_per_doer

- [Depesz explain](https://explain.depesz.com/s/jMHP)
- [Raw explain analyze](https://pastebin.com/aY4L33iY)
- [Simplified Schema](https://www.db-fiddle.com/f/r2LFERkLEoo9fKYeTKQ8S6/1)

**Consideration**

I'm not sure I'm reading the explain output correctly, but the query seems to lose performance especially where it calculates things on the fly; the HAVING COUNT(DISTINCT ...) also seems pretty expensive.

**Additional information**

The type of both the lat and lng columns is float.
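One direction worth sketching, purely as an untested assumption built on the query above (the comparison operators on the job_places join are taken to be <): pre-aggregate the realization counts once per place, and count the per-doer realizations in a LATERAL subquery, so the outer query no longer multiplies rows and then needs COUNT(DISTINCT ...) to undo it.

WITH per_place AS (
    SELECT job_place_id, count(*) AS total
    FROM job_realizations
    WHERE status IN (1, 2, 3, 4)
    GROUP BY job_place_id
)
SELECT DISTINCT d.id
FROM doers d
JOIN doer_locations dl ON dl.doer_id = d.id
JOIN job_places jp ON (jp.lat - 0.3147625620715557) < dl.lat
                  AND (jp.lng - 0.5001626620527362) < dl.lng
LEFT JOIN per_place pp ON pp.job_place_id = jp.id
LEFT JOIN LATERAL (
    -- realizations for this place by this specific doer
    SELECT count(*) AS total
    FROM job_realizations jr
    WHERE jr.job_place_id = jp.id
      AND jr.doer_id = d.id
      AND jr.status IN (1, 2, 3, 4)
) pd ON true
WHERE jp.job_id = 1
  AND coalesce(pp.total, 0) < jp.realizations_per_place
  AND coalesce(pd.total, 0) < jp.realizations_per_place_per_doer;

An index on job_realizations (job_place_id, doer_id, status) would support the lateral lookup.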
mist (29 rep)
Oct 1, 2018, 11:22 AM • Last activity: Jul 11, 2025, 05:06 AM
3 votes
1 answer
441 views
Installing the PostgreSQL parray_gin extension on macOS: include not found
I am trying to install parray_gin in PostgreSQL 10 on macOS, but I can't get it to build. I'm running pgxn install parray_gin. This first gave me a missing stdio.h error, so I installed the Xcode command line tools; now it's complaining about:

/Library/PostgreSQL/10/include/postgresql/server/utils/pg_locale.h:19:10: fatal error: 'unicode/ucol.h' file not found

After some googling it appeared I needed to install an extra library, which I did through brew install icu4c. This printed:

This formula is keg-only, which means it was not symlinked into /usr/local,
because macOS provides libicucore.dylib (but nothing else).

If you need to have this software first in your PATH run:
  echo 'export PATH="/usr/local/opt/icu4c/bin:$PATH"' >> ~/.bash_profile
  echo 'export PATH="/usr/local/opt/icu4c/sbin:$PATH"' >> ~/.bash_profile

For compilers to find this software you may need to set:
    LDFLAGS:  -L/usr/local/opt/icu4c/lib
    CPPFLAGS: -I/usr/local/opt/icu4c/include

But it didn't change a thing, and I have no idea how to pass /usr/local/opt/icu4c/include to pgxn.
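One possible route, sketched here as an assumption rather than a verified fix: skip pgxn's one-step install, fetch the extension, and hand the extra include path to the PGXS Makefile directly (PG_CPPFLAGS is appended to the compiler flags by PGXS; the icu4c path is the one Homebrew printed above).

# download and unpack the distribution instead of installing it in one step
pgxn download parray_gin
unzip parray_gin-*.zip
cd parray_gin-*

# build against the Homebrew icu4c headers, then install
make PG_CPPFLAGS="-I/usr/local/opt/icu4c/include" \
     PG_CONFIG=/Library/PostgreSQL/10/bin/pg_config
# if linking also fails, the library path Homebrew printed (-L/usr/local/opt/icu4c/lib) may need to be added as well
sudo make install PG_CONFIG=/Library/PostgreSQL/10/bin/pg_config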
Joris (131 rep)
Feb 28, 2018, 08:16 AM • Last activity: Jun 5, 2025, 08:05 PM
0 votes
1 answer
237 views
Actual used and free space in WAL segments
I am new to the PostgreSQL world. I have experience administering Microsoft SQL Server and Oracle databases. In SQL Server, we are able to find how much space is actually used in the pre-allocated log file. I wonder if we can do the same thing in PostgreSQL. How can we identify which WAL segments are actually full and which WAL segments are empty? I am currently using PostgreSQL version 10.
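A small sketch of how this is usually approached in PostgreSQL 10 (an illustration, not a definitive method): WAL is written strictly sequentially, so everything up to the current insert location is used, and files in pg_wal whose names sort after the current segment are recycled or pre-allocated segments that have not yet been filled with new WAL.

-- current write position and the segment file it falls into
SELECT pg_current_wal_lsn() AS current_lsn,
       pg_walfile_name(pg_current_wal_lsn()) AS current_segment;

-- list the segment files in pg_wal; names sorting after current_segment are reused/pre-allocated
SELECT name, size, modification
FROM pg_ls_waldir()
ORDER BY name;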
Rauf Asadov (1313 rep)
Jun 11, 2021, 08:44 AM • Last activity: May 31, 2025, 10:03 PM
1 vote
1 answer
249 views
PostgreSQL streaming replication slow on macOS
I am using PostgreSQL 10.1 on a Mac, on which I am trying to set up streaming replication. I configured both master and slave to be on the same machine. I find the streaming replication lag to be higher than expected on the Mac. The same test runs on a Linux Ubuntu 16.04 machine without much lag.

I have the following insert script:

for i in $(seq 1 1 1000)
do
   bin/psql postgres -p 8999 -c "Insert into $1 select tz, $i * 127361::bigint, $i::real, random()*12696::bigint from generate_series('01-01-2018'::timestamptz, '02-01-2018'::timestamptz, '30 sec'::interval)tz;"
   echo $i
done

The lag is measured using the following queries:

SELECT pg_last_wal_receive_lsn() - pg_last_wal_replay_lsn();
SELECT (extract(epoch FROM now()) - extract(epoch FROM pg_last_xact_replay_timestamp()))::int;

However, the observation is very unexpected. The lag keeps increasing from the moment the transactions start on the master:

Slave localhost_9001: 12680304 1
Slave localhost_9001: 12354168 1
Slave localhost_9001: 16086800 1
...
Slave localhost_9001: 3697460920 121
Slave localhost_9001: 3689335376 122
Slave localhost_9001: 3685571296 122
...
Slave localhost_9001: 312752632 190
Slave localhost_9001: 308177496 190
Slave localhost_9001: 303548984 190
...
Slave localhost_9001: 22810280 199
Slave localhost_9001: 8255144 199
Slave localhost_9001: 4214440 199
Slave localhost_9001: 0 0

It took around 4.5 minutes for a single client inserting into a single table to complete on the master, and another 4 minutes for the slave to catch up. Note that NO simultaneous selects are run other than the script that measures the lag.

I understand that replay in PostgreSQL is pretty simple, like "move a particular block to a location", but I am not sure about this behavior.

I have the following other configuration settings:

checkpoint_timeout = 5min
max_wal_size = 1GB
min_wal_size = 80MB

When I run the same tests with the same configuration on a Linux Ubuntu 16.04 machine, the lag is perfectly reasonable. Am I missing anything?

UPDATE:

Mac hardware configuration: 16 GB RAM, 512 GB SSD (NVMe), Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
Linux hardware configuration: 16 GB RAM, 512 GB SSD (NVMe), Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz
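For completeness, a hedged alternative measurement taken on the master side (column names as in PostgreSQL 10, where pg_stat_replication exposes replay_lsn; older naming is replay_location):

SELECT application_name,
       client_addr,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;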
krithikaGopalakrishnan (181 rep)
Dec 17, 2018, 02:30 PM • Last activity: May 24, 2025, 08:05 PM
0 votes
2 answers
324 views
Postgres file increment size
I'm COPYing 300 tables / 400 GB to PostgreSQL 10 on a Windows 2016 server, using a few threads to speed things up, but because the table files are extended in 64 kB units (or something thereabouts) they get fragmented on disk, which hurts read performance. Ideally I would like to make sure that files are expanded in larger increments (e.g. 8 MB); an alternative would be to make the initial size equal to the maximum segment size (1 GB). Is there any way to accomplish this?

And I know a single-threaded copy would solve the issue at hand, but I would still have the same problem after the copying, as tables grow continually and still only in small, fragmented chunks.

EDIT: As a "proof" of sorts, I created a simple table:

CREATE TABLE testtable (
    id bigint primary key,
    somestring varchar(4000)
);

DO $$
BEGIN
    FOR i IN 1..2000 LOOP
        INSERT INTO testtable VALUES (i, repeat('Jeg kan en sang der kan drive dig til vanvid.', 40));
    END LOOP;
END;
$$;

This produces a table of 4096 kB, which contig.exe reports to be in 25 fragments (in this instance), or 160 kB on average. For 4 MB this is not an issue; for 400 GB it is.
mikkel (238 rep)
Oct 23, 2017, 01:13 PM • Last activity: May 11, 2025, 11:04 PM
2 votes
2 answers
1133 views
How to extract the grants on default roles of PostgreSQL 10: pg_read_all_settings, pg_read_all_stats, pg_monitor and pg_signal_backend?
I have a Postgres instance of version 10 running on AWS RDS. I have a user there with default roles granted to it. I want to create the same user on my sandbox where the instance is of version 9.4. Thus, I want to know the grants that are included in these default roles of version 10.
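As far as I know, what these default roles allow is mostly hard-wired into the server rather than stored as ordinary GRANTs, so there is nothing to dump and replay on 9.4; the closest equivalent has to be rebuilt by hand there. What can be extracted is which of them a user actually holds - a sketch, with 'myuser' standing in for the user in question:

SELECT r.rolname AS default_role
FROM pg_auth_members m
JOIN pg_roles r ON r.oid = m.roleid
JOIN pg_roles u ON u.oid = m.member
WHERE u.rolname = 'myuser'
  AND r.rolname IN ('pg_read_all_settings', 'pg_read_all_stats',
                    'pg_monitor', 'pg_signal_backend');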
pricha (143 rep)
Jul 18, 2019, 08:32 AM • Last activity: May 4, 2025, 04:11 AM
2 votes
1 answer
1115 views
Any workarounds for lack of primary key on partitioned tables?
I am currently looking at implementing declarative partitioning with PostgreSQL 10. However, one sticking point is that we need a foreign key between two such partitioned tables, which is not supported. Is there any workaround, or can anyone suggest an alternative way of doing this? Or should we use table inheritance instead? Thanks.
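One commonly suggested workaround is to approximate the foreign key with a trigger; below is only a hedged sketch (table names parent_tbl/child_tbl and column parent_id are made up for illustration). Note that on PostgreSQL 10 row-level triggers cannot be attached to the partitioned parent itself, so the trigger has to be created on each partition, and this only guards inserts/updates on the referencing side - deletes on the referenced side would need a companion trigger.

CREATE OR REPLACE FUNCTION check_parent_exists() RETURNS trigger AS $$
BEGIN
    -- reject rows whose parent_id has no matching row in parent_tbl
    IF NOT EXISTS (SELECT 1 FROM parent_tbl p WHERE p.id = NEW.parent_id) THEN
        RAISE EXCEPTION 'parent_tbl.id = % does not exist', NEW.parent_id;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- repeat for every partition of the referencing table
CREATE TRIGGER child_fk_check
    BEFORE INSERT OR UPDATE ON child_tbl_part_2018_07
    FOR EACH ROW EXECUTE PROCEDURE check_parent_exists();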
J. Gallagher (29 rep)
Jul 12, 2018, 09:17 AM • Last activity: Apr 26, 2025, 03:09 PM
1 vote
1 answer
1047 views
Duplicate rows with primary key constraint in PostgreSQL 10.6
I have a table that's had CONSTRAINT table_pkey PRIMARY KEY (id) since setup. However, I recently found that I have around 100 duplicate rows (I ran select count(*) from (select count(*) from table group by id having count(*) > 1) as t1, as per this old Stack post). However, I'm running Postgres 10.6, which should be well past that bug. Any ideas?
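In case it helps to see the usual recovery path sketched out (this assumes the typical cause, a corrupted primary key index, and "table" stands in for the real table name; rehearse on a copy first):

-- 1. inspect the duplicates and where the physical rows live
SELECT id, count(*) AS copies, array_agg(ctid) AS row_locations
FROM "table"
GROUP BY id
HAVING count(*) > 1;

-- 2. keep one physical row per id and delete the others
DELETE FROM "table" a
USING "table" b
WHERE a.id = b.id
  AND a.ctid > b.ctid;

-- 3. rebuild the primary key index so uniqueness is enforced again
REINDEX INDEX table_pkey;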
mckennab (111 rep)
Jul 3, 2019, 11:33 PM • Last activity: Apr 18, 2025, 04:05 AM
0 votes
1 answer
485 views
Is renaming schemas more efficient / faster than drop/create in a single transaction
Given the constraint (from the customer side) of having only one *database*, how can I most effectively deploy new versions of data into a production-facing application? The new data is packaged into a (-Fc) dump file. This data is supposed to *replace* the existing data, so restoring it must be done by first dropping the existing tables and then restoring the new ones.

The naive approach of just importing the dump file with pg_restore and --single-transaction will lock the whole database, which will cause the application dependent on that data to halt/suspend queries.

My intended approach is to utilize schemas. Basically, the application will have its connection string set to use mydb.public. The import loads all new data into mydb.next. Then, rename mydb.public to mydb.old, rename mydb.next to mydb.public, and drop mydb.old.

Would this approach result in a shorter downtime than importing directly? I am not sure how renames of schemas work internally in Postgres, but from my surface-level understanding it should be a more efficient approach.
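A minimal sketch of the swap described above (schema names as in the question; an illustration, not a drop-in deployment script). ALTER SCHEMA ... RENAME TO is a catalog-only change, so the critical section is just two quick statements, though each still needs a brief exclusive lock and will wait behind sessions holding conflicting locks.

BEGIN;
ALTER SCHEMA public RENAME TO old;
ALTER SCHEMA next   RENAME TO public;
COMMIT;

-- the heavy part happens outside the critical path
DROP SCHEMA old CASCADE;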
F.P (207 rep)
Aug 26, 2021, 05:48 AM • Last activity: Mar 14, 2025, 09:07 AM
0 votes
1 answer
809 views
Fastest way to check if record is related to a particular user in postgres
While implementing a sharelikes table I added a procedure to add a new record for a like. I wanted to add a check before the insert that the item being liked does not belong to the user who submitted the like request. Given the function parameters Pshareid and Puserid, I used the following conditional in the procedure:

SELECT INTO checkuser true
FROM ks.shares
WHERE id = Pshareid AND userid = Puserid
LIMIT 1;

IF checkuser THEN
    -- Cannot like your own share
ELSE
    -- INSERT the new record
END IF;

In the SELECT described above the record will almost always not exist. Is that more efficient than a SELECT on the share id to return the record (which at that point very likely exists) and then checking the userid of the returned record?

SELECT userid INTO checkuser
FROM ks.shares
WHERE id = Pshareid
LIMIT 1;

IF checkuser = Puserid THEN
    -- Cannot like your own share
ELSE
    -- INSERT the new record
END IF;
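For comparison, a hedged sketch of folding the check into the insert itself, so only one statement runs either way (the likes table name and columns, ks.sharelikes(shareid, userid), are assumptions):

INSERT INTO ks.sharelikes (shareid, userid)
SELECT Pshareid, Puserid
WHERE NOT EXISTS (
    SELECT 1
    FROM ks.shares s
    WHERE s.id = Pshareid
      AND s.userid = Puserid
);
-- inside PL/pgSQL, FOUND (or GET DIAGNOSTICS ... ROW_COUNT) tells you whether the like was rejected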
Russell Ormes (193 rep)
May 29, 2018, 08:39 AM • Last activity: Mar 13, 2025, 11:03 AM
4 votes
1 answer
1418 views
Why is restart_lsn not updated, even on a checkpoint, in PostgreSQL?
> A logical slot will emit each change just once in normal operation. The current position of each slot is persisted only at checkpoint, so in the case of a crash the slot may return to an earlier LSN, which will then cause recent changes to be resent when the server restarts. Logical decoding clients are responsible for avoiding ill effects from handling the same message more than once. Clients may wish to record the last LSN they saw when decoding and skip over any repeated data or (when using the replication protocol) request that decoding start from that LSN rather than letting the server determine the start point.

Quoted from [here](https://www.postgresql.org/docs/10/logicaldecoding-explanation.html).

I'm implementing logical (streaming) replication. Thus far things are looking good, but I'm unable to understand why the restart_lsn entry is not changed. I.e., I do this:

SELECT * FROM pg_replication_slots;

 slot_name | plugin | slot_type | datoid | database | active | active_pid | xmin | catalog_xmin | restart_lsn
-----------+----------+-----------+--------+----------+--------+------------+------+--------------+-------------
 pgx_test | wal2json | logical | 12641 | postgres | t | 16045 | | 8544916 | 2/B1BC5330

Then I insert 2-3 records and check:

SELECT * FROM pg_stat_replication;

 pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent_location | write_location | flush_location | replay_location | sync_priority | sync_state
-------+----------+-----------------+------------------+-------------+-----------------+-------------+----------------------------------+--------------+-----------+---------------+----------------+----------------+-----------------+---------------+------------
 16315 | 1601109 | pgx_replication | | 127.0.0.1 | | 50112 | 2019-03-11 17:33:18.150179+05:30 | | streaming | 2/B1BCA0F8 | | | | 0 | async

The sent_location gets updated, but not restart_lsn, even though I ran a CHECKPOINT command and had set checkpoint_timeout = 1min; the restart_lsn still does not change:

SELECT * FROM pg_replication_slots;

 slot_name | plugin | slot_type | datoid | database | active | active_pid | xmin | catalog_xmin | restart_lsn
-----------+----------+-----------+--------+----------+--------+------------+------+--------------+-------------
 pgx_test | wal2json | logical | 12641 | postgres | t | 16315 | | 8544916 | 2/B1BC5330
(1 row)

Any reason why this happens?
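A hedged check that may help narrow this down (it assumes the decoding client is expected to send standby status updates): for a logical slot, restart_lsn can only move forward once the client acknowledges what it has flushed, so comparing it with confirmed_flush_lsn shows whether those acknowledgements are arriving at all.

SELECT slot_name,
       restart_lsn,
       confirmed_flush_lsn,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS wal_retained_bytes
FROM pg_replication_slots
WHERE slot_name = 'pgx_test';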
Noobie (165 rep)
Mar 11, 2019, 12:24 PM • Last activity: Mar 8, 2025, 05:03 AM
0 votes
1 answer
460 views
In AWS Aurora's performance insights. What qualifies as "Other" and how do I investigate it?
I have a disproportionate number of queries that are marked as "Other" in Performance Insights, with no additional details. AWS doesn't seem to provide any details about what this means, and asking on the forum doesn't seem to work. Is there a way to get details about what falls into "Other"?

[screenshot of the Performance Insights dashboard]
Kit Sunde (363 rep)
Apr 1, 2020, 05:22 AM • Last activity: Feb 25, 2025, 11:01 AM
0 votes
1 answer
1758 views
COPY command only works as command line argument, not as part of script
No rows are being copied when I try to copy a CSV from stdin. To be more specific, this *works when writing the command on the cli, but not when read as part of a file*. Why is reading SQL commands from a file different from executing them directly as an argument for -c?
cat extract.csv  | psql postgres -f copy-user.sql
Timing is on.
Line style is unicode.
Border style is 2.
Null display is "[NULL]".
Expanded display is used automatically.
COPY 0
Time: 41.685 ms

$ cat extract.csv  | psql postgres -c "$(cat copy-user.sql)"
Timing is on.
Line style is unicode.
Border style is 2.
Null display is "[NULL]".
Expanded display is used automatically.
COPY 10
Time: 31.685 ms
**copy-user.sql**
copy to_update from stdin (delimiter ';', format csv);
**extract.csv**
bfb92e29-1d2c-45c4-b9ab-357a3ac7ad13;test@test90239023783457843.com;x
aeccc3ea-cc1f-43ef-99ff-e389d5d63b22;tester@testerkjnaefgjnwerg.no;x
9cec13ae-c880-4371-9b1c-dd201f5cf233;bloblo@gmail.com;x
aeada2bc-a362-4f3e-80f2-06a717206802;veaat@gmail.com;x
fb85ddd8-7d17-4d41-8bc3-213b1e469506;navnnavnesen@foo.com;x
528e1f2e-1baa-483b-bc8c-85f993014696;kklk@hotmail.com;x
dbc8a9c1-56cf-4589-8b2c-cf1a2e0832ed;ghiiii@hotmail.com;x
fbf23553-baa2-410a-8f96-32b5c4deb0c7;lala@lala.no;x
e22ec0de-06f9-428a-aa3e-171c38f9a1f7;x2aa@gmail.com;x
8e8d0f73-8eb7-43b4-8019-b79042731b97;mail@mail.com;x
**table definition for to_update**
create table to_update(id text, email text, text char);
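As far as I understand psql's behaviour (hedged - worth verifying against the psql documentation for \copy), when the script is supplied with -f, COPY ... FROM STDIN reads its data from the same source the commands come from, i.e. the script file, which ends immediately, hence COPY 0; with -c the commands come from the argument string, so standard input is still the pipe. One workaround is to have the script read psql's own standard input explicitly via pstdin:

-- copy-user.sql (sketch): pstdin always refers to psql's standard input,
-- regardless of whether commands come from -f, -c or an interactive session
\copy to_update from pstdin with (delimiter ';', format csv)

Usage stays the same: cat extract.csv | psql postgres -f copy-user.sql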
oligofren (165 rep)
Oct 15, 2019, 08:44 PM • Last activity: Feb 5, 2025, 02:07 AM
0 votes
0 answers
38 views
Why does query planning time increase on a standby replica with increase in dead tuples but not on the primary?
We are using Postgres 10 with standby replication (read replica in recovery mode). For the same query, we observe significantly higher planning time on the replica compared to the primary. Execution time remains similar. The query is like so
SELECT *  
FROM emp, sessions  
WHERE emp.id = ''  
  AND emp.active = 't'  
  AND sessions.emp_id = emp.id;
- sessions table: ~250M records
- emp table: ~100M records
- Both primary and replica are on similar VMs (same RAM, HDD, processor, free space, etc.).

Observations:

- High number of dead tuples for the sessions table (~75k dead tuples added per day), constantly increasing.
- As dead tuples increase, planning time on the replica grows, but it remains constant on the primary.
- Running VACUUM (on the primary) temporarily fixes the issue, normalizing planning time on the replica. (We cannot run VACUUM on the replica, as it is in read-only/recovery mode.)
- Running ANALYZE on the primary does not help. The statistics seem to be the same.
- Dead tuples are created frequently because a maintenance job deletes records from the sessions table every 10 seconds. We're working on fixing this.
- Replication lag is minimal and doesn't appear to be a factor (the records are definitely present on the replica).

Actions we've taken so far:

- Running autovacuum more aggressively so that the dead tuple count does not climb as high.
- Fixing the DB maintenance job.

**The question remains: why does the same number of dead tuples not affect the primary's planning time but impact the replica's planning time?**

**Primary**
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=1.26..17.30 rows=1 width=451) (actual time=0.054..0.055 rows=1 loops=1)
   Buffers: shared hit=11
   ->  Index Scan using emp_pkey on emp a  (cost=0.69..8.71 rows=1 width=138) (actual time=0.036..0.037 rows=1 loops=1)
         Index Cond: (id = '123456'::text)
         Filter: active
         Buffers: shared hit=6
   ->  Index Scan using sessions_pkey on sessions g  (cost=0.57..8.59 rows=1 width=313) (actual time=0.015..0.015 rows=1 loops=1)
         Index Cond: (id = a.id)
         Buffers: shared hit=5
 Planning time: 23.652 ms
 Execution time: 0.096 ms
**Replica**
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=1.26..17.30 rows=1 width=451) (actual time=0.062..0.064 rows=1 loops=1)
   Buffers: shared hit=11
   ->  Index Scan using emp_pkey on emp a  (cost=0.69..8.71 rows=1 width=138) (actual time=0.045..0.046 rows=1 loops=1)
         Index Cond: (id = '123456'::text)
         Filter: active
         Buffers: shared hit=6
   ->  Index Scan using sessions_pkey on sessions g  (cost=0.57..8.59 rows=1 width=313) (actual time=0.014..0.014 rows=1 loops=1)
         Index Cond: (id = a.id)
         Buffers: shared hit=5
 Planning time: 854.801 ms
 Execution time: 0.102 ms
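To keep an eye on the dead-tuple build-up that correlates with the replica's planning time, a simple monitoring query on the primary could look like this (a sketch; it only reports the statistics, it does not explain the planning-time difference itself):

SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_vacuum,
       last_autovacuum
FROM pg_stat_user_tables
WHERE relname IN ('emp', 'sessions');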
Aftab (1 rep)
Jan 23, 2025, 03:21 PM
2 votes
1 answer
1109 views
Arbitrary queries on n:m relationship, including "all" and "any"
I'm using Postgres >= 9.6. I have tasks, tags and task_tags tables, for a typical n:m relationship between tasks and tags.

I'd like to be able to programmatically build queries against the tasks table that support conditions on the actual fields of tasks, but also on the tags (tag names) of a task. Queries on the task fields themselves are straightforward. Queries on the tags for a "does it have tag A?" are also straightforward. What I am struggling with is coming up with a select/query structure that would also allow me to match things like "does it have tag A and tag B?".

The best I've come up with is a lateral join on a subquery with an array aggregation, and then using array matching functions, e.g.:

SELECT DISTINCT ON (tasks.id) tasks.*
FROM tasks,
LATERAL (SELECT array_agg(tags.name) AS tags
         FROM task_tags
         INNER JOIN tags ON task_tags.tag_id = tags.id
         WHERE task_tags.task_id = tasks.id
         GROUP BY task_tags.task_id) tt
WHERE tt.tags @> array['tag1'::varchar, 'tag3'::varchar];

That way, it should be possible to programmatically build a WHERE clause (using tasks.* and tt.tags) satisfying all of the conditions of the user-provided "query". However, I'm not sure if this is the best way of doing it - thoughts? Is this query at all efficient? Is there any index I could create that would improve it?

Similarly, is there any way at all of making it work with wildcards against the tag names? Normal array matching wouldn't allow that, and solutions I've seen suggest using unnest (or, well, not using arrays in the first place), but then I'd lose the ability to say "it needs to have both tagA and tagB". Is there any other way of building a query on these relationships that would allow that kind of "both tagA and tagB" matching?
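One alternative shape that keeps the "must have ALL of these" semantics while allowing wildcards is to emit one EXISTS per required pattern and AND them together (OR for "any of"). This is only a sketch built on the table names above; whether it beats the array approach depends on the data, and it pairs naturally with an index on task_tags (task_id, tag_id) plus a trigram index on tags.name for the wildcard matches.

SELECT t.*
FROM tasks t
WHERE EXISTS (SELECT 1
              FROM task_tags tt
              JOIN tags g ON g.id = tt.tag_id
              WHERE tt.task_id = t.id
                AND g.name ILIKE 'tagA%')
  AND EXISTS (SELECT 1
              FROM task_tags tt
              JOIN tags g ON g.id = tt.tag_id
              WHERE tt.task_id = t.id
                AND g.name ILIKE 'tagB%');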
Alex Hornung (123 rep)
Dec 31, 2017, 09:34 AM • Last activity: Jul 22, 2024, 12:50 AM
14 votes
4 answers
6100 views
Postgres jsonb vs composite type performance differences
What considerations are involved in choosing between a jsonb column and a composite type column of the same structure? For example, consider a column like that used in the Postgres documentation: CREATE TYPE inventory_item AS ( name text, supplier_id integer, price numeric ); What are the tradeoffs involved between this approach vs a jsonb column mirroring this structure? For example, I suspect that the composite type won't require storing the key names for every record, whereas the jsonb type would require this.
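A quick way to make the storage difference concrete - just an illustrative probe, not a full benchmark - is to compare the datum sizes directly; the composite value stores only the field values, while the jsonb value also carries the key names:

SELECT pg_column_size(ROW('fuzzy dice', 42, 1.99)::inventory_item) AS composite_bytes,
       pg_column_size('{"name": "fuzzy dice", "supplier_id": 42, "price": 1.99}'::jsonb) AS jsonb_bytes;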
Metropolis (243 rep)
Jan 17, 2019, 04:07 PM • Last activity: Jun 17, 2024, 08:18 PM