
Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

2 votes
1 answer
237 views
Use case of ALTER EXTENSION ADD/DROP in upgrade script of extensions
I was recently looking at the `pg_stat_statements` upgrade script and found this set of commands:
/* First we have to remove them from the extension */
ALTER EXTENSION pg_stat_statements DROP VIEW pg_stat_statements;
ALTER EXTENSION pg_stat_statements DROP FUNCTION pg_stat_statements(boolean);

/* Then we can drop them */
DROP VIEW pg_stat_statements;
DROP FUNCTION pg_stat_statements(boolean);
I don't understand the use case of ALTER EXTENSION ADD/DROP here, as the extension works fine even after removing those lines. Can someone explain why those commands are needed if there is no strict requirement? Here is what I found in sql-alterextension.html:

> DROP member_object — This form removes a member object from the extension. This is mainly useful in extension update scripts. The object is not dropped, only disassociated from the extension.

I understand that it disassociates the object from the extension, but again, what's the need if we are going to drop it anyway?
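For anyone else puzzling over this, a sketch of the dependency behaviour *outside* an update script may help (messages paraphrased from a stock PostgreSQL install; details can vary by version). In a normal session, PostgreSQL refuses to drop an object that is still a member of an extension, so the disassociation step is what makes a targeted DROP legal at all; inside the extension's own update script the check is relaxed, which may be why the script still appears to work with those lines removed.

```sql
-- In a regular session (not an extension update script):
DROP VIEW pg_stat_statements;
-- ERROR:  cannot drop view pg_stat_statements because extension pg_stat_statements requires it
-- HINT:   You can drop extension pg_stat_statements instead.

-- After disassociating the object from the extension, the DROP succeeds:
ALTER EXTENSION pg_stat_statements DROP VIEW pg_stat_statements;
DROP VIEW pg_stat_statements;
```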
Ayush Vatsa (55 rep)
Feb 6, 2024, 08:49 AM • Last activity: Jun 8, 2025, 04:05 AM
0 votes
0 answers
40 views
Do autovacuum, vacuum, or analyze reset a table's index usage counters?
I am trying to get rid of unused indexes. Identifying index usage depends very much on the output of pg_stat_user_indexes.idx_scan (correct me if I am wrong). I am wondering what impact autovacuum/vacuum and autoanalyze/analyze have on the index usage counters. Will pg_stat_user_indexes.idx_scan be reset to zero every time autovacuum/vacuum or autoanalyze/analyze completes? I am worried about mistakenly removing useful indexes after the usage counters have been zeroed by some unknown process. Apart from calling pg_stat_reset or pg_stat_reset_single_table_counters, rebuilding an index, dropping and recreating an index, or changing column definitions, is there any other process that can change the counters? I hope someone can shed some light on this. Thanks in advance.
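For reference, a sketch of how these counters are usually read, and of the calls that explicitly zero them (`my_table` is a placeholder name); plain VACUUM and ANALYZE are not among them:

```sql
-- Read the cumulative counters (they survive VACUUM and ANALYZE):
SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
ORDER BY idx_scan ASC;

-- Calls that DO zero counters:
SELECT pg_stat_reset();  -- every statistics counter in the current database
SELECT pg_stat_reset_single_table_counters('my_table'::regclass);  -- one table and its indexes
```

A server crash can also discard the statistics file, which zeroes everything without anyone having called a reset function.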
toanwa
Aug 14, 2024, 02:40 AM • Last activity: Aug 14, 2024, 05:01 AM
0 votes
1 answer
249 views
PostgreSQL: Which operations make use of temp files?
We have time-series monitoring on both pg_stat_database and pg_stat_statements. Recently we have been observing temp file usage in our DB during certain periods, but for those same periods we are not able to find queries that in total add up to the usage reported by pg_stat_database. We have read that maintenance operations like VACUUM and ANALYZE can cause temp file usage, but we couldn't find such activity during those periods either. Are there any other internal operations that can cause temp file usage?
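One way to narrow this down is to compare the database-wide temp counters with the per-statement ones, and to log every temp file as it is created so the responsible backend shows up in the server log. A sketch (the ALTER SYSTEM step assumes superuser access and a config reload):

```sql
-- Database-wide temp file counters (cumulative):
SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS temp_bytes
FROM pg_stat_database;

-- Per-statement temp usage, as far as pg_stat_statements tracks it:
SELECT queryid, calls, temp_blks_written
FROM pg_stat_statements
ORDER BY temp_blks_written DESC
LIMIT 10;

-- Log the creation of every temp file, regardless of size:
ALTER SYSTEM SET log_temp_files = 0;
SELECT pg_reload_conf();
```

With log_temp_files = 0 each temp file is logged together with the statement that created it, which catches activity that pg_stat_statements does not attribute (e.g. utility commands, depending on track_utility).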
goodfella (595 rep)
May 21, 2024, 09:31 AM • Last activity: May 23, 2024, 07:00 AM
1 vote
1 answer
2945 views
pg_stat_statements slows down the database
I have an AWS Aurora PostgreSQL cluster running the PostgreSQL 13.7 engine. The cluster uses the default parameter group. These are the default settings for pg_stat_statements in the parameter group (I use this exact default parameter group for other clusters as well):

[screenshot: pg_stat_statements settings in the parameter group]

Basically nothing is changed from the default values. I use multiple Glue jobs to write to this Aurora cluster, and after a couple of runs I hit an issue with pg_stat_statements waits. If I do nothing about it, the cluster automatically restarts, reporting that database processes were killed due to the long runtime. If I restart the cluster, or if I run
SELECT pg_stat_statements_reset();
the pg_stat_statements view gets reset and everything runs as it should.

Note: if the database gets restarted, right after the restart I can run the same load (exactly the one that failed) with no issues.

Note 2: I have another cluster with the same settings (size and parameter group), running the same type of jobs, that does not behave like this.

[screenshot: run 1, with LWLock:pg_stat_statements in blue]

[screenshot: run 2, with LWLock:pg_stat_statements in yellow]

I have already adapted my Glue jobs to be less stressful for the cluster by lowering the number of DPUs and thereby running fewer parallel inserts. Any ideas on where to pick this up from are greatly appreciated.
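(Sidenote for anyone tuning this: the parameters below are the usual knobs for reducing pg_stat_statements overhead. The values are illustrative assumptions, not recommendations, and changing pg_stat_statements.max requires a restart.)

```
# postgresql.conf, or the equivalent entries in an RDS/Aurora parameter group
pg_stat_statements.max = 5000           # distinct statements tracked before eviction
pg_stat_statements.track = top          # only top-level statements, not nested ones
pg_stat_statements.track_utility = off  # skip utility commands (COPY, SET, ...)
pg_stat_statements.save = off           # don't persist counters across restarts
```

Heavy eviction churn (far more distinct statement texts than pg_stat_statements.max can hold) can be one source of LWLock contention on the extension's shared hash table, so raising max or normalizing the statements the jobs emit are two possible levers.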
mario (13 rep)
Dec 12, 2022, 04:50 PM • Last activity: Mar 13, 2023, 04:00 AM
3 votes
1 answer
3126 views
ERROR: invalid byte sequence for encoding "UTF8": 0x00 in pg_stat_statements
I'm attempting to use pg_stat_statements to optimise my queries but ran into an unexpected roadblock.
SELECT total_plan_time + total_exec_time as total_time, query
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 10;
results in `ERROR: invalid byte sequence for encoding "UTF8": 0x00`.

I looked into this, and it seems Postgres doesn't allow the NULL character `\x00` in text fields. Most of the existing advice online about this error is aimed at people who see it while inserting data into Postgres, in which case the fix is to filter out the null character prior to the insert. In my case the data is already IN Postgres, and it makes the view impossible to query. A `\d+ pg_stat_statements` tells me that the pg_stat_statements view is built by calling a function.

I've tried to get rid of the character using translate and replace, but no luck. Any idea how I can find the offending query with the NULL character? I'm assuming the long-term fix is to trace the query with the NULL character and figure out how it's getting in.

**What I've tried so far:**

- I verified that it's definitely bad data by doing a `SELECT pg_stat_statements_reset();`. The above query works immediately after that, for a short while.
- I did a `\ef pg_stat_statements` and it seems the first argument is a boolean called `showtext`. Passing `false` lets me query the view, which lets me do this:
SELECT total_plan_time + total_exec_time as total_time, queryid
FROM pg_stat_statements(false)
ORDER BY total_time DESC
LIMIT 10;
Unfortunately the queryid isn't very useful unless there's some way for me to retrieve the mapped query text without hitting this error. Suggestions on how to proceed are appreciated! I'm on PostgreSQL 13.4.
shrumm (43 rep)
Jan 19, 2022, 05:40 PM • Last activity: Nov 10, 2022, 09:40 AM
1 vote
1 answer
1220 views
Why does pg_stat_statements need to be included in shared_preload_libraries?
I am working on AWS RDS Postgres v9.6, and also v14. I noticed that our `shared_preload_libraries` parameter includes `pg_stat_statements`, and I don't understand why it needs to be there. The Postgres docs say this:

> The module must be loaded by adding pg_stat_statements to shared_preload_libraries in postgresql.conf, because it requires additional shared memory. This means that a server restart is needed to add or remove the module.

However, several other extensions are already built in; for example, we use dblink and hstore, and these do not need to be in shared_preload_libraries. Why does pg_stat_statements behave differently? I think shedding light on this would help me (and hopefully others) better understand how built-in libraries work, especially as regards the shared_preload_libraries parameter. The docs don't seem to say enough on this topic, or else I don't know where to find it.
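For reference, loading the module looks like this on a self-managed server (on RDS the same parameter is set in the DB parameter group); the extension is then created separately in each database where you want the view:

```
# postgresql.conf -- takes effect only after a full server restart
shared_preload_libraries = 'pg_stat_statements'
```

```sql
-- per database, to expose the view and functions:
CREATE EXTENSION pg_stat_statements;
```

The preload step is what allocates the module's shared-memory hash table and installs its executor hooks at startup; CREATE EXTENSION only creates the SQL-level objects that read from it.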
Mark Grobaker (23 rep)
Mar 10, 2022, 09:18 PM • Last activity: Mar 11, 2022, 09:13 AM
0 votes
0 answers
94 views
Unable to use equal operator in where clause on `queryid` from `pg_stat_statements`
When analyzing an issue on a PostgreSQL 13 server, we can successfully retrieve information by querying pg_stat_statements, but we are unable to retrieve a single record using `queryid = -1234567600` in the WHERE clause. So this does not work:
-- given that queryid -1234567600 really exists
select query, queryid from pg_stat_statements where queryid = -1234567600;
But the strange thing is, this DOES work:
SELECT                                                                          
    queryid,                                                                    
    query                                       
FROM                                                                            
    pg_stat_statements                                                          
where
    queryid=-1234567700;
I was under the impression that the former should also work. Am I misunderstanding things? Thanks!
Justin Zandbergen (1 rep)
Feb 10, 2022, 10:24 AM
0 votes
1 answer
3953 views
Analyzing queries with high disk IO
RDS Aurora PostgreSQL 10.14 instance, db.r5.4xlarge. I'm trying to figure out some high RDS IO costs on my machine. I'm looking at pg_stat_statements and asking whether the following query makes sense:

SELECT rolname::regrole,
       calls,
       round((total_time / 1000 / 60)::numeric, 3) AS total_minutes,
       round(((total_time / 1000) / calls)::numeric, 3) AS average_time_seconds,
       rows,
       userid,
       regexp_replace(query, '[ \t\n]+', ' ', 'g') AS query_text,
       100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent,
       pg_size_pretty((shared_blks_hit + shared_blks_read) * 8192) AS total_memory_read
FROM pg_stat_statements
JOIN pg_roles r ON r.oid = userid
WHERE calls > 1
  AND rolname NOT LIKE '%backup'
  AND rolname <> 'rdsadmin'
  AND rolname <> 'rdsproxyadmin'
ORDER BY 8 ASC NULLS LAST
LIMIT 5;

According to the documentation, hit_percent indicates how much of the data was fetched from cache (shared_buffers or the OS kernel) vs. the total data; the higher the number, the better. In addition, I have total_memory_read, which is the total read from both disk and cache. Here is an output I receive:

|rolname|calls|total_minutes|average_time_seconds|rows|userid|query_text           |hit_percent       |total_memory_read|
|-------|-----|-------------|--------------------|----|------|---------------------|------------------|-----------------|
|XXX    |8    |4.278        |32.085              |256 |20550 |SELECT some_query ...|44.915182913169814|420 GB           |

My questions:

1. Is total_memory_read really the amount of memory these 8 calls consumed? 420 GB seems huge.
2. If I multiply (1 - hit_percent) by total_memory_read, do I get the number of GB fetched from disk (and so end up with disk IO of roughly 231 GB)?
3. Are there any other suggestions on how to track high IO hogs?
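As a rough sanity check on question 2: `shared_blks_read` already counts only the blocks that were *not* found in shared_buffers, so multiplying (1 - hit_percent) by total_memory_read is just an indirect way of computing `shared_blks_read * 8192`. A sketch (the queryid below is a placeholder; and regarding question 1, these are cumulative block-read counters, not memory consumption):

```sql
-- Bytes read into shared_buffers from below it (OS cache or storage), per statement:
SELECT pg_size_pretty(shared_blks_read * 8192) AS read_below_shared_buffers
FROM pg_stat_statements
WHERE queryid = 1234567890;  -- placeholder queryid

-- With the figures in the question: (1 - 0.449) * 420 GB ≈ 231 GB.
```

Note this is an upper bound on physical disk reads, since some of those blocks may have been served from the OS page cache rather than the disk.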
Cowabunga (145 rep)
Feb 2, 2022, 09:48 PM • Last activity: Feb 5, 2022, 01:15 AM
9 votes
1 answer
6139 views
What is the performance impact of pg_stat_statements?
I am using `pg_stat_statements` to find the slow queries in my production PostgreSQL 13. Two things: - I am unsure of the performance impact of this extension. - Is there anything I can do to improve its performance? The query below takes > 1s - any suggestions? -- add the plugin create extension pg...
I am using pg_stat_statements to find the slow queries in my production PostgreSQL 13. Two things: - I am unsure of the performance impact of this extension. - Is there anything I can do to improve its performance? The query below takes > 1s - any suggestions?
-- add the plugin
create extension pg_stat_statements;
select pg_stat_reset();
select pg_stat_statements_reset();

and then:

-- find slow queries using the extension:
select * from pg_stat_statements order by total_exec_time desc limit 50;
Dolphin (939 rep)
Dec 4, 2021, 06:39 AM • Last activity: Dec 7, 2021, 12:13 PM
Showing page 1 of 9