Sample Header Ad - 728x90

Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

4 votes
1 answers
1946 views
Postgres 13/14 JIT execution increase execution time of query considerably
I have a complex DB query that I run on different systems with identical specs, but different OS (CentOS7 vs. Debian 11). The Postgres version is 13.7, but I also tried with 14.4. I get vastly different execution times for the query on both systems ( 9s). CentOS (removed all never executed blocks fo...
I have a complex DB query that I run on different systems with identical specs, but different OS (CentOS7 vs. Debian 11). The Postgres version is 13.7, but I also tried with 14.4. I get vastly different execution times for the query on both systems (9s). CentOS (removed all never executed blocks for brevity):
[
  {
    "Plan": {
      "Node Type": "Nested Loop",
      "Parallel Aware": false,
      "Join Type": "Inner",
      "Startup Cost": 34432047.36,
      "Total Cost": 34432049.49,
      "Plan Rows": 1,
      "Plan Width": 32,
      "Actual Startup Time": 0.012,
      "Actual Total Time": 0.034,
      "Actual Rows": 0,
      "Actual Loops": 1,
      "Output": [...],
      "Inner Unique": false,
      "Shared Hit Blocks": 1,
      "Shared Read Blocks": 0,
      "Shared Dirtied Blocks": 0,
      "Shared Written Blocks": 0,
      "Local Hit Blocks": 0,
      "Local Read Blocks": 0,
      "Local Dirtied Blocks": 0,
      "Local Written Blocks": 0,
      "Temp Read Blocks": 0,
      "Temp Written Blocks": 0,
      "Plans": [
        {
          "Node Type": "Nested Loop",
          "Parent Relationship": "Outer",
          "Parallel Aware": false,
          "Join Type": "Left",
          "Startup Cost": 92.30,
          "Total Cost": 94.39,
          "Plan Rows": 1,
          "Plan Width": 88,
          "Actual Startup Time": 0.012,
          "Actual Total Time": 0.027,
          "Actual Rows": 0,
          "Actual Loops": 1,
          "Output": [...],
          "Inner Unique": false,
          "Join Filter": "(acg.id = \"activityConflictGroup\".id)",
          "Rows Removed by Join Filter": 0,
          "Shared Hit Blocks": 1,
          "Shared Read Blocks": 0,
          "Shared Dirtied Blocks": 0,
          "Shared Written Blocks": 0,
          "Local Hit Blocks": 0,
          "Local Read Blocks": 0,
          "Local Dirtied Blocks": 0,
          "Local Written Blocks": 0,
          "Temp Read Blocks": 0,
          "Temp Written Blocks": 0,
          "Plans": [
            {
              "Node Type": "Seq Scan",
              "Parent Relationship": "Outer",
              "Parallel Aware": false,
              "Relation Name": "activity_conflict_group",
              "Schema": "production",
              "Alias": "activityConflictGroup",
              "Startup Cost": 0.00,
              "Total Cost": 1.01,
              "Plan Rows": 1,
              "Plan Width": 56,
              "Actual Startup Time": 0.012,
              "Actual Total Time": 0.012,
              "Actual Rows": 0,
              "Actual Loops": 1,
              "Output": [...],
              "Filter": "(\"activityConflictGroup\".state = 'ACTIVE'::text)",
              "Rows Removed by Filter": 1,
              "Shared Hit Blocks": 1,
              "Shared Read Blocks": 0,
              "Shared Dirtied Blocks": 0,
              "Shared Written Blocks": 0,
              "Local Hit Blocks": 0,
              "Local Read Blocks": 0,
              "Local Dirtied Blocks": 0,
              "Local Written Blocks": 0,
              "Temp Read Blocks": 0,
              "Temp Written Blocks": 0
               ...
    "Planning": {
      "Shared Hit Blocks": 4428,
      "Shared Read Blocks": 0,
      "Shared Dirtied Blocks": 0,
      "Shared Written Blocks": 0,
      "Local Hit Blocks": 0,
      "Local Read Blocks": 0,
      "Local Dirtied Blocks": 0,
      "Local Written Blocks": 0,
      "Temp Read Blocks": 0,
      "Temp Written Blocks": 0
    },
    "Planning Time": 60.923,
    "Triggers": [
    ],
    "Execution Time": 2.081
  }
]
Debian (removed all never executed blocks for brevity):
[
  {
    "Plan": {
      "Node Type": "Nested Loop",
      "Parallel Aware": false,
      "Join Type": "Inner",
      "Startup Cost": 13844811.29,
      "Total Cost": 13844814.63,
      "Plan Rows": 1,
      "Plan Width": 32,
      "Actual Startup Time": 9462.994,
      "Actual Total Time": 9463.419,
      "Actual Rows": 0,
      "Actual Loops": 1,
      "Output": [...],
      "Inner Unique": false,
      "Shared Hit Blocks": 1,
      "Shared Read Blocks": 0,
      "Shared Dirtied Blocks": 0,
      "Shared Written Blocks": 0,
      "Local Hit Blocks": 0,
      "Local Read Blocks": 0,
      "Local Dirtied Blocks": 0,
      "Local Written Blocks": 0,
      "Temp Read Blocks": 0,
      "Temp Written Blocks": 0,
      "Plans": [
        {
          "Node Type": "Nested Loop",
          "Parent Relationship": "Outer",
          "Parallel Aware": false,
          "Join Type": "Left",
          "Startup Cost": 268.43,
          "Total Cost": 271.73,
          "Plan Rows": 1,
          "Plan Width": 72,
          "Actual Startup Time": 9462.992,
          "Actual Total Time": 9463.058,
          "Actual Rows": 0,
          "Actual Loops": 1,
          "Output": [...],
          "Inner Unique": false,
          "Join Filter": "(acg.id = \"activityConflictGroup\".id)",
          "Rows Removed by Join Filter": 0,
          "Shared Hit Blocks": 1,
          "Shared Read Blocks": 0,
          "Shared Dirtied Blocks": 0,
          "Shared Written Blocks": 0,
          "Local Hit Blocks": 0,
          "Local Read Blocks": 0,
          "Local Dirtied Blocks": 0,
          "Local Written Blocks": 0,
          "Temp Read Blocks": 0,
          "Temp Written Blocks": 0,
          "Plans": [
            {
              "Node Type": "Seq Scan",
              "Parent Relationship": "Outer",
              "Parallel Aware": false,
              "Relation Name": "activity_conflict_group",
              "Schema": "production",
              "Alias": "activityConflictGroup",
              "Startup Cost": 0.00,
              "Total Cost": 1.01,
              "Plan Rows": 1,
              "Plan Width": 40,
              "Actual Startup Time": 9462.990,
              "Actual Total Time": 9462.991,
              "Actual Rows": 0,
              "Actual Loops": 1,
              "Output": [...],
              "Filter": "(\"activityConflictGroup\".state = 'ACTIVE'::text)",
              "Rows Removed by Filter": 1,
              "Shared Hit Blocks": 1,
              "Shared Read Blocks": 0,
              "Shared Dirtied Blocks": 0,
              "Shared Written Blocks": 0,
              "Local Hit Blocks": 0,
              "Local Read Blocks": 0,
              "Local Dirtied Blocks": 0,
              "Local Written Blocks": 0,
              "Temp Read Blocks": 0,
              "Temp Written Blocks": 0
            },
            ...
    "Planning": {
      "Shared Hit Blocks": 5017,
      "Shared Read Blocks": 199,
      "Shared Dirtied Blocks": 0,
      "Shared Written Blocks": 0,
      "Local Hit Blocks": 0,
      "Local Read Blocks": 0,
      "Local Dirtied Blocks": 0,
      "Local Written Blocks": 0,
      "Temp Read Blocks": 0,
      "Temp Written Blocks": 0
    },
    "Planning Time": 136.644,
    "Triggers": [
    ],
    "JIT": {
      "Functions": 1359,
      "Options": {
        "Inlining": true,
        "Optimization": true,
        "Expressions": true,
        "Deforming": true
      },
      "Timing": {
        "Generation": 129.432,
        "Inlining": 61.619,
        "Optimization": 5772.102,
        "Emission": 3625.487,
        "Total": 9588.641
      }
    },
    "Execution Time": 9635.694
  }
]
The main difference I see is that the Debian query is using JIT compilation. Both systems use the standard PG 13.7 settings and so conf key jit is set to on. However, SELECT pg_jit_available(); show that JIT is only available on the Debian system. It took me a while to find this out. The plan shows that JIT is adding roughly 9.5s to the whole query time. My "workaround" is to turn off JIT for the server. This really confuses me. Is this a number I have to expect or is the JIT setup on Debian messed up? Maybe I'm looking at an edge case here, because the query is very complex on paper, but turns out to be very simple because of the data in the DB. So my question is, if this is an expected behaviour and what to do about it to get it faster. My workaround of turning JIT off seems to be not optimal.
spa (223 rep)
Jul 21, 2022, 02:40 PM • Last activity: Jul 26, 2022, 09:20 AM
2 votes
0 answers
698 views
pg_jit_available showing 'false' and jit not working, but config file seems set up properly
I am trying to turn on JIT for testing purposes on a Postgresql 12 server, it was off before. My postgresql.conf has the following properties activated (basically all default values): - jit = on - jit_above_cost = 100000 - jit_inline_above_cost=500000 - jit_optimize_above_cost=500000 - jit_provider...
I am trying to turn on JIT for testing purposes on a Postgresql 12 server, it was off before. My postgresql.conf has the following properties activated (basically all default values): - jit = on - jit_above_cost = 100000 - jit_inline_above_cost=500000 - jit_optimize_above_cost=500000 - jit_provider = 'llvmjit' I have also restarted the Postgres server service after editing the config. Now when I run show jit I get on, however running SELECT pg_jit_available() returns false. From the test queries I made it does not seem to include any JIT stuff happening in the query plan neither. [The documentation](https://www.postgresql.org/docs/11/jit-configuration.html) is a bit scarce on how configuration is done properly - is there anything I have missed about my setup? Thanks!
bakamitai666 (21 rep)
Aug 18, 2021, 03:17 PM
17 votes
1 answers
13944 views
Handling performance problems with jit in postgres 12
**tl;dr:** What is the best way to handle performance decrease caused by jit? **Background**: Lately I have migrated from postgres 11 to 12 and noticed, that some queries / procedures run considerably slower. I did some research and tested different configurations - it turned out, that the jit overh...
**tl;dr:** What is the best way to handle performance decrease caused by jit? **Background**: Lately I have migrated from postgres 11 to 12 and noticed, that some queries / procedures run considerably slower. I did some research and tested different configurations - it turned out, that the jit overhead exceeds its gains. I see two possible solutions: 1. Disable jit permanently for all queries by running set jit = off; once. This helps, but does not feel natural. (Maybe because of changing the default configuration, which I did not much as there was no such need. The defaults are sane and there is low probability they need to be changed.) This would also mean throwing away possible performance gains when running long queries. 2. Disabling jit at the beginning of the procedure and enabling it at the end. I have objections about the impact of changing the configuration on queries running in parallel. Of course this can harm performance of log queries, that would run alongside, but maybe this can cause other, possibly hard to detect bugs. Also this seems even less right than the first solution - changing the configuration as a part of the procedure. The best way to handle this would be by fine-tuning the configuration, so that jit is not discarded in its entirety, but also configuration is not changed at runtime. Unfortunately I found little to no resource on how that would be possible (only parameters like jit_above_cost in the documentation that seem to be of little help). **Questions**: What is the best way to solve this issue? Are there any criteria about when to disable jit except for running and measuring? Can the behavior be fine-tuned (for example jit disabled for specific queries)? Edit: I'm attaching a plan with and without jit. The plans are too big to post here.
Andronicus (271 rep)
Apr 14, 2020, 11:38 AM • Last activity: Apr 14, 2020, 03:29 PM
4 votes
2 answers
9356 views
Queries on large database kill connection to the server, works with LIMIT
I'm trying to run queries on a large-ish database without killing the connection to the server. I'm using Postgres 12.1 on a mac with 16gb of memory, and about 40gb of free disk. The database is 78gb according to `pg_database_size` with the largest table being 20gb according do `pg_total_relation_si...
I'm trying to run queries on a large-ish database without killing the connection to the server. I'm using Postgres 12.1 on a mac with 16gb of memory, and about 40gb of free disk. The database is 78gb according to pg_database_size with the largest table being 20gb according do pg_total_relation_size. The error I get (from the log), regardless of which non-working query I run, is:
server process (PID xxx) was terminated by signal 9: Killed: 9
In VS code the error is "lost connection to server". Two examples that don't work are:
UPDATE table
SET column = NULL
WHERE column = 0;
select columnA
from table1
where columnA NOT IN (
select columnB
from table2
);
I can run some of the queries (the above one, for example) by adding a LIMIT of, say, 1,000,000. I suspected that I was running out of disk due to temp files, but in the log (with log_temp_files = 0), I can't see any temp files being written. I tried increasing and decreasing work_mem, maintenance_work_mem, shared_buffers, and temp_buffers. None worked, the performance was about the same. I tried dropping all indexes, which brought down the "cost" on some of the queries, but they still killed the connection to the server. What could be my problem and how can I troubleshoot this further? Additionally, I [read](https://www.postgresql-archive.org/Removing-pgsql-tmp-files-td3255748.html) that temp files from timed-out queries are stored in pqsql_tmp. I checked the folder, and it does not have files of significant size. Could the temp files be stored somewhere else? --- The posgtres log for running a failed query looks like:
2020-02-17 09:31:08.626 CET  LOG:  server process (PID xxx) was terminated by signal 9: Killed: 9
2020-02-17 09:31:08.626 CET  DETAIL:  Failed process was running: update table
        set columnname = NULL
        where columnname = 0;

2020-02-17 09:31:08.626 CET  LOG:  terminating any other active server processes
2020-02-17 09:31:08.626 CET  WARNING:  terminating connection because of crash of another server process
2020-02-17 09:31:08.626 CET  DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exi$
2020-02-17 09:31:08.626 CET  HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2020-02-17 09:31:08.626 CET  WARNING:  terminating connection because of crash of another server process
2020-02-17 09:31:08.626 CET  DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exi$
2020-02-17 09:31:08.626 CET  HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2020-02-17 09:31:08.629 CET  LOG:  all server processes terminated; reinitializing
2020-02-17 09:31:08.698 CET  LOG:  database system was interrupted; last known up at 2020-02-17 09:30:57 CET
2020-02-17 09:31:08.901 CET  LOG:  database system was not properly shut down; automatic recovery in progress
2020-02-17 09:31:08.906 CET  LOG:  invalid record length at 17/894C438: wanted 24, got 0
2020-02-17 09:31:08.906 CET  LOG:  redo is not required
--- Running EXPLAIN on the second example query returns:
Seq Scan on gas_prices_all p  (cost=459.93..5635583.33 rows=128975016 width=16)
  Filter: (NOT (hashed SubPlan 1))
  SubPlan 1
    ->  Seq Scan on gas_station g  (cost=0.00..423.14 rows=14714 width=16)
JIT:
  Functions: 13
  Options: Inlining true, Optimization true, Expressions true, Deforming true
FWIW I get the same error for queries with "cost" around zero. --- Update: I found a crash report from earlier today:
Process:               postgres 
Path:                  /Users/USER/*/postgres
Identifier:            postgres
Version:               0
Code Type:             X86-64 (Native)
Parent Process:        postgres 
Responsible:           postgres 
User ID:               502

Date/Time:             2020-02-18 11:16:26.210 +0100
OS Version:            Mac OS X 10.14.5 (18F132)
Report Version:        12
Anonymous UUID:        F41CCD21-C558-6CB0-316D-D1FF3E279576

Sleep/Wake UUID:       5F08EAEC-373A-4D19-A243-E812E68D2697

Time Awake Since Boot: 1600000 seconds
Time Since Wake:       5700 seconds

System Integrity Protection: enabled

Crashed Thread:        0  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_BAD_ACCESS (Code Signature Invalid)
Exception Codes:       0x0000000000000032, 0x00000001044c4060
Exception Note:        EXC_CORPSE_NOTIFY

Termination Reason:    Namespace CODESIGNING, Code 0x2

kernel messages:

VM Regions Near 0x1044c4060:
    __LINKEDIT             0000000104466000-00000001044c4000 [  376K] r--/rw- SM=COW  /Users/USER/*/*.dylib
--> VM_ALLOCATE            00000001044c4000-00000001044c5000 [    4K] r-x/rwx SM=ZER  
    VM_ALLOCATE            00000001044c5000-00000001044c6000 [    4K] rw-/rwx SM=ZER  

Application Specific Information:
crashed on child side of fork pre-exec

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   ???                           	0x00000001044c4060 0 + 4367073376
1   postgres                      	0x0000000103798851 ExecAgg + 1185 (executor.h:308)
2   postgres                      	0x0000000103785d0f standard_ExecutorRun + 287 (execMain.c:1652)
3   postgres                      	0x0000000103789c4e ParallelQueryMain + 670 (execParallel.c:1405)
4   postgres                      	0x00000001036520ce ParallelWorkerMain + 1054 (parallel.c:1434)
5   postgres                      	0x000000010385bec5 StartBackgroundWorker + 533 (bgworker.c:834)
6   postgres                      	0x000000010386acb9 maybe_start_bgworkers + 1161
7   postgres                      	0x00000001038696c5 sigusr1_handler + 357 (postmaster.c:5167)
8   libsystem_platform.dylib      	0x00007fff76195b5d _sigtramp + 29
9   ???                           	0x0000000000003200 0 + 12800
10  postgres                      	0x00000001037d54ae main + 1678
11  libdyld.dylib                 	0x00007fff75faa3d5 start + 1

Thread 0 crashed with X86 Thread State (64-bit):
  rax: 0x00000001044c4060  rbx: 0x00007f98b9045600  rcx: 0x00000001072c48d8  rdx: 0x00007ffeec6242f4
  rdi: 0x00007f98b9047520  rsi: 0x00007f98b9045fc0  rbp: 0x00007ffeec624320  rsp: 0x00007ffeec624288
   r8: 0x00000000014bafca   r9: 0xffffffff00000000  r10: 0x00000001072c48d0  r11: 0x0000000000000005
  r12: 0x0000000103c51220  r13: 0x00007f98b9047510  r14: 0x00007f98b9045fc0  r15: 0x00007f98b90459a0
  rip: 0x00000001044c4060  rfl: 0x0000000000010246  cr2: 0x00000001044c4060
  
Logical CPU:     0
Error Code:      0x00000015
Trap Number:     14


Binary Images:
      **lots of stuff**

External Modification Summary:
  Calls made by other processes targeting this process:
    task_for_pid: 0
    thread_create: 0
    thread_set_state: 0
  Calls made by this process:
    task_for_pid: 0
    thread_create: 0
    thread_set_state: 0
  Calls made by all processes on this machine:
    task_for_pid: 134420121
    thread_create: 0
    thread_set_state: 0

VM Region Summary:
ReadOnly portion of Libraries: Total=398.8M resident=0K(0%) swapped_out_or_unallocated=398.8M(100%)
Writable regions: Total=4.2G written=0K(0%) resident=0K(0%) swapped_out=0K(0%) unallocated=4.2G(100%)
 
                                VIRTUAL   REGION 
REGION TYPE                        SIZE    COUNT (non-coalesced) 
===========                     =======  ======= 
Kernel Alloc Once                    8K        1 
MALLOC                            81.0M       19 
MALLOC guard page                   16K        3 
MALLOC_LARGE (reserved)             72K        2         reserved VM address space (unallocated)
STACK GUARD                       56.0M        1 
Stack                             8192K        1 
VM_ALLOCATE                        4.1G        4 
__DATA                            18.0M      200 
__FONT_DATA                          4K        1 
__LINKEDIT                       232.2M       11 
__TEXT                           166.6M      199 
__UNICODE                          564K        1 
mapped file                         64K        1 
shared memory                      552K        6 
===========                     =======  ======= 
TOTAL                              4.7G      450 
TOTAL, minus reserved VM space     4.7G      450
I can find two maybe-interesting things in the log:
2020-02-18 19:01:52.044375+0100  localhost kernel: CODE SIGNING: process 51528[postgres]: rejecting invalid page at address 0x1100c1000 from offset 0x0 in file "" (cs_mtime:0.0 == mtime:0.0) (signed:0 validated:0 tainted:0 nx:0 wpmapped:1 dirty:0 depth:0)
2020-02-18 19:01:52.044805+0100  localhost ReportCrash: unknown nested kcdata type: 0x1004, size: 2108
2020-02-18 19:01:55.268060+0100  localhost ReportCrash: (CrashReporterSupport) Saved crash report for postgres version 0 to postgres_2020-02-18-190155_MacBook-Pro.crash
2020-02-18 19:01:55.273159+0100  localhost ReportCrash: (CrashReporterSupport) Removing excessive log: postgres_2020-02-18-190155_MacBook-Pro.crash
2020-02-18 19:01:55.274208+0100  localhost ReportCrash: shouldDisplayUnexpectedlyQuitNotification is NO
--- UPDATE: I ran the second exaple query (above) with LIMIT until it went from working (LIMIT of 2,200,000) to killing the connection to the server (LIMIT of 2,300,000). EXPLAIN ANALYZE on the LIMIT 2,200,000 query is:
Limit  (cost=459.93..96581.42 rows=2200000 width=16) (actual time=13.228..38573.440 rows=2200000 loops=1)
  ->  Seq Scan table1  (cost=459.93..5635583.33 rows=128975016 width=16) (actual time=13.227..38374.070 rows=2200000 loops=1)
        Filter: (NOT (hashed SubPlan 1))
        Rows Removed by Filter: 139729529
        SubPlan 1
          ->  Seq Scan on table2  (cost=0.00..423.14 rows=14714 width=16) (actual time=0.350..6.925 rows=14714 loops=1)
Planning Time: 0.138 ms
Execution Time: 38685.762 ms
EXPLAIN on the LIMIT 2,300,000 is: (EXPLAIN ANALYZE crashed here)
Limit  (cost=459.93..100950.58 rows=2300000 width=16)
  ->  Seq Scan on table1  (cost=459.93..5635583.33 rows=128975016 width=16)
        Filter: (NOT (hashed SubPlan 1))
        SubPlan 1
          ->  Seq Scan on table2  (cost=0.00..423.14 rows=14714 width=16)
JIT:
  Functions: 14
  Options: Inlining false, Optimization false, Expressions true, Deforming true
I'm reading this as JIT kicking in here, which makes sense given jit_above_cost is set (default) to 100,000. So perhaps the problem is with the JIT somehow? --- UPDATE 2: With jit=off the second example query works and is also ca. twice as fast. How can that be and what could cause JIT to be a problem on my system?
doeeehunt (51 rep)
Feb 18, 2020, 01:18 PM • Last activity: Feb 27, 2020, 03:31 PM
9 votes
1 answers
3187 views
How to tell if Postgres was built with the option to include LLVM technology for JIT?
[Postgres][1] 11 gained the [ability][2] to perform [Just-in-Time Compilation (JIT)][3] based on [LLVM][4] technology. The [manual notes][5] that this feature depends on Postgres having been built with the [`--with-llvm`][6] option. ➥ How can one verify their build of Postgres includes LLVM? Assumin...
Postgres 11 gained the ability to perform Just-in-Time Compilation (JIT) based on LLVM technology. The manual notes that this feature depends on Postgres having been built with the --with-llvm option. ➥ How can one verify their build of Postgres includes LLVM? Assuming LLVM is installed… In Postgres 11, JIT is disabled by default. In Postgres 12, the feature is enabled. That raises the question: ➥ How can one tell if JIT is currently enabled or disabled?
Basil Bourque (11188 rep)
Jul 25, 2019, 05:07 AM • Last activity: Jul 27, 2019, 03:39 AM
Showing page 1 of 5 total questions