Sample Header Ad - 728x90

My postgresql indexes are read slowly from disk

0 votes
1 answer
140 views
I have a database of sufficient size that it does not fit entirely in RAM, including indexes that also exceed RAM capacity. When performing queries, I observe significant differences in processing time depending on whether the index needs to be read from disk or is already loaded in RAM. I have confirmed using EXPLAIN ANALYZE that the issue stems from index scans. See for example : * https://explain.dalibo.com/plan/2c85077gagh98a17 : very slow because some part of the index is "read" (from disk) and not "hit". * https://explain.dalibo.com/plan/gfd20f8cadaa5261#plan/node/8 : instantaneous, when everything is in RAM. I measured the speed of loading my index into RAM during a query, which is approximately 2 MB/s. However, my infrastructure theoretically supports disk read speeds of around 900 MB/s. This issue appears related to the index itself rather than a disk read speed cap. For instance, when I execute 2 parallel queries on different tables, the disk read speed reaches 4 MB/s. Yet, when I execute 2 parallel queries on the same table, my disk read remains at 2 MB/s. My question is : what can I change to reach an index reading speed from disk of 900 Mo/s ? I am working within an Azure VM environment. If additional information is required, I am available to provide it. Environment : * P80 disks * Postgresql 12 * VM E16s_v3 The partitioned table mentioned in the first plan more or less follows this DDL
-- public."F_TDLJ_HIST_1" definition

-- Drop table

-- DROP TABLE public."F_TDLJ_HIST_1";

CREATE TABLE public."F_TDLJ_HIST_1" (
	"ID_TRAIN" int4 NOT NULL,
	"ID_JOUR" int4 NOT NULL,
	"ID_LEG" int4 NOT NULL,
	"JX" int4 NOT NULL,
	"RES" int4 NULL,
	"REV" float8 NULL,
	"CAPA" int4 NULL,
	"OFFRE" int4 NULL,
	CONSTRAINT "F_TDLJ_HIST_1_OLDP_pkey" PRIMARY KEY ("ID_TRAIN", "ID_JOUR", "ID_LEG", "JX")
)
PARTITION BY RANGE ("ID_JOUR");
CREATE INDEX "F_TDLJ_HIST_1_OLDP_ID_JOUR_JX_idx" ON ONLY public."F_TDLJ_HIST_1" USING btree ("ID_JOUR", "JX");
CREATE INDEX "F_TDLJ_HIST_1_OLDP_ID_JOUR_idx" ON ONLY public."F_TDLJ_HIST_1" USING btree ("ID_JOUR");
CREATE INDEX "F_TDLJ_HIST_1_OLDP_ID_LEG_idx" ON ONLY public."F_TDLJ_HIST_1" USING btree ("ID_LEG");
CREATE INDEX "F_TDLJ_HIST_1_OLDP_ID_TRAIN_idx" ON ONLY public."F_TDLJ_HIST_1" USING btree ("ID_TRAIN");
CREATE INDEX "F_TDLJ_HIST_1_OLDP_JX_idx" ON ONLY public."F_TDLJ_HIST_1" USING btree ("JX");


-- public."F_TDLJ_HIST_1" foreign keys

ALTER TABLE public."F_TDLJ_HIST_1" ADD CONSTRAINT "F_TDLJ_HIST_1_OLDP_ID_JOUR_fkey" FOREIGN KEY ("ID_JOUR") REFERENCES public."D_JOUR"("ID_JOUR");
ALTER TABLE public."F_TDLJ_HIST_1" ADD CONSTRAINT "F_TDLJ_HIST_1_OLDP_ID_LEG_fkey" FOREIGN KEY ("ID_LEG") REFERENCES public."D_OD"("ID_OD");
ALTER TABLE public."F_TDLJ_HIST_1" ADD CONSTRAINT "F_TDLJ_HIST_1_OLDP_ID_TRAIN_fkey" FOREIGN KEY ("ID_TRAIN") REFERENCES public."D_TRAIN"("ID_TRAIN");
ALTER TABLE public."F_TDLJ_HIST_1" ADD CONSTRAINT "F_TDLJ_HIST_1_OLDP_JX_fkey" FOREIGN KEY ("JX") REFERENCES public."D_JX"("JX");
The first explain analyze plan is the "whole plan", I directly targeted a specific partition, under conditions on columns "ID_TRAIN", "ID_JOUR", "JX" Here is another query and plan. I am not directly selecting "ID_TRAIN", but the final index scan is the same: **300K via index scan, query:**
set track_io_timing=TRUE;
EXPLAIN (ANALYZE, BUFFERS, SETTINGS)(
    select "ID_TRAIN", "ID_JOUR", "JX", "ID_LEG", "RES", "REV" from "F_TDLJ_HIST" fth
    inner join "D_TRAIN" using ("ID_TRAIN")
    inner join "D_ENTNAT" using ("ID_ENTNAT")
    where "ENTITY" = 'LOIREPARIS' and "ID_JOUR" between 4770 and 4820 and "JX" between -92 and 1
);
**300K via index scan, plan:**
Nested Loop  (cost=2.63..18455.74 rows=44545 width=28) (actual time=51.034..645739.307 rows=304556 loops=1)
  Buffers: shared hit=86297 read=215554
  I/O Timings: read=642144.906
  ->  Nested Loop  (cost=2.07..60.13 rows=35 width=4) (actual time=0.128..1.656 rows=272 loops=1)
        Buffers: shared hit=71
        ->  Index Scan using "UX_ENTNAT" on "D_ENTNAT"  (cost=0.27..2.49 rows=1 width=4) (actual time=0.051..0.054 rows=1 loops=1)
              Index Cond: (("ENTITY")::text = 'LOIREPARIS'::text)
              Buffers: shared hit=3
        ->  Bitmap Heap Scan on "D_TRAIN"  (cost=1.80..57.11 rows=53 width=8) (actual time=0.073..1.356 rows=272 loops=1)
              Recheck Cond: ("ID_ENTNAT" = "D_ENTNAT"."ID_ENTNAT")
              Heap Blocks: exact=65
              Buffers: shared hit=68
              ->  Bitmap Index Scan on "fki_D_TRAIN_ID_ENTNAT_fkey"  (cost=0.00..1.78 rows=53 width=0) (actual time=0.037..0.037 rows=272 loops=1)
                    Index Cond: ("ID_ENTNAT" = "D_ENTNAT"."ID_ENTNAT")
                    Buffers: shared hit=3
  ->  Index Scan using "F_TDLJ_HIST_p4770_pkey" on "F_TDLJ_HIST_p4770" fth  (cost=0.56..436.98 rows=8861 width=28) (actual time=3.560..2373.034 rows=1120 loops=272)
        Index Cond: (("ID_TRAIN" = "D_TRAIN"."ID_TRAIN") AND ("ID_JOUR" >= 4770) AND ("ID_JOUR" = '-92'::integer) AND ("JX"   Hash Join  (cost=950.71..6668739.11 rows=15881782 width=32) (actual time=13.909..760404.410 rows=17267213 loops=1)
        Hash Cond: (fth."ID_TRAIN" = "D_TRAIN"."ID_TRAIN")
        Buffers: shared hit=252180 read=5575770
        I/O Timings: read=730384.999
        ->  Seq Scan on "F_TDLJ_HIST_1_OLDP_p4770" fth  (cost=0.00..6626086.32 rows=15881782 width=28) (actual time=6.084..754077.855 rows=17267213 loops=1)
              Filter: (("ID_JOUR" >= 4770) AND ("ID_JOUR" = '-92'::integer) AND ("JX"   Hash  (cost=772.54..772.54 rows=14254 width=8) (actual time=7.688..7.690 rows=14254 loops=1)
              Buckets: 16384  Batches: 1  Memory Usage: 685kB
              Buffers: shared hit=630
              ->  Seq Scan on "D_TRAIN"  (cost=0.00..772.54 rows=14254 width=8) (actual time=0.069..5.492 rows=14254 loops=1)
                    Buffers: shared hit=630
  ->  Hash  (cost=19.59..19.59 rows=408 width=4) (actual time=229.387..229.388 rows=408 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 23kB
        Buffers: shared hit=12
        ->  Index Only Scan using "D_ENTNAT_pkey1" on "D_ENTNAT"  (cost=0.27..19.59 rows=408 width=4) (actual time=0.019..0.190 rows=408 loops=1)
              Heap Fetches: 0
              Buffers: shared hit=12
Settings: effective_cache_size = '96GB', effective_io_concurrency = '200', max_parallel_workers_per_gather = '4', random_page_cost = '1.1', search_path = 'public', work_mem = '64MB'
Planning Time: 3.351 ms
JIT:
  Functions: 17
  Options: Inlining true, Optimization true, Expressions true, Deforming true
  Timing: Generation 1.732 ms, Inlining 62.117 ms, Optimization 97.389 ms, Emission 69.264 ms, Total 230.502 ms
Execution Time: 765721.071 ms
**20M via seq scan, query:**
set track_io_timing=TRUE; 
EXPLAIN (ANALYZE, BUFFERS, SETTINGS)(
    select "ID_TRAIN", "ID_JOUR", "JX", "ID_LEG", "RES", "REV" from "F_TDLJ_HIST_1" fth
    inner join "D_TRAIN" using ("ID_TRAIN")
    inner join "D_ENTNAT" using ("ID_ENTNAT")
    where "ID_JOUR" between 4770 and 4820 and "JX" between -92 and 1
);
**20M via seq scan, plan:**
Hash Join  (cost=975.41..6710886.53 rows=15881782 width=28) (actual time=243.368..764675.784 rows=17267213 loops=1)
  Hash Cond: ("D_TRAIN"."ID_ENTNAT" = "D_ENTNAT"."ID_ENTNAT")
  Buffers: shared hit=252195 read=5575770
  I/O Timings: read=730384.999
  ->  Hash Join  (cost=950.71..6668739.11 rows=15881782 width=32) (actual time=13.909..760404.410 rows=17267213 loops=1)
        Hash Cond: (fth."ID_TRAIN" = "D_TRAIN"."ID_TRAIN")
        Buffers: shared hit=252180 read=5575770
        I/O Timings: read=730384.999
        ->  Seq Scan on "F_TDLJ_HIST_1_OLDP_p4770" fth  (cost=0.00..6626086.32 rows=15881782 width=28) (actual time=6.084..754077.855 rows=17267213 loops=1)
              Filter: (("ID_JOUR" >= 4770) AND ("ID_JOUR" = '-92'::integer) AND ("JX"   Hash  (cost=772.54..772.54 rows=14254 width=8) (actual time=7.688..7.690 rows=14254 loops=1)
              Buckets: 16384  Batches: 1  Memory Usage: 685kB
              Buffers: shared hit=630
              ->  Seq Scan on "D_TRAIN"  (cost=0.00..772.54 rows=14254 width=8) (actual time=0.069..5.492 rows=14254 loops=1)
                    Buffers: shared hit=630
  ->  Hash  (cost=19.59..19.59 rows=408 width=4) (actual time=229.387..229.388 rows=408 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 23kB
        Buffers: shared hit=12
        ->  Index Only Scan using "D_ENTNAT_pkey1" on "D_ENTNAT"  (cost=0.27..19.59 rows=408 width=4) (actual time=0.019..0.190 rows=408 loops=1)
              Heap Fetches: 0
              Buffers: shared hit=12
Settings: effective_cache_size = '96GB', effective_io_concurrency = '200', max_parallel_workers_per_gather = '4', random_page_cost = '1.1', search_path = 'public', work_mem = '64MB'
Planning Time: 3.351 ms
JIT:
  Functions: 17
  Options: Inlining true, Optimization true, Expressions true, Deforming true
  Timing: Generation 1.732 ms, Inlining 62.117 ms, Optimization 97.389 ms, Emission 69.264 ms, Total 230.502 ms
Execution Time: 765721.071 ms
Asked by Doe Jowns (141 rep)
Jul 3, 2024, 01:39 PM
Last activity: Aug 7, 2025, 02:07 PM