My postgresql indexes are read slowly from disk
0
votes
1
answer
140
views
I have a database of sufficient size that it does not fit entirely in RAM, including indexes that also exceed RAM capacity. When performing queries, I observe significant differences in processing time depending on whether the index needs to be read from disk or is already loaded in RAM. I have confirmed using EXPLAIN ANALYZE that the issue stems from index scans. See for example :
* https://explain.dalibo.com/plan/2c85077gagh98a17 : very slow because some part of the index is "read" (from disk) and not "hit".
* https://explain.dalibo.com/plan/gfd20f8cadaa5261#plan/node/8 : instantaneous, when everything is in RAM.
I measured the speed of loading my index into RAM during a query, which is approximately 2 MB/s. However, my infrastructure theoretically supports disk read speeds of around 900 MB/s.
This issue appears related to the index itself rather than a disk read speed cap. For instance, when I execute 2 parallel queries on different tables, the disk read speed reaches 4 MB/s. Yet, when I execute 2 parallel queries on the same table, my disk read remains at 2 MB/s.
My question is : what can I change to reach an index reading speed from disk of 900 Mo/s ?
I am working within an Azure VM environment. If additional information is required, I am available to provide it.
Environment :
* P80 disks
* Postgresql 12
* VM E16s_v3
The partitioned table mentioned in the first plan more or less follows this DDL
-- public."F_TDLJ_HIST_1" definition
-- Drop table
-- DROP TABLE public."F_TDLJ_HIST_1";
CREATE TABLE public."F_TDLJ_HIST_1" (
"ID_TRAIN" int4 NOT NULL,
"ID_JOUR" int4 NOT NULL,
"ID_LEG" int4 NOT NULL,
"JX" int4 NOT NULL,
"RES" int4 NULL,
"REV" float8 NULL,
"CAPA" int4 NULL,
"OFFRE" int4 NULL,
CONSTRAINT "F_TDLJ_HIST_1_OLDP_pkey" PRIMARY KEY ("ID_TRAIN", "ID_JOUR", "ID_LEG", "JX")
)
PARTITION BY RANGE ("ID_JOUR");
CREATE INDEX "F_TDLJ_HIST_1_OLDP_ID_JOUR_JX_idx" ON ONLY public."F_TDLJ_HIST_1" USING btree ("ID_JOUR", "JX");
CREATE INDEX "F_TDLJ_HIST_1_OLDP_ID_JOUR_idx" ON ONLY public."F_TDLJ_HIST_1" USING btree ("ID_JOUR");
CREATE INDEX "F_TDLJ_HIST_1_OLDP_ID_LEG_idx" ON ONLY public."F_TDLJ_HIST_1" USING btree ("ID_LEG");
CREATE INDEX "F_TDLJ_HIST_1_OLDP_ID_TRAIN_idx" ON ONLY public."F_TDLJ_HIST_1" USING btree ("ID_TRAIN");
CREATE INDEX "F_TDLJ_HIST_1_OLDP_JX_idx" ON ONLY public."F_TDLJ_HIST_1" USING btree ("JX");
-- public."F_TDLJ_HIST_1" foreign keys
ALTER TABLE public."F_TDLJ_HIST_1" ADD CONSTRAINT "F_TDLJ_HIST_1_OLDP_ID_JOUR_fkey" FOREIGN KEY ("ID_JOUR") REFERENCES public."D_JOUR"("ID_JOUR");
ALTER TABLE public."F_TDLJ_HIST_1" ADD CONSTRAINT "F_TDLJ_HIST_1_OLDP_ID_LEG_fkey" FOREIGN KEY ("ID_LEG") REFERENCES public."D_OD"("ID_OD");
ALTER TABLE public."F_TDLJ_HIST_1" ADD CONSTRAINT "F_TDLJ_HIST_1_OLDP_ID_TRAIN_fkey" FOREIGN KEY ("ID_TRAIN") REFERENCES public."D_TRAIN"("ID_TRAIN");
ALTER TABLE public."F_TDLJ_HIST_1" ADD CONSTRAINT "F_TDLJ_HIST_1_OLDP_JX_fkey" FOREIGN KEY ("JX") REFERENCES public."D_JX"("JX");
The first explain analyze plan is the "whole plan", I directly targeted a specific partition, under conditions on columns "ID_TRAIN", "ID_JOUR", "JX"
Here is another query and plan. I am not directly selecting "ID_TRAIN", but the final index scan is the same:
**300K via index scan, query:**
set track_io_timing=TRUE;
EXPLAIN (ANALYZE, BUFFERS, SETTINGS)(
select "ID_TRAIN", "ID_JOUR", "JX", "ID_LEG", "RES", "REV" from "F_TDLJ_HIST" fth
inner join "D_TRAIN" using ("ID_TRAIN")
inner join "D_ENTNAT" using ("ID_ENTNAT")
where "ENTITY" = 'LOIREPARIS' and "ID_JOUR" between 4770 and 4820 and "JX" between -92 and 1
);
**300K via index scan, plan:**
Nested Loop (cost=2.63..18455.74 rows=44545 width=28) (actual time=51.034..645739.307 rows=304556 loops=1)
Buffers: shared hit=86297 read=215554
I/O Timings: read=642144.906
-> Nested Loop (cost=2.07..60.13 rows=35 width=4) (actual time=0.128..1.656 rows=272 loops=1)
Buffers: shared hit=71
-> Index Scan using "UX_ENTNAT" on "D_ENTNAT" (cost=0.27..2.49 rows=1 width=4) (actual time=0.051..0.054 rows=1 loops=1)
Index Cond: (("ENTITY")::text = 'LOIREPARIS'::text)
Buffers: shared hit=3
-> Bitmap Heap Scan on "D_TRAIN" (cost=1.80..57.11 rows=53 width=8) (actual time=0.073..1.356 rows=272 loops=1)
Recheck Cond: ("ID_ENTNAT" = "D_ENTNAT"."ID_ENTNAT")
Heap Blocks: exact=65
Buffers: shared hit=68
-> Bitmap Index Scan on "fki_D_TRAIN_ID_ENTNAT_fkey" (cost=0.00..1.78 rows=53 width=0) (actual time=0.037..0.037 rows=272 loops=1)
Index Cond: ("ID_ENTNAT" = "D_ENTNAT"."ID_ENTNAT")
Buffers: shared hit=3
-> Index Scan using "F_TDLJ_HIST_p4770_pkey" on "F_TDLJ_HIST_p4770" fth (cost=0.56..436.98 rows=8861 width=28) (actual time=3.560..2373.034 rows=1120 loops=272)
Index Cond: (("ID_TRAIN" = "D_TRAIN"."ID_TRAIN") AND ("ID_JOUR" >= 4770) AND ("ID_JOUR" = '-92'::integer) AND ("JX" Hash Join (cost=950.71..6668739.11 rows=15881782 width=32) (actual time=13.909..760404.410 rows=17267213 loops=1)
Hash Cond: (fth."ID_TRAIN" = "D_TRAIN"."ID_TRAIN")
Buffers: shared hit=252180 read=5575770
I/O Timings: read=730384.999
-> Seq Scan on "F_TDLJ_HIST_1_OLDP_p4770" fth (cost=0.00..6626086.32 rows=15881782 width=28) (actual time=6.084..754077.855 rows=17267213 loops=1)
Filter: (("ID_JOUR" >= 4770) AND ("ID_JOUR" = '-92'::integer) AND ("JX" Hash (cost=772.54..772.54 rows=14254 width=8) (actual time=7.688..7.690 rows=14254 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 685kB
Buffers: shared hit=630
-> Seq Scan on "D_TRAIN" (cost=0.00..772.54 rows=14254 width=8) (actual time=0.069..5.492 rows=14254 loops=1)
Buffers: shared hit=630
-> Hash (cost=19.59..19.59 rows=408 width=4) (actual time=229.387..229.388 rows=408 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 23kB
Buffers: shared hit=12
-> Index Only Scan using "D_ENTNAT_pkey1" on "D_ENTNAT" (cost=0.27..19.59 rows=408 width=4) (actual time=0.019..0.190 rows=408 loops=1)
Heap Fetches: 0
Buffers: shared hit=12
Settings: effective_cache_size = '96GB', effective_io_concurrency = '200', max_parallel_workers_per_gather = '4', random_page_cost = '1.1', search_path = 'public', work_mem = '64MB'
Planning Time: 3.351 ms
JIT:
Functions: 17
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 1.732 ms, Inlining 62.117 ms, Optimization 97.389 ms, Emission 69.264 ms, Total 230.502 ms
Execution Time: 765721.071 ms
**20M via seq scan, query:**
set track_io_timing=TRUE;
EXPLAIN (ANALYZE, BUFFERS, SETTINGS)(
select "ID_TRAIN", "ID_JOUR", "JX", "ID_LEG", "RES", "REV" from "F_TDLJ_HIST_1" fth
inner join "D_TRAIN" using ("ID_TRAIN")
inner join "D_ENTNAT" using ("ID_ENTNAT")
where "ID_JOUR" between 4770 and 4820 and "JX" between -92 and 1
);
**20M via seq scan, plan:**
Hash Join (cost=975.41..6710886.53 rows=15881782 width=28) (actual time=243.368..764675.784 rows=17267213 loops=1)
Hash Cond: ("D_TRAIN"."ID_ENTNAT" = "D_ENTNAT"."ID_ENTNAT")
Buffers: shared hit=252195 read=5575770
I/O Timings: read=730384.999
-> Hash Join (cost=950.71..6668739.11 rows=15881782 width=32) (actual time=13.909..760404.410 rows=17267213 loops=1)
Hash Cond: (fth."ID_TRAIN" = "D_TRAIN"."ID_TRAIN")
Buffers: shared hit=252180 read=5575770
I/O Timings: read=730384.999
-> Seq Scan on "F_TDLJ_HIST_1_OLDP_p4770" fth (cost=0.00..6626086.32 rows=15881782 width=28) (actual time=6.084..754077.855 rows=17267213 loops=1)
Filter: (("ID_JOUR" >= 4770) AND ("ID_JOUR" = '-92'::integer) AND ("JX" Hash (cost=772.54..772.54 rows=14254 width=8) (actual time=7.688..7.690 rows=14254 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 685kB
Buffers: shared hit=630
-> Seq Scan on "D_TRAIN" (cost=0.00..772.54 rows=14254 width=8) (actual time=0.069..5.492 rows=14254 loops=1)
Buffers: shared hit=630
-> Hash (cost=19.59..19.59 rows=408 width=4) (actual time=229.387..229.388 rows=408 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 23kB
Buffers: shared hit=12
-> Index Only Scan using "D_ENTNAT_pkey1" on "D_ENTNAT" (cost=0.27..19.59 rows=408 width=4) (actual time=0.019..0.190 rows=408 loops=1)
Heap Fetches: 0
Buffers: shared hit=12
Settings: effective_cache_size = '96GB', effective_io_concurrency = '200', max_parallel_workers_per_gather = '4', random_page_cost = '1.1', search_path = 'public', work_mem = '64MB'
Planning Time: 3.351 ms
JIT:
Functions: 17
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 1.732 ms, Inlining 62.117 ms, Optimization 97.389 ms, Emission 69.264 ms, Total 230.502 ms
Execution Time: 765721.071 ms
Asked by Doe Jowns
(141 rep)
Jul 3, 2024, 01:39 PM
Last activity: Aug 7, 2025, 02:07 PM
Last activity: Aug 7, 2025, 02:07 PM