Sample Header Ad - 728x90

How to make PostgreSQL take advantage of SSD RAID?

2 votes
0 answers
60 views
I'm using a PG13 server as a feature DB. There is no else job on the server but this PG instance and a GPU based machine learning process. There is no online transaction, and the data lossing or incorrect are not trouble, I just need it run faster. Few days ago, with a lot of gentlemen's help in my another question , I have corrected some mis-configurations of my PG server. But the throughput of the PG server delivering data still can **NOT** satisfy the reuqiement of the ML process. I need the PG server to produce data as fast as possible, or in other word, to load data from disk more aggressively. There are two physical CPU(as two numa nodes), and each of them has 128 real cores(I disabled Super Threads in BIOS setup), 128GB memory(I splited 64GB dedicated for PG). Since whole of CPU1 has been assigned dedicated for PG use, PG is configed with 128 parallel workers. The result is **NOT** as my expecting, the throughput still lower. The bandwidth of the SSD RAID is 10GB/s, but the actual reading/writing rate of PG is 100~200MB/s, only like a SATA hard disk. I don't believe it has only so poor performance, because I did some experiments/observing as fowllowing: 1. Directly read/write raw file by means of dd command, and count the io rate. I even writed a C++ program that starts multi-threads directly dump/load data to/from files on the RAID. The io rate can easily continuously exceed 10GB/s viewed by iotop command. So I think that the configurations of the hardware, driver soft and the OS(Debian12) should be ok; 2. By watching the output of top/numastat, I found that most PG workers mearly use very less of CPU/mem resources, even though there are 128 hungry clients awaiting data. If they(PG workers) are waiting for io, the RAID should be very busy. But I neven observed the io rate exceed 300MB/s when PG working. If they(PG workers) are doing some DB operations such as sorting/filtering/aggregating, should I see a higher usage percentage of CPU1? Neither io rate nor CPU usage are high, what are they doing when 128 parallel clients are running and constantly doing a simple query(with different filter arguments)? 3. I have tried to limit all processes of PG to run at a same numa node by means of numactl --cpunodebind 1 --membind 1 -- ..., in order to prevent "CPU cache ping-pong" occursing between numa nodes. I also limited the client processes run only at CPU0, to prevent them from contention CPU1. But the throughput didn't improve. 4. I tried different values of option effective_io_concurrency in postgresql.conf, such as 1, 256, 512, even 1000, but the results are similar. How to make my PG server really busy up? Forgive my ugly English, I hope I expressed things correctly. Thanks!
Asked by Leon (411 rep)
Mar 26, 2025, 09:11 AM
Last activity: Mar 26, 2025, 09:46 AM