pgbench - 20-30% variation in benchmark results (non-repeatable benchmarks)
## Problem
I'm trying to use `pgbench` to help me tune Postgres for my hardware, but I'm seeing very strange behaviour: I cannot get a stable TPS number across consecutive runs of `pgbench`. Since I was running `pgbench` for 60 seconds, I _assumed_ that this was because of checkpointing and auto-vacuuming, so I made the following changes to my config:
- autovacuum=off
- max_wal_size=5GB
But this led to even more wildly varying numbers! For example, here's the output of the exact same `pgbench` command run twice in a row:
**Output 1**
starting vacuum...end.
progress: 5.0 s, 566.0 tps, lat 10.577 ms stddev 2.788
progress: 10.0 s, 513.0 tps, lat 11.689 ms stddev 2.907
progress: 15.0 s, 513.8 tps, lat 11.680 ms stddev 2.995
progress: 20.0 s, 519.6 tps, lat 11.546 ms stddev 2.969
progress: 25.0 s, 518.4 tps, lat 11.576 ms stddev 2.929
progress: 30.0 s, 518.2 tps, lat 11.576 ms stddev 2.978
progress: 35.0 s, 522.8 tps, lat 11.472 ms stddev 2.966
progress: 40.0 s, 521.0 tps, lat 11.516 ms stddev 2.962
progress: 45.0 s, 521.2 tps, lat 11.510 ms stddev 2.909
progress: 50.0 s, 581.6 tps, lat 10.313 ms stddev 2.636
progress: 55.0 s, 520.8 tps, lat 11.526 ms stddev 2.919
progress: 60.0 s, 522.2 tps, lat 11.494 ms stddev 2.927
transaction type:
scaling factor: 2000
query mode: simple
number of clients: 6
number of threads: 6
duration: 60 s
number of transactions actually processed: 31699
latency average = 11.357 ms
latency stddev = 2.938 ms
tps = 528.185674 (including connections establishing)
tps = 528.269291 (excluding connections establishing)
**Output 2**
starting vacuum...end.
progress: 5.0 s, 528.4 tps, lat 11.318 ms stddev 2.940
progress: 10.0 s, 526.0 tps, lat 11.418 ms stddev 2.884
progress: 15.0 s, 522.8 tps, lat 11.473 ms stddev 2.892
progress: 20.0 s, 525.6 tps, lat 11.409 ms stddev 3.008
progress: 25.0 s, 528.0 tps, lat 11.366 ms stddev 2.858
progress: 30.0 s, 525.6 tps, lat 11.412 ms stddev 2.893
progress: 35.0 s, 521.8 tps, lat 11.503 ms stddev 2.973
progress: 40.0 s, 524.4 tps, lat 11.439 ms stddev 2.966
progress: 45.0 s, 736.6 tps, lat 8.152 ms stddev 3.801
progress: 50.0 s, 1101.2 tps, lat 5.447 ms stddev 0.738
progress: 55.0 s, 1012.2 tps, lat 5.929 ms stddev 0.609
progress: 60.0 s, 723.4 tps, lat 8.285 ms stddev 2.969
transaction type:
scaling factor: 2000
query mode: simple
number of clients: 6
number of threads: 6
duration: 60 s
number of transactions actually processed: 38886
latency average = 9.257 ms
latency stddev = 3.629 ms
tps = 647.993705 (including connections establishing)
tps = 648.099359 (excluding connections establishing)
That's a difference of more than 20% in TPS (528 vs. 648) for the exact same configuration!
What am I missing here?
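One way to see where the two runs diverge is to look at the per-interval numbers instead of only the final average. The following is a minimal Python sketch, not part of the original setup: the function names `interval_tps` and `summarize` are made up for illustration, and the input is simply the `progress:` lines of Output 2 pasted in as a string.

```python
import re
import statistics

# Matches lines such as "progress: 5.0 s, 528.4 tps, lat 11.318 ms stddev 2.940"
PROGRESS_RE = re.compile(r"progress: ([\d.]+) s, ([\d.]+) tps")


def interval_tps(output: str) -> list[float]:
    """Return the TPS value from each 'progress:' line of pgbench output."""
    return [float(m.group(2)) for m in PROGRESS_RE.finditer(output)]


def summarize(tps: list[float]) -> dict[str, float]:
    """Summarize the per-interval TPS values of one run."""
    mean = statistics.mean(tps)
    return {
        "min": min(tps),
        "max": max(tps),
        "mean": mean,
        # Coefficient of variation: sample stddev relative to the mean.
        "cv": statistics.stdev(tps) / mean,
    }


# The progress lines of Output 2, copied from the question.
OUTPUT_2 = """\
progress: 5.0 s, 528.4 tps, lat 11.318 ms stddev 2.940
progress: 10.0 s, 526.0 tps, lat 11.418 ms stddev 2.884
progress: 15.0 s, 522.8 tps, lat 11.473 ms stddev 2.892
progress: 20.0 s, 525.6 tps, lat 11.409 ms stddev 3.008
progress: 25.0 s, 528.0 tps, lat 11.366 ms stddev 2.858
progress: 30.0 s, 525.6 tps, lat 11.412 ms stddev 2.893
progress: 35.0 s, 521.8 tps, lat 11.503 ms stddev 2.973
progress: 40.0 s, 524.4 tps, lat 11.439 ms stddev 2.966
progress: 45.0 s, 736.6 tps, lat 8.152 ms stddev 3.801
progress: 50.0 s, 1101.2 tps, lat 5.447 ms stddev 0.738
progress: 55.0 s, 1012.2 tps, lat 5.929 ms stddev 0.609
progress: 60.0 s, 723.4 tps, lat 8.285 ms stddev 2.969
"""

summary = summarize(interval_tps(OUTPUT_2))
print(summary)
```

For Output 2 the first eight intervals sit at roughly 520-530 TPS, much like Output 1, so essentially all of the difference between the two runs comes from the last 20 seconds.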
## Hardware setup
- postgres server: 32 GB RAM / 6-core (12 thread) / SSD with RAID1
- pgbench server: 32 GB RAM / 4-core (8 thread) / SSD
## Relevant Postgres config for the above output
max_connections=100
work_mem=4MB
maintenance_work_mem=64MB
shared_buffers=12288MB
temp_buffers=8MB
effective_cache_size=16GB
wal_buffers=-1
wal_sync_method=fsync
max_wal_size=5GB
autovacuum=off
## pgbench settings
**Initialisation**
    # --scale=2000 results in approx 30-32GB of data
    pgbench \
      --initialize \
      --init-steps=dtgpf \
      --scale=2000 \
      --username=benchmarking
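As a rough sizing check, the figures quoted in this question can be put side by side. This is only back-of-the-envelope arithmetic in Python: the 100,000 rows per scale unit is the documented size of `pgbench_accounts`, the 30 GB figure is the lower end of the estimate above, and the variable names are illustrative.

```python
# pgbench creates 100,000 pgbench_accounts rows per unit of --scale.
ACCOUNTS_PER_SCALE_UNIT = 100_000

scale = 2000
dataset_gb = 30                    # lower end of the "approx 30-32GB" estimate
shared_buffers_gb = 12288 / 1024   # shared_buffers=12288MB
ram_gb = 32                        # RAM of the postgres server

accounts_rows = scale * ACCOUNTS_PER_SCALE_UNIT
buffer_fraction = shared_buffers_gb / dataset_gb

print(f"pgbench_accounts rows: {accounts_rows:,}")
print(f"shared_buffers covers at most {buffer_fraction:.0%} of the data")
print(f"dataset size relative to RAM: {dataset_gb / ram_gb:.0%}")
```

With these numbers the data set is about as large as the server's RAM and well over twice the size of `shared_buffers`, so how much of it happens to be cached at any given moment may differ from run to run.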
**Benchmarking**
    pgbench \
      --builtin=tpcb-like \
      --client=6 \
      --jobs=6 \
      --time=60 \
      --progress=5 \
      --username=benchmarking
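To put a number on the run-to-run variation, the same command could be repeated several times and the final TPS figures compared. The sketch below is an assumption-laden illustration, not something from the original setup: the helper names `final_tps`, `run_once` and `run_many` are hypothetical, the flags are exactly those shown above, and the regular expression assumes the summary wording seen in the outputs in this question ("excluding connections establishing"), which may differ in other pgbench versions.

```python
import re
import statistics
import subprocess

# The benchmarking command from above, with the same flags.
PGBENCH_CMD = [
    "pgbench",
    "--builtin=tpcb-like",
    "--client=6",
    "--jobs=6",
    "--time=60",
    "--progress=5",
    "--username=benchmarking",
]

# Assumes the summary line format shown in the outputs in this question.
TPS_RE = re.compile(r"tps = ([\d.]+) \(excluding connections establishing\)")


def final_tps(output: str) -> float:
    """Extract the final TPS figure from pgbench's summary output."""
    match = TPS_RE.search(output)
    if match is None:
        raise ValueError("no final tps line found in pgbench output")
    return float(match.group(1))


def run_once() -> float:
    """Run pgbench once and return its final TPS."""
    result = subprocess.run(
        PGBENCH_CMD, capture_output=True, text=True, check=True
    )
    return final_tps(result.stdout)


def run_many(runs: int = 5) -> tuple[float, float]:
    """Run pgbench `runs` times (at least 2); return mean and sample stddev."""
    samples = [run_once() for _ in range(runs)]
    return statistics.mean(samples), statistics.stdev(samples)
```

Calling `run_many(5)` would take about five minutes with the 60-second duration used here and returns the mean and standard deviation of the five TPS results, which is a more stable basis for comparing configurations than any single run.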
## Network connectivity between the two servers
# iperf -t 60 -c [IP-ADDRESS]
------------------------------------------------------------
Client connecting to [IP-ADDRESS], TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[ 3] local [IP-ADDRESS] port 40494 connected with [IP-ADDRESS] port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-60.0 sec 3.47 GBytes 496 Mbits/sec
Asked by Saurabh Nanda
(333 rep)
Jan 22, 2019, 07:03 PM
Last activity: Sep 22, 2022, 05:46 AM