Postgres server killed by OOM during pg_restore
I'm running Postgres 16.3 in a container with a 1 GB memory limit. When I run pg_restore with a dump file that's about 1 GB, the server gets killed by the OOM killer. My guess is that autovacuum is using too much memory.
Here are the logs from one time it happened:
2024-09-19 21:01:58.441 UTC LOG: checkpoint starting: wal
2024-09-19 21:02:14.444 UTC LOG: checkpoint complete: wrote 4643 buffers (28.3%); 0 WAL file(s) added, 0 removed, 33 recycled; write=15.710 s, sync=0.218 s, total=16.003 s; sync files=39, longest=0.077 s, average=0.006 s; distance=537593 kB, estimate=545819 kB; lsn=1D/1138CA88, redo lsn=1C/F303F258
2024-09-19 21:02:15.706 UTC LOG: checkpoints are occurring too frequently (17 seconds apart)
2024-09-19 21:02:15.706 UTC HINT: Consider increasing the configuration parameter "max_wal_size".
2024-09-19 21:02:15.707 UTC LOG: checkpoint starting: wal
2024-09-19 21:02:25.152 UTC LOG: server process (PID 234) was terminated by signal 9: Killed
2024-09-19 21:02:25.152 UTC DETAIL: Failed process was running: COPY public.auth_user (id, password, last_login, is_superuser, username, first_name, last_name, email, is_staff, is_active, date_joined) FROM stdin;
2024-09-19 21:02:25.152 UTC LOG: terminating any other active server processes
2024-09-19 21:02:25.210 UTC LOG: all server processes terminated; reinitializing
2024-09-19 21:02:25.236 UTC LOG: database system was interrupted; last known up at 2024-09-19 21:02:14 UTC
2024-09-19 21:02:25.797 UTC LOG: database system was not properly shut down; automatic recovery in progress
2024-09-19 21:02:25.801 UTC LOG: redo starts at 1C/F303F258
2024-09-19 21:02:34.673 UTC LOG: unexpected pageaddr 1C/D5C24000 in WAL segment 000000010000001D00000021, LSN 1D/21C24000, offset 12730368
2024-09-19 21:02:34.673 UTC LOG: redo done at 1D/21C23FA0 system usage: CPU: user: 5.42 s, system: 3.34 s, elapsed: 8.87 s
2024-09-19 21:02:35.248 UTC LOG: checkpoint starting: end-of-recovery immediate wait
2024-09-19 21:02:35.687 UTC LOG: checkpoint complete: wrote 15986 buffers (97.6%); 0 WAL file(s) added, 0 removed, 46 recycled; write=0.162 s, sync=0.218 s, total=0.441 s; sync files=45, longest=0.210 s, average=0.005 s; distance=765843 kB, estimate=765843 kB; lsn=1D/21C24048, redo lsn=1D/21C24048
2024-09-19 21:02:35.697 UTC LOG: database system is ready to accept connections
Does autovacuum appear to be what is causing the issue? Is there a way to disable or limit autovacuum during a restore so that it can finish without being killed?
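To see which process was actually killed, the "terminated by signal" line and the DETAIL line that follows it can be pulled out of the log. Below is a minimal Python sketch that assumes the log format shown above; the function name `find_killed_processes` and the regular expressions are my own illustration, not part of any Postgres tooling.

```python
import re

# Matches e.g. "... LOG: server process (PID 234) was terminated by signal 9: Killed"
KILLED_RE = re.compile(r"server process \(PID (\d+)\) was terminated by signal (\d+)")
# Matches the DETAIL line logged right after the termination message
DETAIL_RE = re.compile(r"DETAIL:\s+Failed process was running: (.*)")


def find_killed_processes(log_lines):
    """Return a list of (pid, signal, statement) for each terminated server process.

    statement is None when no DETAIL line follows the termination message.
    """
    results = []
    pending = None  # (pid, signal) still waiting for its DETAIL line
    for line in log_lines:
        killed = KILLED_RE.search(line)
        if killed:
            if pending is not None:
                results.append((pending[0], pending[1], None))
            pending = (int(killed.group(1)), int(killed.group(2)))
            continue
        detail = DETAIL_RE.search(line)
        if detail and pending is not None:
            results.append((pending[0], pending[1], detail.group(1).strip()))
            pending = None
    if pending is not None:
        results.append((pending[0], pending[1], None))
    return results


# The two relevant lines, copied from the log above
sample = [
    "2024-09-19 21:02:25.152 UTC LOG: server process (PID 234) was terminated by signal 9: Killed",
    "2024-09-19 21:02:25.152 UTC DETAIL: Failed process was running: COPY public.auth_user (id, password, last_login, is_superuser, username, first_name, last_name, email, is_staff, is_active, date_joined) FROM stdin;",
]

for pid, signal_number, statement in find_killed_processes(sample):
    print(pid, signal_number, statement)
```

For the log above, this reports PID 234, signal 9, and the `COPY public.auth_user ...` statement, so the process that was killed in this instance was the one running the COPY.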
Asked by Dave Johansen
(121 rep)
Sep 24, 2024, 09:02 PM
Last activity: Sep 26, 2024, 10:26 PM