SSD used for tablespace died. How can I recover data?
0
votes
1
answer
287
views
I have a postgres cluster with two databases
- Database A uses the default tablespace. It holds important information but has very few writes per day (< 20) and only has a few tables with a few thousand rows of data.
- Database B is on its own tablespace on a separate SSD. It has hundreds of GBs of data and adds millions of rows per day. The data is for analytics and is not important.
Recently the SSD holding Database B's tablespace died. Postgres will no longer start up. My priority is dumping the data from Database A.
I was thinking that because Database A has few writes or deletes per day, it would be fairly safe to run
pg_resetwal
, then dump Database A. After dumping Database A I would re-install postgres and load the dumped data from Database A.
Is there an alternative method to recover my data. Are there any obvious problems with my plan?
(I know that pg_barman should be used to prevent these problems, but my client has refused when I suggested pg_barman after a similar failure in the past. And, yes, RAID would obviously be better than a single SSD, but I don't get to make the hardware decisions)
Error log when attempting to start postgres:
-03-13 16:20:16.676 PDT LOG: listening on IPv6 address "::1", port 5432
2020-03-13 16:20:16.678 PDT LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2020-03-13 16:20:16.686 PDT LOG: could not open directory "pg_tblspc/16394/PG_11_201809051": No such file or directory
2020-03-13 16:20:16.688 PDT LOG: database system was interrupted while in recovery at 2020-03-13 16:19:29 PDT
2020-03-13 16:20:16.688 PDT HINT: This probably means that some data is corrupted and you will have to use the last backup for recovery.
2020-03-13 16:20:17.016 PDT LOG: could not stat file "pg_tblspc/16394": No such file or directory
2020-03-13 16:20:17.134 PDT LOG: could not open directory "pg_tblspc/16394/PG_11_201809051": No such file or directory
2020-03-13 16:20:17.135 PDT LOG: database system was not properly shut down; automatic recovery in progress
2020-03-13 16:20:17.135 PDT LOG: could not open directory "pg_tblspc/16394/PG_11_201809051": No such file or directory
2020-03-13 16:20:17.136 PDT LOG: redo starts at 7C1/93EB0EA0
2020-03-13 16:20:17.136 PDT FATAL: could not create directory "pg_tblspc/16394/PG_11_201809051": No such file or directory
2020-03-13 16:20:17.136 PDT CONTEXT: WAL redo at 7C1/93EB0EA0 for Sequence/LOG: rel 16394/26819/26877
2020-03-13 16:20:17.136 PDT LOG: startup process (PID 57190) exited with exit code 1
2020-03-13 16:20:17.136 PDT LOG: aborting startup due to startup process failure
2020-03-13 16:20:17.139 PDT LOG: database system is shut down
Posgresql version is 11.2
Asked by sheepdog
(345 rep)
Mar 14, 2020, 12:30 AM
Last activity: Mar 14, 2020, 02:42 AM
Last activity: Mar 14, 2020, 02:42 AM