I had a PostgreSQL database holding hundreds of thousands of scanned documents in a document bytea column of the table scans. It was large and inconvenient to back up. There was also some duplication among the documents, not major but around 5-10%.
I wanted to store those documents outside of the database so that they could be backed up via incremental tar, which would also reduce the size of the pg_dump database backup.
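To illustrate the general idea (this is only a rough sketch in Python, not the plperlu solution itself): if each document is written to a file named after a hash of its content, identical scans collapse into one file, and files never change once written, which suits incremental tar. The function names, the SHA-256 choice, and the two-character subdirectory layout are all assumptions made for the example.

```python
import hashlib
import os
import tempfile


def store_document(root: str, data: bytes) -> str:
    """Write data under root, keyed by its SHA-256 digest; return the digest.

    Hypothetical helper: the digest would be stored in the table
    in place of the bytea value.
    """
    digest = hashlib.sha256(data).hexdigest()
    # Spread files over subdirectories so no single directory grows huge.
    subdir = os.path.join(root, digest[:2])
    path = os.path.join(subdir, digest)
    if not os.path.exists(path):
        os.makedirs(subdir, exist_ok=True)
        # Write to a temporary file first, then rename, so a crash
        # never leaves a partially written document under its final name.
        fd, tmp = tempfile.mkstemp(dir=subdir)
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp, path)
    return digest


def load_document(root: str, digest: str) -> bytes:
    """Read back the document stored under the given digest."""
    with open(os.path.join(root, digest[:2], digest), "rb") as f:
        return f.read()
```

Storing the same bytes twice returns the same digest and leaves a single file on disk, so duplicates cost no extra space.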
I came up with a solution I want to share below, using plperlu. Any comments or further optimisation ideas will be appreciated!
Asked by Ezequiel Tolnay
(5028 rep)
Apr 7, 2016, 02:44 AM
Last activity: May 25, 2017, 04:05 AM