PostgreSQL - Bulk COPY slower with unlogged tables than logged tables
We have an ETL process that starts with loading data into PostgreSQL 16.2 databases (Azure Database for PostgreSQL flexible server). We load 100,000 records at a time using COPY FROM STDIN. This is staging data that doesn't need to be recovered in case of a crash. Because of this, we decided to give unlogged tables a try to reduce costs and increase performance.
My issue is that, all else being equal, simply adding UNLOGGED to the CREATE TABLE statements increased load times by 30-35% in testing (from ~21 minutes to ~28 minutes in one instance, and from ~31 minutes to ~41 minutes in another). This doesn't make much sense given what I've read about unlogged tables. Indexes are only created after the data has been loaded into the tables. Overall disk IOPS and bandwidth consumption are both down when using UNLOGGED. My write throughput (bytes/sec) is much lower with unlogged tables, but I'm not sure whether that is simply because there are fewer writes in the first place when the WAL is not being hit.
Everything I find online simply talks about "use unlogged tables for faster writes", but there is essentially zero information about what could cause using unlogged tables to result in slower writes. Are there any recommendations for optimizing the performance of bulk loading into unlogged tables? I realize I haven't provided a ton of specifics, but given the lack of information on what can cause this decrease in performance, I don't really know what specific information would be helpful at this point.
Happy to provide any additional information that I can. Any guidance is greatly appreciated!
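For reference, the comparison described above can be sketched roughly as follows. This is a minimal illustration, not the actual ETL code: the table names, the two-column layout, the CSV format, and the use of psycopg 3 as the client driver are all assumptions made for the example. The only thing that differs between the two runs is the UNLOGGED keyword, and no indexes are created before the load.

```python
import csv
import io
import time


def create_table_sql(name: str, unlogged: bool) -> str:
    """Build the CREATE TABLE statement; only the UNLOGGED keyword differs."""
    kind = "UNLOGGED TABLE" if unlogged else "TABLE"
    # Hypothetical two-column staging layout, no indexes or constraints.
    return f"CREATE {kind} {name} (id bigint, payload text)"


def copy_sql(name: str) -> str:
    """Build the COPY ... FROM STDIN statement used for each batch."""
    return f"COPY {name} (id, payload) FROM STDIN WITH (FORMAT csv)"


def make_batch(start: int, size: int = 100_000) -> str:
    """Generate one batch of synthetic CSV rows (100,000 by default)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for i in range(start, start + size):
        writer.writerow([i, f"row-{i}"])
    return buf.getvalue()


def timed_load(conn, name: str, unlogged: bool, batches: int) -> float:
    """Load `batches` batches and return elapsed seconds.

    Assumes `conn` is a psycopg 3 connection; that driver choice is an
    assumption of this sketch, not something stated in the question.
    """
    with conn.cursor() as cur:
        cur.execute(f"DROP TABLE IF EXISTS {name}")
        cur.execute(create_table_sql(name, unlogged))
        started = time.perf_counter()
        for b in range(batches):
            # One COPY FROM STDIN per batch, mirroring the 100,000-row loads.
            with cur.copy(copy_sql(name)) as copy:
                copy.write(make_batch(b * 100_000))
        conn.commit()
        return time.perf_counter() - started
```

Calling `timed_load(conn, "staging_logged", False, n)` and `timed_load(conn, "staging_unlogged", True, n)` against the same server, with the same `n`, would give the two timings being compared.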
Asked by Jensenator
(1 rep)
Jun 3, 2024, 01:10 PM