I am processing data tables with varying numbers of columns and rows, represented by JSON documents with one array per column. The format of a document is
{
"column_1": ["value_1_1", "value_1_2", ..., "value_1_n"],
"column_2": ["value_2_1", "value_2_2", ..., "value_2_n"],
...,
"column_m": ["value_m_1", "value_m_2", ..., "value_m_n"],
}
The number of columns m is typically in the lower tens, while the number of values n lies in the lower millions. Values are either small integers or short strings, and individual data tables stored as text files are 100-200 MB in size.
Documents are stored in a jsonb column in PostgreSQL:
Table "data"
************
Column | Type | Nullable
--------------+---------+----------
dat_id | integer | not null
dat_document | jsonb | not null
Documents are typically served "as is" to an application or with a simple filter, e.g.,
select col2, col7, col9
from (
    select jsonb_array_elements(dat_document->'column_1') as col1,
           jsonb_array_elements(dat_document->'column_2') as col2,
           jsonb_array_elements(dat_document->'column_5') as col7,
           jsonb_array_elements(dat_document->'column_9') as col9
    from data
    where dat_id = 20
) as sub
where sub.col1::text like '%@yahoo.com%';
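To compare execution times between the two types, one option (a sketch, reusing dat_id = 20 from the example above) is to run the query under EXPLAIN ANALYZE, which reports actual run time and buffer usage:

explain (analyze, buffers)
select jsonb_array_elements(dat_document->'column_1') as col1
from data
where dat_id = 20;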
The documents are stored as jsonb following recommendations from the [PostgreSQL documentation](https://www.postgresql.org/docs/current/datatype-json.html). However, when I use the simpler json type instead in the table *data*, I do not notice a significant difference in execution time for simple queries like the one above. On the other hand, the documents seem to take roughly 50% more disk space as jsonb than as json, according to pg_column_size on the same document.
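For reference, the size comparison can be reproduced along these lines (a sketch: casting the stored jsonb through text yields a json value of the same document; note that pg_column_size on a computed expression measures the uncompressed, un-TOASTed datum):

select pg_column_size(dat_document)             as jsonb_bytes,
       pg_column_size(dat_document::text::json) as json_bytes
from data
where dat_id = 20;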
Is there any advantage to storing my documents as jsonb instead of json in this case?
Asked by monomeric
(21 rep)
May 16, 2025, 01:09 PM
Last activity: May 19, 2025, 09:33 AM