How to deal with large offsets in select?
5 votes, 2 answers, 8683 views
Table `jtest` has 200k rows; each row contains a jsonb value of the form `{ "id": "<uuid>", "key": <n> }`, where `key` is an integer from 1 to 200k, incremented per row. There is also a btree index on `data->'key'`.
create extension if not exists pgcrypto;
create table jtest (data jsonb not null default '{}');
insert into jtest (data)
select json_build_object('id', gen_random_uuid(), 'key', i)::jsonb
FROM generate_series(1,200000) i;
create index jtest_key on jtest ((data->'key'));
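To make the row shape concrete (this check is not part of the original post, and the UUID shown is only a placeholder), selecting a single row returns JSON matching the description above:

-- Peek at one row to confirm the shape of the stored jsonb
-- (the actual UUID differs for every row).
select data
from jtest
order by data->'key'
limit 1;
-- {"id": "<some generated uuid>", "key": 1}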
First query (fast):
EXPLAIN ANALYZE
select j.data
from jtest j
order by j.data->'key'
limit 20;
-- "Limit (cost=0.42..1.43 rows=20 width=74) (actual time=0.023..0.044 rows=20 loops=1)"
-- " -> Index Scan using jtest_key on jtest j (cost=0.42..10150.42 rows=200000 width=74) (actual time=0.023..0.039 rows=20 loops=1)"
-- "Planning time: 0.082 ms"
-- "Execution time: 0.066 ms"
Second query with large offset (slow):
EXPLAIN ANALYZE
select j.data
from jtest j
order by j.data->'key'
offset 100000
limit 20;
-- "Limit (cost=5075.42..5076.44 rows=20 width=74) (actual time=112.861..112.896 rows=20 loops=1)"
-- " -> Index Scan using jtest_key on jtest j (cost=0.42..10150.42 rows=200000 width=74) (actual time=0.027..106.378 rows=100020 loops=1)"
-- "Planning time: 0.105 ms"
-- "Execution time: 112.931 ms"
In PG docs I found:
> The rows skipped by an OFFSET clause still have to be computed inside the server; therefore a large OFFSET might be inefficient.
But they don't say anything about how to deal with large offsets. How can I improve this query? Also, is this behaviour (inefficiency with large offsets) common to all RDBMSs, or just PostgreSQL (I'm using 9.4)?
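One common workaround (a sketch, not something stated in the question itself) is keyset pagination, also called the seek method: instead of skipping rows with OFFSET, remember the `key` of the last row on the previous page and filter on it, so the btree index on `data->'key'` can start the scan right where the previous page ended. The value 100000 below is a hypothetical "last seen key":

-- Keyset / "seek" pagination sketch: fetch the page that starts
-- after the last key seen on the previous page (assumed here to be 100000).
-- Comparing jsonb to jsonb lets the existing index on (data->'key') be used.
select j.data
from jtest j
where j.data->'key' > '100000'::jsonb
order by j.data->'key'
limit 20;

The trade-off is that this only supports "next page" style navigation (you cannot jump to an arbitrary page number), but the cost of fetching a page no longer grows with its distance from the start.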
Asked by user606521 (1415 rep) on Aug 23, 2015, 06:48 PM
Last activity: Mar 4, 2025, 06:09 AM