Why doesn't an OFFSET query benefit more from the index?

5 votes
2 answers
2857 views
I've got a simple example books table with an integer id primary key ("books_pkey" PRIMARY KEY, btree (id)) and 100,000,000 random rows. If I run:
EXPLAIN
SELECT *
FROM books
ORDER BY id
OFFSET 99999999
LIMIT 1;
I see a query plan like:
Limit  (cost=3137296.54..3137296.57 rows=1 width=14)
   ->  Index Scan using books_pkey on books  (cost=0.57..3137296.57 rows=100000000 width=14)
Do I understand correctly that PostgreSQL is loading 100000000 rows into memory, only for the OFFSET to discard all but 1? If so, why can't it do the "load and discard" step using the index and only load one row into memory?
I understand that the typical solution to this is to use keyset pagination - to say WHERE id > x. I'm just trying to understand why an index alone doesn't solve this.
Adding another index which is explicitly sorted the same way as this query (CREATE INDEX books_id_ordered ON books (id ASC)) makes no difference to EXPLAIN.
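For reference, the keyset pagination I mentioned above would look roughly like the sketch below. It assumes the previous page ended at a known id; the value 12345 is a made-up placeholder, not something taken from my table.

```sql
-- Keyset pagination sketch: resume after the last id already seen,
-- instead of counting past rows with OFFSET.
-- 12345 is a hypothetical placeholder for that last-seen id.
SELECT *
FROM books
WHERE id > 12345
ORDER BY id
LIMIT 1;
```

This only works when the caller already knows where the previous page ended, which is why it is not a drop-in replacement for jumping to an arbitrary offset.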
Asked by Nathan Long (1005 rep)
Aug 13, 2021, 05:54 PM
Last activity: Aug 16, 2021, 07:22 PM