Why doesn't an OFFSET query benefit more from the index?
I've got a simple example books table with an integer id primary key ("books_pkey" PRIMARY KEY, btree (id)) and 100,000,000 random rows. If I run:
EXPLAIN
SELECT *
FROM books
ORDER BY id
OFFSET 99999999
LIMIT 1;
I see a query plan like:
Limit (cost=3137296.54..3137296.57 rows=1 width=14)
-> Index Scan using books_pkey on books (cost=0.57..3137296.57 rows=100000000 width=14)
Do I understand correctly that PostgreSQL is loading 100,000,000 rows into memory, only for the OFFSET to discard all but 1? If so, why can't it do the "load and discard" step using the index and only load one row into memory?
I understand that the typical solution to this is to use keyset pagination - to say WHERE id > x. I'm just trying to understand why an index alone doesn't solve this. Adding another index which is explicitly sorted the same way as this query (CREATE INDEX books_id_ordered ON books (id ASC)) makes no difference to the EXPLAIN output.
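For reference, the keyset form I have in mind looks roughly like the sketch below. The value 12345 is only a placeholder for the last id returned by the previous page; it is not taken from my data.

```sql
-- Keyset pagination sketch against the same books table.
-- 12345 is a hypothetical "last id seen" value supplied by the caller.
SELECT *
FROM books
WHERE id > 12345   -- start just after the previous page's last row
ORDER BY id
LIMIT 1;
```

As far as I understand, the id > x condition lets the scan start at that point in the books_pkey btree rather than at the beginning of the index, which is the behavior I was hoping OFFSET could get too.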
Asked by Nathan Long (1005 rep), Aug 13, 2021, 05:54 PM
Last activity: Aug 16, 2021, 07:22 PM