Performant way to limit selected rows but make sure the sum of the rows is large enough
0
votes
0
answers
81
views
Here's a simplified version of the problem. You have a marketplace where users can publish how many items they have for sale and the price. You would like to buy 1 000 000 items, and you want to purchase the cheapest listings first.
Example of getting all rows:
SELECT id, price, count FROM for_sale ORDER BY price ASC
In a worst-case scenario, you might need to fetch 1 000 000 rows if each user is selling one item each, but usually, users would publish more than a single item.
Is there a performant way to get only the rows needed to fulfil the purchase? E.g., for each row, a counter named
total_items
is incremented with count
, and when total_items
covers the wanted purchase amount, no more rows are returned.
Using LIMIT
reduces the number of rows returned, but I can't be sure how many rows are needed as it depends on each row's count.
## What I've tried so far
SET @total := 0;
SELECT id, count, price, @total := @total + count AS total
FROM for_sale
WHERE @total < 1000000 ORDER BY price
ASC;
This query somewhat works, but the user variable check seems to be done client-side resulting in up to seven additional rows where the total is over 1 000 000. The performance is also terrible compared to just a LIMIT
(more than 300x execution time).
I've also tried to use a window function, but I'm not sure if that's possible without reading the whole table.
Is there a way to only select rows that covers the purchase amount instead of every row? I've searched a lot, and the only examples I've found is user variable ones that suffer from performance issues.
Asked by Alex
(53 rep)
Jun 30, 2021, 11:19 AM
Last activity: Jun 30, 2021, 03:54 PM
Last activity: Jun 30, 2021, 03:54 PM