Fastest query to process data in small batches without repetition
0 votes, 2 answers, 144 views
I have a Java app that uses MySQL as its backend. I have the following table:
A = int, B = varchar, C = timestamp
A | B | C
1 | 100 | 2022-03-01 12:00:00
2 | 200 | 2022-03-01 12:00:01
3 | 100 | 2022-03-01 12:00:01
4 | 200 | 2022-03-01 12:00:02
5 | 600 | 2022-03-01 12:00:03
1 | 100 | 2022-03-01 12:00:06
5 | 700 | 2022-03-01 12:00:07
2 | 200 | 2022-03-01 12:00:08
9 | 100 | 2022-03-01 12:00:08
Every X seconds, a query should run and process 5 records where column C > LAST_PROCESSED_TIMESTAMP. LAST_PROCESSED_TIMESTAMP is updated after each run.
I want to select these 5 rows, but exclude any row whose combination of columns A and B is going to appear again in a fetch that happens in the future.
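To make the requirement concrete, here is a minimal in-memory sketch in Java of the selection I am describing. It is not the MySQL query itself. The class name `BatchSelector`, the `Row` type, and the helpers `selectBatch` and `sampleRows` are hypothetical names made up for this illustration, and C is kept as a string because the `yyyy-MM-dd HH:mm:ss` format sorts chronologically.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class BatchSelector {

    /** One table row (hypothetical type). C is a 'yyyy-MM-dd HH:mm:ss' string. */
    public static final class Row {
        public final int a;
        public final String b;
        public final String c;

        public Row(int a, String b, String c) {
            this.a = a;
            this.b = b;
            this.c = c;
        }

        /** Composite (A, B) key; assumes B never contains the '|' character. */
        String key() {
            return a + "|" + b;
        }

        @Override
        public String toString() {
            return a + " | " + b + " | " + c;
        }
    }

    /**
     * Takes the first batchSize rows with C >= from (rows must already be in
     * fetch order), then drops every row whose (A, B) pair also occurs in a
     * row that comes after the batch.
     */
    public static List<Row> selectBatch(List<Row> rows, String from, int batchSize) {
        List<Row> candidates = new ArrayList<>();
        for (Row r : rows) {
            if (r.c.compareTo(from) >= 0) {
                candidates.add(r);
            }
        }
        int cut = Math.min(batchSize, candidates.size());

        // Keys of all rows that a later run would still fetch.
        Set<String> laterKeys = new HashSet<>();
        for (Row r : candidates.subList(cut, candidates.size())) {
            laterKeys.add(r.key());
        }

        List<Row> result = new ArrayList<>();
        for (Row r : candidates.subList(0, cut)) {
            if (!laterKeys.contains(r.key())) {
                result.add(r);
            }
        }
        return result;
    }

    /** The sample table from the question, in the order shown there. */
    public static List<Row> sampleRows() {
        return Arrays.asList(
            new Row(1, "100", "2022-03-01 12:00:00"),
            new Row(2, "200", "2022-03-01 12:00:01"),
            new Row(3, "100", "2022-03-01 12:00:01"),
            new Row(4, "200", "2022-03-01 12:00:02"),
            new Row(5, "600", "2022-03-01 12:00:03"),
            new Row(1, "100", "2022-03-01 12:00:06"),
            new Row(5, "700", "2022-03-01 12:00:07"),
            new Row(2, "200", "2022-03-01 12:00:08"),
            new Row(9, "100", "2022-03-01 12:00:08"));
    }

    public static void main(String[] args) {
        for (Row r : selectBatch(sampleRows(), "2022-03-01 12:00:00", 5)) {
            System.out.println(r);
        }
    }
}
```

On the sample table with a batch size of 5, this keeps only the rows with A = 3, 4 and 5 from the first batch, because the pairs (1, 100) and (2, 200) occur again later. Loading everything into memory obviously does not scale, so this only pins down the semantics I am after.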
Example: for table above:
First run - select 5
1 | 100 | 2022-03-01 12:00:00
2 | 200 | 2022-03-01 12:00:01
[...]
WHERE C >= '2022-03-01 12:00:00'
LIMIT 5
) a
LEFT JOIN (
SELECT A,B
FROM TABLE
WHERE C >= '2022-03-01 12:00:00'
LIMIT 5, 18446744073709551615
) b ON ( a.A=b.A
AND a.B=b.B
)
WHERE b.A IS NULL;
and also the following one, which is probably NOT OK: it selects MAX(C) even when that row is not among the first 5, so for my example it would include 2 | 200 | 2022-03-01 12:00:08 in the first run, which is not what I need:
SELECT A, B, MAX(C)
FROM TABLE
WHERE C >= '2022-03-01 12:00:00'
GROUP BY A, B ASC
LIMIT 5;
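To illustrate why the grouped query misbehaves, here is a rough Java emulation of it on the sample data. It assumes the groups come back ordered by A and then B, as the ASC modifier suggests. The class name `GroupByMaxDemo`, the method `groupByMax`, and the sample arrays are hypothetical names used only for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class GroupByMaxDemo {

    // The sample table from the question, one array per column.
    public static final int[] A = {1, 2, 3, 4, 5, 1, 5, 2, 9};
    public static final String[] B = {"100", "200", "100", "200", "600", "100", "700", "200", "100"};
    public static final String[] C = {
        "2022-03-01 12:00:00", "2022-03-01 12:00:01", "2022-03-01 12:00:01",
        "2022-03-01 12:00:02", "2022-03-01 12:00:03", "2022-03-01 12:00:06",
        "2022-03-01 12:00:07", "2022-03-01 12:00:08", "2022-03-01 12:00:08"};

    /**
     * Emulates grouping by (A, B), keeping MAX(C) per group, ordering the
     * groups by A then B, and returning at most limit groups.
     */
    public static List<String> groupByMax(int[] a, String[] b, String[] c, String from, int limit) {
        // Nested sorted maps: A -> (B -> largest C seen so far).
        Map<Integer, Map<String, String>> groups = new TreeMap<>();
        for (int i = 0; i < a.length; i++) {
            if (c[i].compareTo(from) < 0) {
                continue; // filtered out by WHERE C >= from
            }
            groups.computeIfAbsent(a[i], k -> new TreeMap<>())
                  .merge(b[i], c[i], (x, y) -> x.compareTo(y) >= 0 ? x : y);
        }

        List<String> out = new ArrayList<>();
        for (Map.Entry<Integer, Map<String, String>> byA : groups.entrySet()) {
            for (Map.Entry<String, String> byB : byA.getValue().entrySet()) {
                if (out.size() == limit) {
                    return out;
                }
                out.add(byA.getKey() + " | " + byB.getKey() + " | " + byB.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : groupByMax(A, B, C, "2022-03-01 12:00:00", 5)) {
            System.out.println(line);
        }
    }
}
```

The second line printed is 2 | 200 | 2022-03-01 12:00:08, a timestamp that lies outside the first 5 rows. The grouping is applied to every matching row before the limit is applied, which is exactly the problem described above.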
Asked by Bojan Vukasovic
(101 rep)
Mar 7, 2022, 05:48 PM
Last activity: Jul 22, 2025, 10:01 PM