
Fastest query to process data in small batches without repetition

0 votes
2 answers
144 views
I have a Java app that uses MySQL in the backend. I have the following table (A = int, B = varchar, C = timestamp):

    A | B   | C
    1 | 100 | 2022-03-01 12:00:00
    2 | 200 | 2022-03-01 12:00:01
    3 | 100 | 2022-03-01 12:00:01
    4 | 200 | 2022-03-01 12:00:02
    5 | 600 | 2022-03-01 12:00:03
    1 | 100 | 2022-03-01 12:00:06
    5 | 700 | 2022-03-01 12:00:07
    2 | 200 | 2022-03-01 12:00:08
    9 | 100 | 2022-03-01 12:00:08

Every X seconds a query should run and process 5 records where column C > LAST_PROCESSED_TIMESTAMP. LAST_PROCESSED_TIMESTAMP is updated after each run.

I want to select these 5 rows, but exclude any row whose combination of columns A and B is going to repeat in a fetch that happens in the future.

Example for the table above:

First run - select 5

    1 | 100 | 2022-03-01 12:00:00
    2 | 200 | 2022-03-01 12:00:01

    [...] = '2022-03-01 12:00:00' LIMIT 5 ) a
    LEFT JOIN (
      SELECT A,B FROM TABLE WHERE C >= '2022-03-01 12:00:00' LIMIT 5, 18446744073709551615
    ) b ON ( a.A=b.A AND a.B=b.B )
    WHERE b.A IS NULL;

and also (this one is probably NOT OK, since it selects the MAX of C even if that row is not in the first 5, so for my example it would include 2 | 200 | 2022-03-01 12:00:08 in the first run, which is not what I need):

    SELECT A, B, MAX(C) FROM TABLE WHERE C >= '2022-03-01 12:00:00' GROUP BY A, B ASC LIMIT 5;
Asked by Bojan Vukasovic (101 rep)
Mar 7, 2022, 05:48 PM
Last activity: Jul 22, 2025, 10:01 PM
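
To make the requirement concrete, here is a minimal in-memory sketch in Java of the selection rule described in the question: take the first 5 rows at or after the cutoff, then drop every row whose (A, B) pair shows up again in the rows that come later. It is not from the original post and is not a replacement for the SQL. The names `BatchSketch`, `Row`, `sampleTable`, and `nextBatch` are hypothetical, rows are assumed to be ordered by C (the queries in the question have no ORDER BY), and the cutoff comparison is assumed to be `>=` as in those queries.

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory model of the table; none of these names come from the question.
class BatchSketch {

    static final class Row {
        final int a;
        final String b;
        final LocalDateTime c;

        Row(int a, String b, String c) {
            this.a = a;
            this.b = b;
            this.c = LocalDateTime.parse(c); // ISO format, e.g. 2022-03-01T12:00:00
        }

        // The (A, B) pair used to detect repeats.
        List<Object> key() {
            return Arrays.<Object>asList(a, b);
        }

        @Override
        public String toString() {
            return a + " | " + b + " | " + c;
        }
    }

    // The sample data from the question, in the order shown there.
    static List<Row> sampleTable() {
        return Arrays.asList(
            new Row(1, "100", "2022-03-01T12:00:00"),
            new Row(2, "200", "2022-03-01T12:00:01"),
            new Row(3, "100", "2022-03-01T12:00:01"),
            new Row(4, "200", "2022-03-01T12:00:02"),
            new Row(5, "600", "2022-03-01T12:00:03"),
            new Row(1, "100", "2022-03-01T12:00:06"),
            new Row(5, "700", "2022-03-01T12:00:07"),
            new Row(2, "200", "2022-03-01T12:00:08"),
            new Row(9, "100", "2022-03-01T12:00:08"));
    }

    // Returns the rows of the next batch whose (A, B) pair does not occur in any later row.
    static List<Row> nextBatch(List<Row> table, LocalDateTime from, int batchSize) {
        // Keep rows with C >= from (assumption: same comparison as the queries in the question).
        List<Row> candidates = new ArrayList<>();
        for (Row r : table) {
            if (!r.c.isBefore(from)) {
                candidates.add(r);
            }
        }
        // Assumed ordering by C; the sort is stable, so ties keep their input order.
        candidates.sort(Comparator.comparing((Row r) -> r.c));

        int cut = Math.min(batchSize, candidates.size());

        // Collect the (A, B) pairs of everything after the batch.
        Set<List<Object>> laterKeys = new HashSet<>();
        for (Row r : candidates.subList(cut, candidates.size())) {
            laterKeys.add(r.key());
        }

        // Anti-join: keep only batch rows whose pair never appears later.
        List<Row> result = new ArrayList<>();
        for (Row r : candidates.subList(0, cut)) {
            if (!laterKeys.contains(r.key())) {
                result.add(r);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        LocalDateTime from = LocalDateTime.parse("2022-03-01T12:00:00");
        for (Row r : nextBatch(sampleTable(), from, 5)) {
            System.out.println(r);
        }
    }
}
```

With the sample data, the first batch holds the pairs (1, 100), (2, 200), (3, 100), (4, 200) and (5, 600); the first two reappear later, so only the other three are returned. The sketch does not handle duplicates inside the batch itself or the update of LAST_PROCESSED_TIMESTAMP, since the question does not say how those should behave.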