Efficient way to store and retrieve a large amount of data
1 vote · 2 answers · 308 views
I need to store a large amount of data (about 2.5 billion new rows per month), but I also need a very fast way to retrieve the latest value per group as of a specific point in time. The data looks very simple:
| ParameterId | Value | DateTime |
|--------------|-------|---------------------|
| 1 | 12.5 | 2023-04-21 14:35:03 |
| 2 | 56.81 | 2024-03-01 16:21:17 |
| 1 | 12.5 | 2024-05-22 14:35:03 |
| 1 | 71.4 | 2024-05-31 18:27:03 |
For example, say we need the latest value for each parameter as of 2024-04-30 17:40. The result would be:
| ParameterId | Value | DateTime |
|--------------|-------|---------------------|
| 1 | 12.5 | 2023-04-21 14:35:03 |
| 2 | 56.81 | 2024-03-01 16:21:17 |
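For reference, here is a minimal sketch of that lookup in plain PostgreSQL (the table name `readings` and the lower-case column names are my assumptions, not the real schema):

```sql
-- Latest row per parameter as of a cutoff time, via DISTINCT ON.
-- Assumed schema: readings(parameter_id int, value float8, date_time timestamp).
SELECT DISTINCT ON (parameter_id)
       parameter_id, value, date_time
FROM   readings
WHERE  date_time <= TIMESTAMP '2024-04-30 17:40'
ORDER  BY parameter_id, date_time DESC;
```

Without an index on `(parameter_id, date_time DESC)`, that plan degenerates into a full scan of everything before the cutoff, which is exactly where the restrictions below bite.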
This looks like something plain database storage could solve, but I have some restrictions:
1. Disk storage is limited. That's why indexes can't be used: they would nearly double the space the data takes (see the sizing sketch after this list).
2. The maximum query time for any request is 5 seconds.
3. We have only one server.
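To put restriction 1 in numbers, a back-of-the-envelope sketch (the 24-byte tuple header and alignment padding are standard PostgreSQL storage behavior; the schema itself is my assumption): `int4 + float8 + timestamp` is 24 bytes of payload after alignment, plus roughly 28 bytes of per-row overhead, so 2.5 billion rows comes to on the order of 130 GB per month before any index. The approximate per-row size can be checked directly:

```sql
-- Approximate on-disk size of one sample row (PostgreSQL).
-- Types mirror the sample data above: int4, float8, timestamp.
SELECT pg_column_size(t.*) AS row_bytes
FROM (VALUES (1, 12.5::float8, TIMESTAMP '2023-04-21 14:35:03'))
     AS t(parameter_id, value, date_time);
```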
I've already tried TimescaleDB (table partitioning), but because the only datetime condition I have is an upper bound (<= dt), chunk pruning can't narrow the search: finding a parameter's latest value can mean walking back through the old chunks one by one, which is very inefficient.
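For concreteness, this is roughly what that TimescaleDB attempt looks like (the names and chunk interval are my guesses, not the actual configuration):

```sql
-- Hypertable partitioned by time; names and interval are assumptions.
CREATE TABLE readings (
    parameter_id INTEGER          NOT NULL,
    value        DOUBLE PRECISION NOT NULL,
    date_time    TIMESTAMP        NOT NULL
);
SELECT create_hypertable('readings', 'date_time',
                         chunk_time_interval => INTERVAL '1 week');

-- With only "date_time <= X", chunk exclusion removes chunks after X
-- but keeps every chunk before it: a parameter whose last reading is
-- months old forces the latest-per-group query to scan backwards
-- through many old chunks.
```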
Technically it must be possible: we have old software, developed by a third-party company 10 years ago, that still does this, but nobody knows how...
Asked by the_it_guy
(11 rep)
Jun 21, 2024, 04:56 PM
Last activity: Aug 21, 2024, 04:22 PM