Filter one very large CSV based on values from another CSV
5 votes, 2 answers, 1954 views
I am processing some CSV files that do not fit in RAM.
The 2 CSV files have the following structure:
**first.csv**
| id | name | timestamp |
|--------|------|---------------------|
| serial | str | yyyy-mm-dd hh:mm:ss |
**second.csv**
| id | name | date |
|--------|------|------------|
| serial | str | yyyy-mm-dd |
The goal is to select rows from `first.csv` that match some criteria compared to `second.csv`:
- `name` is equal
- `timestamp` is in the range of [`date` - 1, `date` + 1].
After iterating over all these rows, the matches can be combined into one output file.
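To make the criteria concrete, here is a minimal sketch of one possible approach using only Python's standard library. It is not taken from the answers. It assumes both files have a header row with the column names shown above, that "-1" and "+1" mean one day, and that the range is inclusive from midnight of `date` - 1 to midnight of `date` + 1. The function name `filter_first_by_second` and the `db_path` parameter are hypothetical. The rows of `second.csv` go into an indexed SQLite table, and `first.csv` is streamed row by row so it never has to fit in RAM.

```python
import csv
import sqlite3


def filter_first_by_second(first_path, second_path, out_path, db_path=":memory:"):
    """Copy rows of first.csv that match a row of second.csv by name and whose
    timestamp lies in [date - 1 day, date + 1 day] (assumed inclusive).
    Returns the number of rows written."""
    # Pass a fresh file path as db_path if second.csv itself is too big for RAM.
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE second (name TEXT, lo TEXT, hi TEXT)")

    with open(second_path, newline="") as f:
        reader = csv.DictReader(f)
        # SQLite's datetime() yields 'YYYY-MM-DD HH:MM:SS', the same layout as
        # the timestamp column, so plain string comparison orders correctly.
        con.executemany(
            "INSERT INTO second VALUES "
            "(?, datetime(?, '-1 day'), datetime(?, '+1 day'))",
            ((r["name"], r["date"], r["date"]) for r in reader),
        )
    con.execute("CREATE INDEX idx_second ON second (name, lo)")
    con.commit()

    query = "SELECT 1 FROM second WHERE name = ? AND ? BETWEEN lo AND hi LIMIT 1"
    kept = 0
    with open(first_path, newline="") as fin, \
            open(out_path, "w", newline="") as fout:
        reader = csv.DictReader(fin)
        writer = csv.DictWriter(fout, fieldnames=reader.fieldnames)
        writer.writeheader()
        # Stream first.csv one row at a time; only the lookup table is stored.
        for row in reader:
            if con.execute(query, (row["name"], row["timestamp"])).fetchone():
                writer.writerow(row)
                kept += 1
    con.close()
    return kept
```

Calling `filter_first_by_second("first.csv", "second.csv", "output.csv")` writes all matches to a single file, so no separate combining step is needed. If the bounds should cover whole calendar days instead, the two `datetime(...)` modifiers are the place to adjust.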
Asked by conclv_damian
(51 rep)
Nov 17, 2021, 12:00 AM
Last activity: May 1, 2025, 05:57 PM