
Filter one very large CSV based on values from another CSV

5 votes
2 answers
1954 views
I am processing some CSV files that do not fit in RAM. The 2 CSV files have the following structure:

**first.csv**

| id     | name | timestamp           |
|--------|------|---------------------|
| serial | str  | yyyy-mm-dd hh:mm:ss |

**second.csv**

| id     | name | date       |
|--------|------|------------|
| serial | str  | yyyy-mm-dd |

The goal is to select rows from first.csv that match some criteria compared to second.csv:

- name is equal
- timestamp is in the range of [date-1, date+1].

After iterating all these rows the output can be combined into one output file.
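To make the criteria concrete, here is a minimal sketch of one possible approach, not a definitive solution. It streams both files into an on-disk SQLite database using only the Python standard library, then lets SQLite do the matching so neither file has to fit in RAM. The function name `filter_first_by_second`, the `db_path` working file, and the table names are hypothetical. It also assumes that both files have a header row, and that `[date-1, date+1]` means whole calendar days (the timestamp's date is at most one day before or after `date`).

```python
import csv
import sqlite3


def filter_first_by_second(first_path, second_path, out_path, db_path):
    """Write rows of first.csv that have a same-name row in second.csv
    whose date is within one calendar day of the timestamp (assumption).
    Returns the number of rows written."""
    con = sqlite3.connect(db_path)  # on-disk working database
    try:
        con.execute("CREATE TABLE first (id TEXT, name TEXT, ts TEXT)")
        con.execute("CREATE TABLE second (name TEXT, date TEXT)")

        # csv.reader yields one row at a time, so the files are streamed.
        with open(first_path, newline="") as f:
            reader = csv.reader(f)
            next(reader)  # skip the header row (assumed present)
            con.executemany("INSERT INTO first VALUES (?, ?, ?)", reader)
        with open(second_path, newline="") as f:
            reader = csv.reader(f)
            next(reader)  # skip the header row (assumed present)
            con.executemany(
                "INSERT INTO second VALUES (?, ?)",
                ((row[1], row[2]) for row in reader),
            )

        # The index lets each lookup by name and date range avoid a full scan.
        con.execute("CREATE INDEX idx_second ON second (name, date)")
        con.commit()

        # EXISTS keeps each first.csv row at most once, even if it
        # matches several rows of second.csv.
        query = """
            SELECT f.id, f.name, f.ts
            FROM first AS f
            WHERE EXISTS (
                SELECT 1 FROM second AS s
                WHERE s.name = f.name
                  AND s.date BETWEEN date(f.ts, '-1 day')
                                 AND date(f.ts, '+1 day')
            )
        """
        count = 0
        with open(out_path, "w", newline="") as out:
            writer = csv.writer(out)
            writer.writerow(["id", "name", "timestamp"])
            for row in con.execute(query):
                writer.writerow(row)
                count += 1
        return count
    finally:
        con.close()
```

A call might look like `filter_first_by_second("first.csv", "second.csv", "output.csv", "work.db")`. The working database file must not already contain these tables. If the range is meant as exactly 24 hours on either side of midnight rather than whole days, the `BETWEEN` condition would need to compare full timestamps instead.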
Asked by conclv_damian (51 rep)
Nov 17, 2021, 12:00 AM
Last activity: May 1, 2025, 05:57 PM