
Airflow to BigQuery data load taking forever

0 votes
1 answer
150 views
I'm currently working as a junior data engineer. My main job right now is to move data from a MySQL database (which gets updated every few minutes via webhooks) to BigQuery as frequently as possible using Airflow, since BigQuery is our main database for later analysis with Power BI. The problem is that the bigger tables (which only have ~1000 rows) take about 2 hours to load to BQ, which makes this impossible to scale. I can't imagine what will happen in the future when the deltas alone are 10000 rows each...

The pipeline uses pandas and SQLAlchemy: it extracts the data as a DataFrame and calls the "to_sql" method, passing all the BQ connection parameters. I am already uploading only the incremental changes (deltas), so that is not the problem. Do you have any advice? Is Airflow the right tool for this? I've been searching for solutions for weeks but couldn't find anything.
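For illustration only, here is a minimal sketch of the delta-extraction pattern described above. It is not the actual job: it uses the standard-library sqlite3 module as a stand-in for MySQL, the table `orders` and the watermark column `updated_at` are hypothetical names, and no BigQuery call is made. Instead, the delta is serialized into a single newline-delimited JSON payload, a format that BigQuery batch load jobs accept, to contrast shipping the whole delta at once with inserting it row by row.

```python
import json
import sqlite3


def extract_delta(conn, table, watermark_col, last_watermark):
    """Return rows changed after last_watermark as a list of dicts.

    table and watermark_col are trusted, hard-coded identifiers here;
    only the watermark value is passed as a bound parameter.
    """
    cur = conn.execute(
        f"SELECT * FROM {table} WHERE {watermark_col} > ? ORDER BY {watermark_col}",
        (last_watermark,),
    )
    columns = [d[0] for d in cur.description]
    return [dict(zip(columns, row)) for row in cur.fetchall()]


def to_ndjson(rows):
    """Serialize the whole delta as one newline-delimited JSON payload."""
    return "\n".join(json.dumps(r) for r in rows)


def demo():
    # In-memory SQLite database standing in for the MySQL source (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, updated_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            (1, 10.0, "2022-08-28 10:00:00"),
            (2, 20.0, "2022-08-28 10:05:00"),
            (3, 30.0, "2022-08-28 10:10:00"),
        ],
    )
    last_watermark = "2022-08-28 10:00:00"  # saved by the previous run
    rows = extract_delta(conn, "orders", "updated_at", last_watermark)
    payload = to_ndjson(rows)
    # Advance the watermark only if something was extracted.
    new_watermark = rows[-1]["updated_at"] if rows else last_watermark
    conn.close()
    return rows, payload, new_watermark
```

Calling `demo()` returns the changed rows, the serialized payload, and the watermark to store for the next run. In a real pipeline the payload would be handed to a batch load step rather than returned.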
Asked by Ayrton (1 rep)
Aug 28, 2022, 11:11 PM
Last activity: Jul 17, 2025, 11:03 AM