Best practice for applying complex business logic during transformation in ETL from Aurora db to Redshift data warehouse

0 votes
0 answers
115 views
I'll be pushing data from an Amazon Aurora database to a Redshift data warehouse. The painful part is that the transformation portion of the ETL workflow relies *heavily* on business logic that lives within an application's codebase, and that logic is needed to derive the data into a useful format for the data warehouse. We don't want to remove the business logic from the application layer.

I'm considering a couple of options.

Option 1: Use AWS Glue as the ETL framework

1. Extract the data.
2. Apply business logic to that data by calling an API (via an AWS Glue Network Connection) during transformation.
3. Load into Redshift.

Option 2:

1. Run an hourly background job in the application to extract records that need to be pushed to Redshift.
2. Apply business logic to that data in the background job, transform it, and stage it in a table on Aurora.
3. Use AWS Glue to load from Aurora to Redshift.

All these tools are new to me, so it's difficult to know which to use. Glue, DMS, Data Pipeline? Any advice?
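To make the two options easier to compare, here is a minimal sketch of the step they share: pulling records in batches and handing each batch to the application's business logic before anything is loaded. Every name in it (`stage_for_warehouse`, `apply_business_logic`, `batch_size`, the record fields) is hypothetical and not part of Glue, Aurora, or Redshift. In Option 1 the callable would presumably wrap an HTTP call to the application's API; in Option 2 it would be the application's own code running in the background job.

```python
from typing import Callable, Iterable, Iterator

# A record is just a row represented as a dict (an assumption for this sketch).
Record = dict
# Hypothetical hook: takes a batch of raw rows, returns warehouse-ready rows.
Transform = Callable[[list[Record]], list[Record]]


def batched(records: Iterable[Record], size: int) -> Iterator[list[Record]]:
    """Yield successive lists of at most `size` records."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: list[Record] = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield batch


def stage_for_warehouse(
    records: Iterable[Record],
    apply_business_logic: Transform,
    batch_size: int = 500,
) -> list[Record]:
    """Run the business logic batch by batch and collect the transformed rows.

    Batching keeps the number of calls into the application small, which
    matters most if each call is a network round trip (as in Option 1).
    """
    staged: list[Record] = []
    for batch in batched(records, batch_size):
        staged.extend(apply_business_logic(batch))
    return staged
```

Keeping the business logic behind a single callable means the same transformation code path is used whichever option is chosen; only where it runs, and what writes the staged rows onward, differs.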
Asked by mstrom (143 rep)
Feb 4, 2024, 01:14 PM
Last activity: Feb 4, 2024, 01:37 PM