Data lake or data warehouse first?
2 votes · 1 answer · 899 views
I am unsure whether to build a data lake or a data warehouse first, and I hope someone with real-world experience can shed some light on this.
I would like to store, visualise, and perform machine learning on data that I ingest from multiple sources (IoT devices, APIs, etc.). I have read that, in the current environment, a business needs *both* a data lake and a data warehouse.
My questions are:
1. Should I create a data lake first, then transform/process the raw data from the lake and load it into a data warehouse?
2. Or is the data lake a separate data processing pipeline of its own?
3. Or does this depend on the use case?
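To make option 1 concrete, here is a rough sketch of the "lake first, then warehouse" flow. Everything in it is an assumption for illustration only: a temporary folder of JSON-lines files stands in for the lake, an in-memory SQLite database stands in for the warehouse, and the names `land_raw`, `load_warehouse`, `run_demo`, the `readings` table, and the `device_id`/`temp` fields are all hypothetical.

```python
import json
import sqlite3
import tempfile
from pathlib import Path


def land_raw(lake_dir: Path, source: str, records: list) -> Path:
    """Write records to the 'lake' untouched, one JSON object per line."""
    path = lake_dir / f"{source}.jsonl"
    with path.open("w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")
    return path


def load_warehouse(lake_dir: Path, conn: sqlite3.Connection) -> int:
    """Read raw files from the lake, keep well-formed readings, insert typed rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings (device_id TEXT, temperature_c REAL)"
    )
    rows = []
    for path in sorted(lake_dir.glob("*.jsonl")):
        with path.open(encoding="utf-8") as f:
            for line in f:
                rec = json.loads(line)
                # Skip records without a device id or a numeric temperature.
                if "device_id" in rec and isinstance(rec.get("temp"), (int, float)):
                    rows.append((str(rec["device_id"]), float(rec["temp"])))
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


def run_demo():
    """Land two hypothetical sources in the lake, then load the warehouse."""
    with tempfile.TemporaryDirectory() as tmp:
        lake = Path(tmp)
        land_raw(lake, "iot", [{"device_id": "a1", "temp": 20.0},
                               {"device_id": "a2", "temp": "n/a"}])
        land_raw(lake, "api", [{"device_id": "b7", "temp": 22.0}])
        conn = sqlite3.connect(":memory:")
        loaded = load_warehouse(lake, conn)
        avg = conn.execute("SELECT AVG(temperature_c) FROM readings").fetchone()[0]
        conn.close()
        return loaded, avg
```

The raw files keep everything, including the malformed `"n/a"` reading, while only cleaned, typed rows reach the table that queries and dashboards would read from. In a real setup the lake would more likely be object storage and the warehouse a dedicated analytical database, but the direction of flow is the same.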
This is what I have been thinking of:
PS: If this is the wrong StackExchange site, do let me know. Thanks :)

Asked by SunnyBoiz
(153 rep)
May 13, 2022, 12:30 PM
Last activity: May 13, 2022, 12:51 PM