How do modern data warehouses tackle frequent small writes, especially when streaming data is one of the sources?
0 votes, 2 answers, 71 views
For many days I have had a question in mind:
**How do modern data warehouses tackle frequent small writes**, especially when streaming data is one of the sources?
e.g. Kafka/Kinesis => DW (Snowflake, Teradata, Oracle ADW, etc.)
I was under the impression that since data warehouse tables are typically **highly denormalized and columnar** (to keep reporting queries fast and avoid joins), they are slow for frequent small writes but perform well for reporting-style `SELECT` statements. Hence the concept of **bulk nightly uploads from OLTP data sources to OLAP data warehouses**.
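My (possibly outdated) mental model of how the ingestion side avoids death by small writes is micro-batching: buffer the stream and hand the warehouse a few large loads instead of many tiny inserts. A rough sketch of what I mean, assuming kafka-python, a topic named `events`, a local broker, and a hypothetical `copy_into_staging` bulk-load helper (none of these come from any particular vendor's docs):

```python
import time
from kafka import KafkaConsumer  # kafka-python

BATCH_MAX_ROWS = 10_000
BATCH_MAX_SECONDS = 60

def copy_into_staging(rows):
    # Hypothetical helper: in practice this would write the rows to a file /
    # object store and issue ONE bulk COPY/LOAD into a staging table, rather
    # than running thousands of single-row INSERTs against columnar storage.
    print(f"bulk-loading {len(rows)} rows into STAGING_EVENTS")

consumer = KafkaConsumer("events", bootstrap_servers="localhost:9092")

buffer, last_flush = [], time.time()
for message in consumer:
    buffer.append(message.value)
    # Flush on row count or elapsed time, so the warehouse sees a few large
    # writes instead of a constant stream of tiny ones.
    if len(buffer) >= BATCH_MAX_ROWS or time.time() - last_flush >= BATCH_MAX_SECONDS:
        copy_into_staging(buffer)
        buffer, last_flush = [], time.time()
```

Is something along these lines what actually happens inside/around the modern warehouses, or do they handle the small writes natively now?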
- What has changed in the modern DW internal architecture?
- Is there a staging area within the DW itself, where data first lands and is then aggregated, stats are collected, and it is denormalized before it finally rests in the actual DW tables powering the reporting? (A rough sketch of what I mean follows this list.)
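To make that second bullet concrete, this is the kind of staging-to-final flow I imagine: raw micro-batches land in a staging table and a periodic job merges/denormalizes them into the reporting table. The table and column names are made up for illustration, the SQL is generic ANSI-style MERGE rather than any specific vendor's dialect, and `connection` stands in for whatever DB-API connection the warehouse driver provides:

```python
# Hypothetical periodic "staging -> final" step.
MERGE_SQL = """
MERGE INTO sales_fact AS f
USING (
    SELECT order_id, SUM(amount) AS amount, MAX(event_ts) AS event_ts
    FROM staging_events
    GROUP BY order_id
) AS s
ON f.order_id = s.order_id
WHEN MATCHED THEN UPDATE SET amount = s.amount, event_ts = s.event_ts
WHEN NOT MATCHED THEN INSERT (order_id, amount, event_ts)
    VALUES (s.order_id, s.amount, s.event_ts);
"""

def run_merge(connection):
    # One set-based MERGE per interval, then clear the staging table,
    # instead of touching the wide reporting table row by row.
    with connection.cursor() as cur:
        cur.execute(MERGE_SQL)
        cur.execute("TRUNCATE TABLE staging_events")
    connection.commit()
```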
I am interested in knowing how it works internally, at a high level.
I know this is a basic question, but this is my understanding from my school days, so I am pretty sure it is out of date.
Asked by Libertarian
(3 rep)
Oct 23, 2021, 07:53 PM
Last activity: Nov 1, 2021, 12:03 PM