How do modern data warehouses tackle frequent small writes, especially when streaming data is one of the sources?
0 votes, 2 answers, 71 views
For many days I have had a question in mind:
**How do modern data warehouses tackle frequent small writes**, especially when streaming data is one of the sources?
e.g. Kafka/Kinesis => DW (Snowflake, Teradata, Oracle ADW, etc.)
I was under the impression that since data warehouse tables are typically **highly denormalized and columnar** (to keep reporting queries fast and avoid joins), they are slow for frequent small writes but perform well for reporting-style `SELECT` statements. Hence the concept of **bulk nightly uploads from OLTP data sources to OLAP data warehouses**.
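My (possibly outdated) mental model of how the ingestion side avoids death by small writes is micro-batching: buffer the stream and hand the warehouse a few large loads instead of many tiny inserts. A rough sketch of what I mean, assuming kafka-python, a topic named `events`, a local broker, and a hypothetical `copy_into_staging` bulk-load helper (none of these come from any particular vendor's docs):

```python
import time
from kafka import KafkaConsumer  # kafka-python

BATCH_MAX_ROWS = 10_000
BATCH_MAX_SECONDS = 60

def copy_into_staging(rows):
    # Hypothetical helper: in practice this would write the rows to a file /
    # object store and issue ONE bulk COPY/LOAD into a staging table, rather
    # than running thousands of single-row INSERTs against columnar storage.
    print(f"bulk-loading {len(rows)} rows into STAGING_EVENTS")

consumer = KafkaConsumer("events", bootstrap_servers="localhost:9092")

buffer, last_flush = [], time.time()
for message in consumer:
    buffer.append(message.value)
    # Flush on row count or elapsed time, so the warehouse sees a few large
    # writes instead of a constant stream of tiny ones.
    if len(buffer) >= BATCH_MAX_ROWS or time.time() - last_flush >= BATCH_MAX_SECONDS:
        copy_into_staging(buffer)
        buffer, last_flush = [], time.time()
```

Is something along these lines what actually happens inside/around the modern warehouses, or do they handle the small writes natively now?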
- What has changed in the modern DW internal architecture?
- Is there a staging area within the DW itself, where data first lands and is then aggregated, stats are collected, and it is denormalized before it finally rests in the actual DW tables powering the reporting? (A rough sketch of what I mean follows this list.)
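To make that second bullet concrete, this is the kind of staging-to-final flow I imagine: raw micro-batches land in a staging table and a periodic job merges/denormalizes them into the reporting table. The table and column names are made up for illustration, the SQL is generic ANSI-style MERGE rather than any specific vendor's dialect, and `connection` stands in for whatever DB-API connection the warehouse driver provides:

```python
# Hypothetical periodic "staging -> final" step.
MERGE_SQL = """
MERGE INTO sales_fact AS f
USING (
    SELECT order_id, SUM(amount) AS amount, MAX(event_ts) AS event_ts
    FROM staging_events
    GROUP BY order_id
) AS s
ON f.order_id = s.order_id
WHEN MATCHED THEN UPDATE SET amount = s.amount, event_ts = s.event_ts
WHEN NOT MATCHED THEN INSERT (order_id, amount, event_ts)
    VALUES (s.order_id, s.amount, s.event_ts);
"""

def run_merge(connection):
    # One set-based MERGE per interval, then clear the staging table,
    # instead of touching the wide reporting table row by row.
    with connection.cursor() as cur:
        cur.execute(MERGE_SQL)
        cur.execute("TRUNCATE TABLE staging_events")
    connection.commit()
```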
I am interested in knowing how it works internally, at a high level.
I know this is a basic question, but this is my understanding from my school days, so I am pretty sure it is out of date.
Asked by Libertarian
(3 rep)
Oct 23, 2021, 07:53 PM
Last activity: Nov 1, 2021, 12:03 PM