Performance overhead of writing mostly NULL values in a clustered columnstore index

2 votes
1 answer
72 views
We have a clustered columnstore table with more than 2 billion records, and we want to add 3 new columns that are very sparse: maybe 0.01% of records will have these fields populated (all int fields). The table already has 75 columns, and we insert/update about 20 million records per day.

I understand that storage is not an issue, because the columnstore will compress this data efficiently and it will take up little space. My main concern is writing to this table. It's already wide, and I think adding more fields will hurt write performance. Am I correct in this assessment? The rows still have to be written to the delta store and then compressed.

The other approach is to create a new rowstore table for these fields that are seldom populated (and seldom used) and just join between the two when needed.

---

I partially regret that we have 75 columns; part of that is columns that aren't used much. We use change tracking to load from a different system hourly. Sometimes batches are 2 million records, sometimes 100k, etc.

I guess the main question is: if adding a new column that is usually 100% populated to a columnstore table costs 1 additional minute, would adding a new column that is usually only 0.01% populated also cost 1 additional minute, or much less?
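To put rough numbers on the separate rowstore table option, here is a quick back-of-envelope sketch using the figures above. It assumes the 0.01% population rate applies evenly to both the existing rows and the daily writes, which may not hold in practice; the function and constant names are only illustrative.

```python
# Back-of-envelope sizing from the figures quoted in the question.
# Assumption: the 0.01% rate applies uniformly to existing and new rows.
TOTAL_ROWS = 2_000_000_000   # "more than 2 billion records" (lower bound)
DAILY_WRITES = 20_000_000    # rows inserted/updated per day
SPARSE_PER_MILLION = 100     # 0.01% expressed as rows per million


def populated_rows(row_count: int, per_million: int) -> int:
    """Rows expected to carry non-NULL values, using integer arithmetic."""
    return row_count * per_million // 1_000_000


# Rows a side table would hold today, and how many it would gain per day.
existing_sparse = populated_rows(TOTAL_ROWS, SPARSE_PER_MILLION)
daily_sparse = populated_rows(DAILY_WRITES, SPARSE_PER_MILLION)

print(f"side table rows today: {existing_sparse:,}")
print(f"side table rows written per day: {daily_sparse:,}")
```

Under that assumption, a side table would only see a few thousand writes per day, versus the 20 million rows per day that would each carry the three extra (mostly NULL) columns in the wide table.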
Asked by Gabe (1396 rep)
Mar 7, 2025, 02:55 PM
Last activity: Mar 11, 2025, 08:59 AM