Why would an online columnstore rebuild be much faster than a reorganize on a very fragmented table?

sql-server nonclustered-index sql-server-2022 columnstore index-maintenance

On a table that has seen many deletes over the past six months, I have an **extremely** fragmented non-clustered columnstore index. It is about 500 GB and has 50 partitions. If I rebuild it online, re-create it with DROP_EXISTING, or just drop and recreate it the hard way, it goes down to 80 GB. When testing this, I've observed the following:

* If I drop the columnstore index and recreate it offline (no DROP_EXISTING, just drop and create), it takes one hour
* If I re-create it with DROP_EXISTING offline, it takes two hours. 
* If I rebuild the columnstore index online, it takes two hours
* If I reorganize the columnstore index without any special settings, I get bored and quit after five hours and see no evidence that the size of the index has decreased.
* If I reorganize the columnstore index with COMPRESS_ALL_ROW_GROUPS = ON, then it completes after four and a half hours. The index does not get smaller. It gets 20 GB **bigger**.
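
For concreteness, the five tests above correspond to statements of roughly the following shape. This is only a sketch: the object names (`dbo.BigTable`, `NCCI_BigTable`, `PartitionCol`, `OtherCol`) are placeholders, not the real ones, and these are not the exact commands that were run.

```sql
-- Placeholder names throughout; substitute the real table, index, and columns.

-- 1. Drop, then create from scratch (offline).
DROP INDEX NCCI_BigTable ON dbo.BigTable;
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_BigTable
    ON dbo.BigTable (PartitionCol, OtherCol);

-- 2. Re-create in place with DROP_EXISTING (offline is the default).
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_BigTable
    ON dbo.BigTable (PartitionCol, OtherCol)
    WITH (DROP_EXISTING = ON);

-- 3. Online rebuild of every partition.
ALTER INDEX NCCI_BigTable ON dbo.BigTable
    REBUILD WITH (ONLINE = ON);

-- 4. Reorganize with default settings.
ALTER INDEX NCCI_BigTable ON dbo.BigTable
    REORGANIZE;

-- 5. Reorganize, also forcing open and closed delta rowgroups into columnstore.
ALTER INDEX NCCI_BigTable ON dbo.BigTable
    REORGANIZE WITH (COMPRESS_ALL_ROW_GROUPS = ON);
```

Each statement targets the whole index rather than individual partitions.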

I'm not doing anything smart with the partitions. I'm using a one-liner to hit the entire index. I am on SQL Server 2022, Enterprise Edition. The clustered index is 2 TB rowstore. The biggest column in the table is 8 bytes. The non-clustered index consists of only two columns, one of which is the partitioning column. The biggest column in the index is an int.

I have seen no evidence that the tuple mover is disabled, so I cannot explain how the fragmentation got this bad. My only guess comes from knowing that the rows were deleted based on the column that is not the partitioning column, so the alignment of the columnstore index may be so terrible that no rowgroups crossed [whatever threshold it is that the background merge task uses](https://learn.microsoft.com/en-us/sql/relational-databases/indexes/reorganize-and-rebuild-indexes?view=sql-server-ver16#reorganize-an-index) (I am unsure if that merge is part of the tuple mover or not).
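
One way to test that guess would be to look at how the deleted rows are spread across rowgroups. The query below is a minimal sketch against `sys.dm_db_column_store_row_group_physical_stats`; `dbo.BigTable` and `NCCI_BigTable` are placeholder names. As far as I understand it, for a non-clustered columnstore index the `deleted_rows` column may not count rows still sitting in the delete buffer, so treat the percentages as a lower bound.

```sql
-- Placeholder names: dbo.BigTable and NCCI_BigTable stand in for the real objects.
SELECT
    rg.partition_number,
    rg.state_desc,
    COUNT(*)                       AS row_groups,
    SUM(rg.total_rows)             AS total_rows,
    SUM(rg.deleted_rows)           AS deleted_rows,
    -- Share of rows marked deleted; NULLIF avoids dividing by zero.
    CAST(100.0 * SUM(rg.deleted_rows)
         / NULLIF(SUM(rg.total_rows), 0) AS decimal(5, 2)) AS deleted_pct,
    SUM(rg.size_in_bytes) / 1048576 AS size_mb
FROM sys.dm_db_column_store_row_group_physical_stats AS rg
JOIN sys.indexes AS i
    ON  i.object_id = rg.object_id
    AND i.index_id  = rg.index_id
WHERE rg.object_id = OBJECT_ID(N'dbo.BigTable')
  AND i.name = N'NCCI_BigTable'
GROUP BY rg.partition_number, rg.state_desc
ORDER BY rg.partition_number, rg.state_desc;
```

If the deletes are spread thinly across nearly every compressed rowgroup, that would be consistent with the poor-alignment theory; running it before and after a REORGANIZE would also show whether rowgroup counts or sizes actually changed.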

I have no idea why REORGANIZE made the columnstore index bigger. [Niko Neugebauer said during the pre-release for SQL Server 2016](https://www.nikoport.com/2015/12/25/columnstore-indexes-part-74-row-groups-merging-cleanup-sql-server-2016-edition/) that you have to invoke the tuple mover twice to really delete from a non-clustered columnstore index, but I did not think that behaviour even lasted until the real release of 2016 (I'm far past that, I'm on SQL Server 2022).

My question is this: **Why would rebuilding a columnstore index online be dramatically faster than reorganizing it?** Would it have anything to do with the extreme fragmentation? For what it's worth, this same table has me suspicious of SQL Server [bugs](https://dba.stackexchange.com/q/345635/277982) such as [this](https://dba.stackexchange.com/q/345953/277982).

Asked by J. Mini (1237 rep)

Apr 16, 2025, 07:04 PM
Last activity: Apr 19, 2025, 12:27 AM
