Is a columnstore index appropriate when all columns are (mostly) unique?
0
votes
0
answers
549
views
I have a longish table ( 60M rows ) where there are only 7 columns. One is a unique ID, two are datetimes, and two are notes and descriptions.
There the notes and descriptions are very regular, except for a tag at the end of the text. So, they're technically unique. I can't take that unique tag out as the tags are signatures and these notes and descriptions are legal documents.
If it wasn't for those tags, they'd be 95% from stock descriptions - maybe 15 variations.
These descriptions are up to 8K chars long. I long for some reasonable compression, and am considering a clustered columnstore index to implement that compression, but I'm unclear as to whether the compression will even occur w/ these columns being tagged into uniqueness.
These descriptive columns comprise more than 90% of the row data.
So...rowstore indexing is appropriate for the 'key' column...but I'm wondering if I should define this after the clustered columnstore index.
Current:
create table dbo.SPECIMEN
(
ID int not null,
Specimen_Types varchar(255) null,
Collected_Date datetime null,
Received_Date datetime null,
Results varchar(4000) null,
Notes varchar(max) null,
Lab_Report_ID int not null
)
go
create clustered index [SPECIMEN.ID.Lab_Report_ID.Fake.PrimaryKey]
on dbo.SPECIMEN(ID,Lab_Report_ID);
go
create index [SPECIMEN.Lab_Report_ID.Index]
on dbo.SPECIMEN(Lab_Report_ID);
...and I if I get good compression, I'd change the indexes this way:
go
create clustered columnstore index [SPECIMEN.CCI]
on dbo.SPECIMEN;
go
create index [SPECIMEN.ID.Lab_Report_ID.Fake.PrimaryKey]
on dbo.SPECIMEN(ID,Lab_Report_ID);
go
create index [SPECIMEN.Lab_Report_ID.Index]
on dbo.SPECIMEN(Lab_Report_ID);
Does this make sense? I have very little experience with columnstore indexing and don't want to step on my own foot.
BTW - fake primary key. It is supposed to be unique, but the app that populates the source data occasionally throws in a duplicate. This table is supposed to be an extract from that semi-stable source.
Asked by Clay
(101 rep)
Jan 22, 2022, 03:36 PM