Sample Header Ad - 728x90

Is a columnstore index appropriate when all columns are (mostly) unique?

0 votes
0 answers
549 views
I have a longish table ( 60M rows ) where there are only 7 columns. One is a unique ID, two are datetimes, and two are notes and descriptions. There the notes and descriptions are very regular, except for a tag at the end of the text. So, they're technically unique. I can't take that unique tag out as the tags are signatures and these notes and descriptions are legal documents. If it wasn't for those tags, they'd be 95% from stock descriptions - maybe 15 variations. These descriptions are up to 8K chars long. I long for some reasonable compression, and am considering a clustered columnstore index to implement that compression, but I'm unclear as to whether the compression will even occur w/ these columns being tagged into uniqueness. These descriptive columns comprise more than 90% of the row data. So...rowstore indexing is appropriate for the 'key' column...but I'm wondering if I should define this after the clustered columnstore index. Current: create table dbo.SPECIMEN ( ID int not null, Specimen_Types varchar(255) null, Collected_Date datetime null, Received_Date datetime null, Results varchar(4000) null, Notes varchar(max) null, Lab_Report_ID int not null ) go create clustered index [SPECIMEN.ID.Lab_Report_ID.Fake.PrimaryKey] on dbo.SPECIMEN(ID,Lab_Report_ID); go create index [SPECIMEN.Lab_Report_ID.Index] on dbo.SPECIMEN(Lab_Report_ID); ...and I if I get good compression, I'd change the indexes this way: go create clustered columnstore index [SPECIMEN.CCI] on dbo.SPECIMEN; go create index [SPECIMEN.ID.Lab_Report_ID.Fake.PrimaryKey] on dbo.SPECIMEN(ID,Lab_Report_ID); go create index [SPECIMEN.Lab_Report_ID.Index] on dbo.SPECIMEN(Lab_Report_ID); Does this make sense? I have very little experience with columnstore indexing and don't want to step on my own foot. BTW - fake primary key. It is supposed to be unique, but the app that populates the source data occasionally throws in a duplicate. This table is supposed to be an extract from that semi-stable source.
Asked by Clay (101 rep)
Jan 22, 2022, 03:36 PM