Ordering of columns in a GIST index
3
votes
0
answers
432
views
I am modelling an application where events for factories can be registered. There are usually a small amount of factories (around 500), but each factory can register a large set of events (millions). It's impossible that two events overlap for the same factory.
My idea was to introduce an exclusion constraint based on a GIST index:
CREATE TABLE event
(
identifier SERIAL NOT NULL PRIMARY KEY
factory_identifier INTEGER NOT NULL REFERENCES factory,
period TSTZRANGE NOT NULL,
EXCLUDE USING gist
(
factory_identifier WITH =,
period WITH &&
)
);
In the documentation I read:
> A GiST index will be relatively ineffective if its first column has only a few distinct values, even if there are many distinct values in additional columns.
I am not sure how to interpret this. Given that there are only a small amount of factories but a large number of events, should I define the exclusions constraint as following instead?
EXCLUDE USING gist
(
period WITH &&
factory_identifier WITH =,
)
Now, the period
is the first column, for which there are a lot of distinct values (almost all values are unique, actually), in contrast to factory_identifier
, for which there only a few distinct values.
Is this better now? In short, I have difficulties to grasp the quoted documentation above.
----
additional
----
In the slides on a presentation of exclusion constraints on page 16 and a blog post , both coming from the original author of exclusion constraints, I found the following example:
EXCLUDE USING GIST(room WITH =, during WITH &&)
Given the (reasonable) assumption that there much more distinct reservation periods than distinct rooms, I wonder if this should have been (during WITH &&, room WITH =)
instead, given the quote regarding GIST column ordering above.
This makes me believe that I am not really understanding the quoted documented above.
Asked by ItIsJustMe
(33 rep)
Aug 22, 2020, 08:00 AM
Last activity: Aug 22, 2020, 08:27 AM
Last activity: Aug 22, 2020, 08:27 AM