Elassandra data modeling
1
vote
0
answers
63
views
In Cassandra, everyone always stresses how important data modeling is, and rightfully so. However, when using Elassandra, we have Elasticsearch baked in. How should this change how we think about modeling our Cassandra tables and partitions?
For example, with vanilla Cassandra we try to minimize the number of partitions a query has to read (see here, under "Basic Goals" > "Rule 2: Minimize The Number Of Partitions Read"). Is this still true when our data is indexed by Elasticsearch? Is it still worth the extra complexity of duplicating data and maintaining duplicate tables just to avoid searching across partitions?
Another example would be choosing our primary key. In Cassandra, we normally cannot run a CQL query whose WHERE clause doesn't specify all fields from the primary key (unless we use a Cassandra secondary index or override the warnings using ALLOW FILTERING, see here). With Elassandra, however, this is easily overcome through the Elasticsearch integration, whether by using the Elasticsearch REST API or by using CQL with Elasticsearch queries. If we know we will use Elassandra, does that change how we should choose our primary key?
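To make the second option concrete, here is a minimal sketch of what I mean by "CQL with Elasticsearch queries". As far as I understand the Elassandra documentation, an Elasticsearch query body can be passed as a JSON string through the es_query column in a CQL WHERE clause, provided the table has an Elasticsearch index mapping. The keyspace, table, and field names below (shop, orders, customer_email) are made up for illustration, and the helper only builds the statement text without connecting to a cluster.

```python
import json


def es_query_cql(keyspace, table, field, value, limit=10):
    """Build a CQL SELECT that filters on a non-key field via es_query.

    Assumption: the target table is indexed by Elasticsearch in Elassandra,
    so the es_query column is available in the WHERE clause.
    """
    # Elasticsearch term query on a field that is NOT part of the primary key.
    body = json.dumps({"query": {"term": {field: value}}}, separators=(",", ":"))
    # CQL string literals escape a single quote by doubling it.
    escaped = body.replace("'", "''")
    return f"SELECT * FROM {keyspace}.{table} WHERE es_query='{escaped}' LIMIT {limit}"


# Hypothetical table and field names, for illustration only.
cql = es_query_cql("shop", "orders", "customer_email", "a@example.com")
print(cql)
```

A statement like this does not restrict the partition key at all, which is exactly why I am wondering whether the usual primary key and partitioning rules still carry the same weight.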
Asked by RyanQuey
(153 rep)
Jun 24, 2020, 01:21 PM
Last activity: Jun 24, 2020, 01:51 PM