Sample Header Ad - 728x90

Does any MariaDB engine have the option to declaratively or configuratively abstract highly-redundant data?

0 votes
1 answer
94 views
I have an application which is architected in a "NoSQL" style around one fully denormalized "main table" which currently just holds a primary key and one JSON-valued column. For reasons which are outside the scope of this question, I want to retain this architecture: I do NOT want to go full relational and create a proper normalized data model with entities, foreign keys, etc. I would be willing to parse out the JSON fields into their own columns, if that helps with the question below. The data has a very high level of redundancy, e.g. some long-text column values such as names, addresses, and descriptions may be repeated 20 or 30+ times across different rows. **Is there a MariaDB engine which can declaratively or configuratively deal with this type of redundancy by internally de-duplicating the values across rows?** I imagine different ways this might be implemented in MariaDB would involve an extension to the BLOB or TEXT data types to leverage a content-based hash along with reference-counting or garbage collection. What is meant by declaratively: it can be accomplished via a single "ALTER TABLE" statement. What is meant by configuratively: it can be accomplished by modifying one or more system variables. I have read about ColumnStore, TokuDB, Mroonga, Parquet, and MyRocks but I'm not quite sure this is what they do, as the documentation is very sparse. This question is specifically about a storage engine to help internally optimize the storage of redundant data. Please refrain from telling me to rearchitect, redesign, or refactor the application.
Asked by Alex R (207 rep)
Mar 2, 2019, 08:24 PM
Last activity: Mar 9, 2019, 03:46 AM