Sample Header Ad - 728x90

MySQL solution for many to many associative table that scale over billion entries

0 votes
5 answers
239 views
### Scenario Imagine we have a table user and an item user. These 2 tables have an associative table called user_item to define a many to manyrelationship. * We start 100 item records * We have 500 Millions user records. * Therefore we must generate 50_000_000_000 user_item (50 billions) * We could potentially have even more * Won't be easy to shard nor partition because then, it will slow down any other operation (otherwise we need to scan everything) * Assume as query pattern (INSERT, SELECT, UPDATE) basic/typical m2m patterns (that could be found in any tutorial or example ### Question What's the best design or known solution for handling billions of Many to Many relationship in a database regardless of a schema? ### Schema Imagine this simple schema
CREATE DATABASE IF NOT EXISTS playground CHARACTER SET = latin1;
USE playground;

CREATE TABLE IF NOT EXISTS user
(
    id   BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    name VARCHAR(255)    NOT NULL,
    PRIMARY KEY (id),
    INDEX user__name_fk (name)
) ENGINE = InnoDB
  DEFAULT CHARSET = latin1
  ROW_FORMAT = DYNAMIC;

CREATE TABLE IF NOT EXISTS item
(
    id   BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    name VARCHAR(255)    NOT NULL,
    PRIMARY KEY (id),
    INDEX user__name (name)
) ENGINE = InnoDB
  DEFAULT CHARSET = latin1
  ROW_FORMAT = DYNAMIC;

CREATE TABLE IF NOT EXISTS user_item
(
    user_id BIGINT UNSIGNED NOT NULL,
    item_id BIGINT UNSIGNED NOT NULL,
    PRIMARY KEY (user_id, item_id),
    INDEX user_item__item (item_id),

    FOREIGN KEY user_id_fk (user_id) REFERENCES user (id) ON DELETE CASCADE,
    FOREIGN KEY item_id_fk (item_id) REFERENCES item (id) ON DELETE CASCADE
) ENGINE = InnoDB
  DEFAULT CHARSET = latin1
  ROW_FORMAT = DYNAMIC;

-- create some default items
INSERT INTO item (name) VALUES ('item_1'), ('item_2'), ('item_3'), ('item_4'), ('item_5'), ('item_6'), ('item_7'), ('item_8'), ('item_9'), ('item_10');
-- create some users
INSERT INTO user (name) VALUES ('user_1'), ('user_2'), ('user_3'), ('user_4'), ('user_5'), ('user_6'), ('user_7'), ('user_8'), ('user_9'), ('user_10');
INSERT INTO user_item (user_id, item_id) VALUES (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7), (1, 8), (1, 9), (1, 10);
### More info I'm not asking on how to use many to many relation ship in MySQL, i know that. I'm asking what is the **more known** solution for a scaling issue, that is, when number of related records are **exponentially growing** to such big scale. Also I intentionally didn't add any query pattern (INSERT, SELECT, UPDATE) because **is irrelevant**. Assume the most/typical M2M pattern. I don't want to loose focus on the real question which is about **scaling** and huge amount of data. There must be some trick or some known workaround right? I'm also considering a NoSQL database so the answer could include anything non related to MySQL (or any SQL databases), I feel like this should be a common issue that many big company will face and hence there should be a common (or few) solution. The root cause of this issue is that, while MySQL is great to create relationship, it will grow **associative m2m table** exponentially. The 500 Millions x 100 == 50 Billions is just an example. But could theoretically happen. #### Clarification * I left query out in purpose because you can assume to most easy one. * I'm sure if I gave few example, will start to pop optimization over the specific query, that's not the question * I'm asking a very high level question, and if there is not a real known solution then a no with explaining why would suffice (assuming is correct) Here an example of a simple many to many query..
SELECT user.*, item.* FROM user
LEFT JOIN user_item ON user.id = user_item.user_id
LEFT JOIN item ON item.id = user_item.item_id
WHERE user.name = 'user_1';
Similar but not same questions * https://dba.stackexchange.com/questions/57284/best-design-for-a-manymany-relationship-issue * https://dba.stackexchange.com/questions/128275/design-a-many-to-many-table-for-scale * https://dba.stackexchange.com/questions/319353/many-many-to-many-relationships-databse-design-with-mysql * https://dba.stackexchange.com/questions/331724/better-database-design-for-nested-relation-many-to-many-to-many
Asked by Federico Baù (151 rep)
Jun 4, 2024, 01:49 PM
Last activity: Aug 1, 2024, 01:15 PM