MySQL solution for many to many associative table that scale over billion entries
0
votes
5
answers
239
views
### Scenario
Imagine we have a
table
user and an item
user.
These 2 tables have an associative table called user_item
to define a many to many
relationship.
* We start 100 item
records
* We have 500 Millions user
records.
* Therefore we must generate 50_000_000_000 user_item
(50 billions)
* We could potentially have even more
* Won't be easy to shard nor partition because then, it will slow down any other operation (otherwise we need to scan everything)
* Assume as query pattern (INSERT
, SELECT
, UPDATE
) basic/typical m2m patterns (that could be found in any tutorial or example
### Question
What's the best design or known solution for handling billions of Many to Many relationship in a database regardless of a schema?
### Schema
Imagine this simple schema
CREATE DATABASE IF NOT EXISTS playground
CHARACTER SET = latin1;
USE playground;
CREATE TABLE IF NOT EXISTS user
(
id
BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
name
VARCHAR(255) NOT NULL,
PRIMARY KEY (id
),
INDEX user__name_fk
(name
)
) ENGINE = InnoDB
DEFAULT CHARSET = latin1
ROW_FORMAT = DYNAMIC;
CREATE TABLE IF NOT EXISTS item
(
id
BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
name
VARCHAR(255) NOT NULL,
PRIMARY KEY (id
),
INDEX user__name
(name
)
) ENGINE = InnoDB
DEFAULT CHARSET = latin1
ROW_FORMAT = DYNAMIC;
CREATE TABLE IF NOT EXISTS user_item
(
user_id
BIGINT UNSIGNED NOT NULL,
item_id
BIGINT UNSIGNED NOT NULL,
PRIMARY KEY (user_id
, item_id
),
INDEX user_item__item
(item_id
),
FOREIGN KEY user_id_fk
(user_id
) REFERENCES user
(id
) ON DELETE CASCADE,
FOREIGN KEY item_id_fk
(item_id
) REFERENCES item
(id
) ON DELETE CASCADE
) ENGINE = InnoDB
DEFAULT CHARSET = latin1
ROW_FORMAT = DYNAMIC;
-- create some default items
INSERT INTO item
(name
) VALUES ('item_1'), ('item_2'), ('item_3'), ('item_4'), ('item_5'), ('item_6'), ('item_7'), ('item_8'), ('item_9'), ('item_10');
-- create some users
INSERT INTO user
(name
) VALUES ('user_1'), ('user_2'), ('user_3'), ('user_4'), ('user_5'), ('user_6'), ('user_7'), ('user_8'), ('user_9'), ('user_10');
INSERT INTO user_item
(user_id
, item_id
) VALUES (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7), (1, 8), (1, 9), (1, 10);
### More info
I'm not asking on how to use many to many relation ship in MySQL, i know that.
I'm asking what is the **more known** solution for a scaling issue, that is, when number of related records are **exponentially growing** to such big scale.
Also I intentionally didn't add any query pattern (INSERT
, SELECT
, UPDATE
) because **is irrelevant**. Assume the most/typical M2M pattern. I don't want to loose focus on the real question which is about **scaling** and huge amount of data.
There must be some trick or some known workaround right?
I'm also considering a NoSQL database so the answer could include anything non related to MySQL (or any SQL databases),
I feel like this should be a common issue that many big company will face and hence there should be a common (or few) solution.
The root cause of this issue is that, while MySQL is great to create relationship, it will grow **associative m2m table** exponentially.
The 500 Millions x 100 == 50 Billions is just an example.
But could theoretically happen.
#### Clarification
* I left query out in purpose because you can assume to most easy one.
* I'm sure if I gave few example, will start to pop optimization over the specific query, that's not the question
* I'm asking a very high level question, and if there is not a real known solution then a no with explaining why would suffice (assuming is correct)
Here an example of a simple many to many query..
SELECT user.*, item.* FROM user
LEFT JOIN user_item ON user.id = user_item.user_id
LEFT JOIN item ON item.id = user_item.item_id
WHERE user.name = 'user_1';
Similar but not same questions
* https://dba.stackexchange.com/questions/57284/best-design-for-a-manymany-relationship-issue
* https://dba.stackexchange.com/questions/128275/design-a-many-to-many-table-for-scale
* https://dba.stackexchange.com/questions/319353/many-many-to-many-relationships-databse-design-with-mysql
* https://dba.stackexchange.com/questions/331724/better-database-design-for-nested-relation-many-to-many-to-many
Asked by Federico Baù
(151 rep)
Jun 4, 2024, 01:49 PM
Last activity: Aug 1, 2024, 01:15 PM
Last activity: Aug 1, 2024, 01:15 PM