Sharding AWS MySQL data

0 votes

1 answer

214 views

I am working on a SaaS application using AWS MySQL. The application is for multiple organizations. We use the same database and tables to store multiple org data. We store org_id in most of the tables and use this in select, update and delete queries as where org_id = ? . ### DB Model * **Organization**

org_id

* **User**

user_id
  org_id

* **client**

client_id
  org_id

* **client_contact**

client_id
  phone_number
  phone_type

* **project**

project_id
  org_id

* **user_project**

project_id
  user_id

The data is growing rapidly and we need to shard the data. We have customers from all over the world. If I use a different sharding id for different table, then a join might happen across the nodes. For example, for user, sharding can be on shard_id and for client, sharding can be on client_id and for project, sharding can be on project_id. Here, the problem would be joining project and user. The user_id and project_id might be on different nodes and join will happen across nodes. What is the best approach here? I am thinking of sharding based on org_id as I store org_id in most of the tables. I see two problems here: 1. A few child tables don't store org_id as the parent table is storing. Do I need to store the org_id in all the tables? 2. Some orgs might have more data and load which might lead to a hot spot and more storage on a particular node. Is it possible scale a particular node alone with AWS RDS? Please suggest the best approach. **Note** Each org might have up to 500,000 records in any table with around 10 columns. We could reach more than 5,000 organizations in 6 months. 500,000 multiplied by 5,000 orgs will reach 2.5 billion records.

Asked by user1578872 (79 rep)

May 16, 2021, 03:37 PM
Last activity: Jun 21, 2025, 05:04 AM

Sharding AWS MySQL data

Related Questions