Sample Header Ad - 728x90

Sharding AWS MySQL data

0 votes
1 answer
214 views
I am working on a SaaS application using AWS MySQL. The application is for multiple organizations. We use the same database and tables to store multiple org data. We store org_id in most of the tables and use this in select, update and delete queries as where org_id = ? . ### DB Model * **Organization**
org_id
* **User**
user_id
  org_id
* **client**
client_id
  org_id
* **client_contact**
client_id
  phone_number
  phone_type
* **project**
project_id
  org_id
* **user_project**
project_id
  user_id
The data is growing rapidly and we need to shard the data. We have customers from all over the world. If I use a different sharding id for different table, then a join might happen across the nodes. For example, for user, sharding can be on shard_id and for client, sharding can be on client_id and for project, sharding can be on project_id. Here, the problem would be joining project and user. The user_id and project_id might be on different nodes and join will happen across nodes. What is the best approach here? I am thinking of sharding based on org_id as I store org_id in most of the tables. I see two problems here: 1. A few child tables don't store org_id as the parent table is storing. Do I need to store the org_id in all the tables? 2. Some orgs might have more data and load which might lead to a hot spot and more storage on a particular node. Is it possible scale a particular node alone with AWS RDS? Please suggest the best approach. **Note** Each org might have up to 500,000 records in any table with around 10 columns. We could reach more than 5,000 organizations in 6 months. 500,000 multiplied by 5,000 orgs will reach 2.5 billion records.
Asked by user1578872 (79 rep)
May 16, 2021, 03:37 PM
Last activity: Jun 21, 2025, 05:04 AM