
Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

2 votes
1 answers
1417 views
How to properly use moveChunk when chunks of a certain range need to be moved?
A MongoDB 3.6.3 database has two shards, and a mongodump file with partial data of one collection needs to be restored. The first shard is named "fast" and the second "slow". The idea is to restore the dump to the "slow" shard. According to the sharding rules the data should go to the "slow" shard, but it actually goes to the wrong one when the restore is attempted. Before restoring the data, I want to manually move a range of chunks from the fast shard to the slow shard, but I am unable to issue the command properly. All examples I found show moving only one exact chunk. _id is used as the shard key.
Try 1:
use admin
db.runCommand({ moveChunk: "db.use", bounds : [ {_id : ObjectId("58b60e73e5d4e7019aa2be17")}, {_id : ObjectId("58bca60f5067031c77b03807")} ], to: "rs1" })
This is the response:
{ "ok" : 0, "errmsg" : "no chunk found with the shard key bounds [{ _id: ObjectId('58b60e73e5d4e7019aa2be17') }, { _id: ObjectId('58bca60f5067031c77b03807') })", "$clusterTime" : { "clusterTime" : Timestamp(1523459407, 14673), "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) } }, "operationTime" : Timestamp(1523459407, 14673) }
Try 2:
sh.moveChunk("db.use", {_id:{$gt: ObjectId("58b60e73e5d4e7019aa2be17"), $lt: ObjectId("58bca60f5067031c77b03807") }},"rs1")
Response:
{ "ok" : 0, "errmsg" : "no shard key found in chunk query { _id: { $gt: ObjectId('58b60e73e5d4e7019aa2be17'), $lt: ObjectId('58bca60f5067031c77b03807') } }", "$clusterTime" : { "clusterTime" : Timestamp(1523460271, 11742), "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) } }, "operationTime" : Timestamp(1523460271, 11742) }
Any idea how to move chunks that belong to a certain shard key range?
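One way to approach this (a minimal sketch, assuming the namespace "db.use", the destination shard "rs1" and the _id range from the question): moveChunk's bounds must match the exact boundaries of an existing chunk, so the chunks that fall inside the desired range are looked up in config.chunks and moved one at a time:
// sketch only: move every chunk whose boundaries lie inside the desired _id range,
// reusing the exact min/max values stored in config.chunks
var lower = ObjectId("58b60e73e5d4e7019aa2be17");
var upper = ObjectId("58bca60f5067031c77b03807");
db.getSiblingDB("config").chunks
  .find({ ns: "db.use", "min._id": { $gte: lower }, "max._id": { $lte: upper } })
  .forEach(function (chunk) {
      var res = db.getSiblingDB("admin").runCommand({
          moveChunk: "db.use",
          bounds: [ chunk.min, chunk.max ],   // exact boundaries of this existing chunk
          to: "rs1"
      });
      printjson(res);
  });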
ssasa (261 rep)
Apr 11, 2018, 03:30 PM • Last activity: Aug 1, 2025, 10:06 AM
1 votes
1 answers
389 views
Sharded Mongodb stalls randomly
I have set up a sharded MongoDB cluster using hashed sharding in Kubernetes. I first created the config server replica set and then created 2 shard replica sets. Finally I created mongos to connect to the sharded cluster. I followed the link below to set up the sharded MongoDB cluster: https://docs.mongodb.com/manual/tutorial/deploy-sharded-cluster-hashed-sharding/ After creating mongos, I enabled sharding for the database and sharded the collection using the hashed sharding strategy. After all this setup, I'm able to connect to mongos, have added some data to some of the collections in the database, and can check the distribution of data across the different shards. The issue that I'm facing is that when accessing MongoDB from my Java Spring Boot project, the connection stalls randomly. But once the connection is established for a particular query, that particular query won't stall for the next few tries. After some idle time, if I try to make a request to MongoDB again, it starts to stall again. Note: MongoDB is hosted on "DS2 v2" VMs and this cluster has 4 nodes: 1 for the config server, 2 for the shards and 1 for mongos. In one of the links, they had asked to set a proper shard key for all the collections, as this has an impact on MongoDB's performance. There were a couple of things to consider before selecting the right shard key, and I had considered all those factors before selecting the shard key. I read through this link to select the shard key: https://www.mongodb.com/blog/post/on-selecting-a-shard-key-for-mongodb One of the other solutions that I came across was to set ShardingTaskExecutorPoolMaxConnecting to limit the rate at which mongos nodes add connections to connection pools. I tried setting it to 20, 5, 100 and 150, and none of these resolved the stalling issue that I'm facing. This is the link: https://jira.mongodb.org/browse/SERVER-29237 I tried tweaking other parameters like ShardingTaskExecutorPoolMinSize and taskExecutorPoolSize. Even this did not resolve the stalling issue. I also set --serviceExecutor to adaptive and increased wiredTigerCacheSizeGB from 0.25 to 2. This also didn't make any difference to the stalling issue. 1) YAML file of service and Deployment for the config server of MongoDB:
apiVersion: v1
items:
- apiVersion: v1
  kind: Service
  metadata:
    annotations:
      kompose.cmd: kompose convert -d -f docker-compose.yml -o azure-deployment.yaml
      kompose.version: 1.12.0 (0ab07be)
    creationTimestamp: null
    labels:
      io.kompose.service: mongo-conf-service
    name: mongo-conf-service
  spec:
    type: LoadBalancer
    ports:
    - name: "27017"
      port: 27017
      targetPort: 27017
    selector:
      io.kompose.service: mongo-conf-service
  status:
    loadBalancer: {}
- apiVersion: extensions/v1beta1
  kind: Deployment
  metadata:
    annotations:
      kompose.cmd: kompose convert -d -f docker-compose.yml -o azure-deployment.yaml
      kompose.version: 1.12.0 (0ab07be)
    creationTimestamp: null
    labels:
      io.kompose.service: mongo-conf-service
    name: mongo-conf-service
  spec:
    replicas: 1
    strategy: {}
    template:
      metadata:
        creationTimestamp: null
        labels:
          io.kompose.service: mongo-conf-service
      spec:
        containers:
        - env:
          - name: MONGO_INITDB_ROOT_USERNAME
            value: #Username
          - name: MONGO_INITDB_ROOT_PASSWORD
            value: #Password
          command:
          - "mongod"
          - "--storageEngine"
          - "wiredTiger"
          - "--port"
          - "27017"
          - "--bind_ip"
          - "0.0.0.0"
          - "--wiredTigerCacheSizeGB"
          - "2"
          - "--configsvr"
          - "--replSet"
          - "ConfigDBRepSet"
          image: #MongoImageName
          name: mongo-conf-service
          ports:
          - containerPort: 27017
          resources: {}
          volumeMounts:
          - name: mongo-conf
            mountPath: /data/db
        restartPolicy: Always
        volumes:
          - name: mongo-conf
            persistentVolumeClaim:
              claimName: mongo-conf
2) YAML file of service and Deployment for Shard mongodb is -
apiVersion: v1
items:
- apiVersion: v1
  kind: Service
  metadata:
    annotations:
      kompose.cmd: kompose convert -d -f docker-compose.yml -o azure-deployment.yaml
      kompose.version: 1.12.0 (0ab07be)
    creationTimestamp: null
    labels:
      io.kompose.service: mongo-shard
    name: mongo-shard
  spec:
    type: LoadBalancer
    ports:
    - name: "27017"
      port: 27017
      targetPort: 27017
    selector:
      io.kompose.service: mongo-shard
  status:
    loadBalancer: {}
- apiVersion: extensions/v1beta1
  kind: Deployment
  metadata:
    annotations:
      kompose.cmd: kompose convert -d -f docker-compose.yml -o azure-deployment.yaml
      kompose.version: 1.12.0 (0ab07be)
    creationTimestamp: null
    labels:
      io.kompose.service: mongo-shard
    name: mongo-shard
  spec:
    replicas: 1
    strategy: {}
    template:
      metadata:
        creationTimestamp: null
        labels:
          io.kompose.service: mongo-shard
      spec:
        containers:
        - env:
          - name: MONGO_INITDB_ROOT_USERNAME
            value: #Username
          - name: MONGO_INITDB_ROOT_PASSWORD
            value: #Password
          command:
          - "mongod"
          - "--storageEngine"
          - "wiredTiger"
          - "--port"
          - "27017"
          - "--bind_ip"
          - "0.0.0.0"
          - "--wiredTigerCacheSizeGB"
          - "2"
          - "--shardsvr"
          - "--replSet"
          - "Shard1RepSet"
          image: #MongoImage
          name: mongo-shard
          ports:
          - containerPort: 27017
          resources: {}
3) YAML File of mongos server:
apiVersion: v1
items:
- apiVersion: v1
  kind: Service
  metadata:
    annotations:
      kompose.cmd: kompose convert -d -f docker-compose.yml -o azure-deployment.yaml
      kompose.version: 1.12.0 (0ab07be)
    creationTimestamp: null
    labels:
      io.kompose.service: mongos-service
    name: mongos-service
  spec:
    type: LoadBalancer
    ports:
    - name: "27017"
      port: 27017
      targetPort: 27017
    selector:
      io.kompose.service: mongos-service
  status:
    loadBalancer: {}
- apiVersion: extensions/v1beta1
  kind: Deployment
  metadata:
    annotations:
      kompose.cmd: kompose convert -d -f docker-compose.yml -o azure-deployment.yaml
      kompose.version: 1.12.0 (0ab07be)
    creationTimestamp: null
    labels:
      io.kompose.service: mongos-service
    name: mongos-service
  spec:
    replicas: 1
    strategy: {}
    template:
      metadata:
        creationTimestamp: null
        labels:
          io.kompose.service: mongos-service
      spec:
        containers:
        - env:
          - name: MONGO_INITDB_ROOT_USERNAME
            value: #USername
          - name: MONGO_INITDB_ROOT_PASSWORD
            value: #Password
          command:
            - "numactl"
            - "--interleave=all"
            - "mongos"
            - "--port"
            - "27017"
            - "--bind_ip"
            - "0.0.0.0"
            - "--configdb"
            - "ConfigDBRepSet/mongo-conf-service:27017"
          image: #MongoImageName
          name: mongos-service
          ports:
          - containerPort: 27017
          resources: {}
The logs of mongos server is :
2019-08-05T05:27:52.942+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:5058 #308807 (79 connections now open)
2019-08-05T05:27:52.964+0000 I ACCESS   [conn308807] Successfully authenticated as principal Assist_Random_Workspace on Random_Workspace from client 10.0.0.0:5058
2019-08-05T05:27:54.267+0000 I NETWORK  [worker-3] end connection 10.0.0.0:52954 (78 connections now open)
2019-08-05T05:27:54.269+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:52988 #308808 (79 connections now open)
2019-08-05T05:27:54.275+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:7174 #308809 (80 connections now open)
2019-08-05T05:27:54.279+0000 I ACCESS   [conn308809] SASL SCRAM-SHA-1 authentication failed for Assist_Refactored_Code_DB on Refactored_Code_DB from client 10.0.0.:7174 ; UserNotFound: User "Assist_Refactored_Code_DB@Refactored_Code_DB" not found
2019-08-05T05:27:54.281+0000 I NETWORK  [worker-1] end connection 10.0.0.5:7174 (79 connections now open)
2019-08-05T05:27:54.342+0000 I NETWORK  [worker-1] end connection 10.0.0.6:57391 (78 connections now open)
2019-08-05T05:27:54.343+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:57527 #308810 (79 connections now open)
2019-08-05T05:27:55.080+0000 I NETWORK  [worker-3] end connection 10.0.0.0:56021 (78 connections now open)
2019-08-05T05:27:55.081+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:56057 #308811 (79 connections now open)
2019-08-05T05:27:56.054+0000 I NETWORK  [worker-1] end connection 10.0.0.0:59137 (78 connections now open)
2019-08-05T05:27:56.055+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:59184 #308812 (79 connections now open)
2019-08-05T05:27:59.268+0000 I NETWORK  [worker-1] end connection 10.0.0.5:52988 (78 connections now open)
2019-08-05T05:27:59.270+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:53047 #308813 (79 connections now open)
2019-08-05T05:27:59.343+0000 I NETWORK  [worker-3] end connection 10.0.0.6:57527 (78 connections now open)
2019-08-05T05:27:59.344+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:57672 #308814 (79 connections now open)
2019-08-05T05:28:00.080+0000 I NETWORK  [worker-3] end connection 10.0.1.1:56057 (78 connections now open)
2019-08-05T05:28:00.081+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:56116 #308815 (79 connections now open)
2019-08-05T05:28:01.054+0000 I NETWORK  [worker-3] end connection 10.0.0.0:59184 (78 connections now open)
2019-08-05T05:28:01.058+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:59225 #308816 (79 connections now open)
2019-08-05T05:28:01.763+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:7173 #308817 (80 connections now open)
2019-08-05T05:28:01.768+0000 I ACCESS   [conn308817] SASL SCRAM-SHA-1 authentication failed for Assist_Sharded_Database on Sharded_Database from client 10.0.0.0:7173 ; UserNotFound: User "Assist_Sharded_Database@Sharded_Database" not found
2019-08-05T05:28:01.770+0000 I NETWORK  [worker-3] end connection 10.0.0.0:7173 (79 connections now open)
2019-08-05T05:28:04.271+0000 I NETWORK  [worker-3] end connection 10.0.0.0:53047 (78 connections now open)
2019-08-05T05:28:04.272+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:53083 #308818 (79 connections now open)
2019-08-05T05:28:04.283+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:7105 #308819 (80 connections now open)
2019-08-05T05:28:04.287+0000 I ACCESS   [conn308819] SASL SCRAM-SHA-1 authentication failed for Assist_Refactored_Code_DB on Refactored_Code_DB from client 10.0.0.0:7105 ; UserNotFound: User "Assist_Refactored_Code_DB@Refactored_Code_DB" not found
Java code to connect to MongoDB follows. Note: the code below supports multitenancy of MongoDB at the database level; based on one of the parameters in every request, we determine which database to query. The code below works fine against a standalone MongoDB instance. 1) Application properties
mongodb.uri=${mongoURI:mongodb://username:password@IPaddress:portNumber}
mongodb.defaultDatabaseName=assist
assist-server-address1 = IpAddress
2) Spring Boot Application
@SpringBootApplication
@ServletComponentScan
public class ServiceApplication extends RepositoryRestConfigurerAdapter {
    public static void main(String[] args) {
        SpringApplication.run(ServiceApplication.class, args);
    }

    @Autowired
    public MongoDBCredentials mongoDBCredentials;

    @Bean
    public MongoTemplate mongoTemplate() {
        return new MongoTenantTemplate(
                new SimpleMongoDbFactory(new MongoClient(new MongoClientURI(mongoDBCredentials.getUri())),
                        mongoDBCredentials.getDefaultDatabaseName()));
    }

}
3) MongoTemplate -> This is used to establish the Mongo connection. We have implemented a multitenant MongoTemplate (to connect to multiple databases based on one of the parameters of the request).
public class MongoTenantTemplate extends MongoTemplate {
	private static Map<String, MongoTemplate> tenantTemplates = new HashMap<>();

	@Value("${assist-server-address1}")
	public String ServerAddress1;

	@Value("${spring.data.mongodb.username}")
	public String ServerUsername;

	@Value("${spring.data.mongodb.password}")
	public String ServerPassword;

	@Value("${spring.data.mongodb.database}")
	public String ServerDbName;

	@Value("${assist-current-environment}")
	public String currentEnv;

	@Autowired
	public MongoDBCredentials mongoDBCredentials;

	@Autowired
	WorkspacesRepository workspaceRepository;

	private static final Logger LOG = LoggerFactory.getLogger(MongoTenantTemplate.class);
	Marker marker;

	public MongoTenantTemplate(MongoDbFactory mongoDbFactory) {
		super(mongoDbFactory);
		tenantTemplates.put(mongoDbFactory.getDb().getName(), new MongoTemplate(mongoDbFactory));
	}

	protected MongoTemplate getTenantMongoTemplate(String tenant) {

		MongoTemplate mongoTemplate = tenantTemplates.computeIfAbsent(tenant, k -> null);
		LOG.info(marker, "Tenant is (MongoDBCredentials) : {}",tenant);
		try {
			if (mongoTemplate == null) {
				MongoCredential mongoCredential;
				// Username,databaseName,password
				if (tenant.equals(ServerDbName)) { // compare string contents, not references
					mongoCredential = MongoCredential.createCredential(ServerUsername, ServerDbName,
							ServerPassword.toCharArray());
				} else {
					Workspaces workspace = workspaceRepository.findByDbName(tenant);
					mongoCredential = MongoCredential.createCredential(workspace.getDbUserName(), workspace.getDbName(),
							workspace.getDbPassword().toCharArray());
				}
				ServerAddress address1 = new ServerAddress(ServerAddress1, port);

				List<ServerAddress> serverAddressList = new ArrayList<>();
				serverAddressList.add(address1);

				SimpleMongoDbFactory mongoDbFactory = new SimpleMongoDbFactory(
						new MongoClient(serverAddressList, Arrays.asList(mongoCredential)), tenant);

				mongoTemplate = new MongoTemplate(mongoDbFactory);
			}
			else {
			}
		} catch (Exception e) {
			tenantTemplates.remove(tenant);
		}
		return mongoTemplate;
	}
...
In the above logs, there is an authentication error for Assist_Refactored_Code_DB (this database was not created by me). I'm not sure why this authentication is failing, or in which Mongo URI the username and password should be specified. I'm also not sure whether this is one of the reasons for the stalling or not. These are the only error logs that I could find in mongos; all other logs in the config server and the shard mongods don't show any errors. I expect the sharded MongoDB cluster not to stall at any point in time and to behave like a standalone MongoDB instance. Can anyone guide me to resolve this stalling issue with sharded MongoDB?
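As a side note on the UserNotFound entries in the log above, a quick way to see where (or whether) that user actually exists is to check the database named in the credential, since that is the database authentication runs against unless authSource says otherwise. A minimal sketch in the mongo shell, reusing the names from the log:
use Refactored_Code_DB
db.getUsers()   // should list Assist_Refactored_Code_DB if the user was created in this database
db.getSiblingDB("admin").system.users.find({ user: "Assist_Refactored_Code_DB" }, { user: 1, db: 1 })   // shows which database, if any, the user was actually created in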
Prajwal M (11 rep)
Aug 6, 2019, 11:14 AM • Last activity: Jul 31, 2025, 08:04 AM
0 votes
1 answers
145 views
How to check mongodb time series collection shard distribution
[direct: mongos] test> db.weather.getShardDistribution()
MongoshInvalidInputError: [SHAPI-10001] Collection weather is not sharded
We are getting this error; how can we check the data distribution of a sharded time series collection? Please share the details, with query output, if anyone has already worked with sharded time series collections. Below is the command I used to create the sharded time series collection:
sh.shardCollection( "test.weather", { "uid": 1 }, { timeseries: { timeField: "et", metaField: "uid", granularity: "hours" } })
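One detail that may help here (a sketch, using the namespaces from the question): a sharded time series collection keeps its documents in the underlying system.buckets collection, and the shell helpers report sharding information against that namespace rather than against the weather view:
sh.status()   // the sharded namespace appears as test.system.buckets.weather
db.getSiblingDB("test").getCollection("system.buckets.weather").getShardDistribution()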
santhosha MS
Jan 16, 2024, 08:39 AM • Last activity: Jul 20, 2025, 07:00 PM
0 votes
1 answers
150 views
Mysql Sharding. Application Layer vs Mysql Layer
I am going to create shards of a MySQL table and I am stuck on a dilemma: whether to go for application-layer sharding or let MySQL handle it. I mean, whether I should figure out which shard to go to and query at the application layer, or pass the query on to the MySQL driver and let it decide. I read about some pros and cons of both here, but I am unable to come to a conclusion. My personal opinion is to go for the MySQL driver handling it. Open for discussion on this. Please share your views/experiences.
Sankalp Mishra (101 rep)
Nov 5, 2019, 03:20 AM • Last activity: Jul 16, 2025, 11:03 AM
0 votes
1 answers
159 views
Mongo zoned sharding with a single value
We have a database with around 50 collections and 70 million records. All records belong to specific customers and have a customerId property. Currently all clients are in the USA. However, we're in the process of adding EU clients and would like to host their data in an EU datacenter. Since all records have a customerId (which is a string) and virtually all of our queries also specify this customerId, it makes a good choice of shard key for us - we're only going to have one shard per zone, and we'd like all of a customer's data to be in that one shard. My question is, given that the customerIds are strings, how do we specify a minimum and maximum for the sh.updateZoneKeyRange() function? Obviously the minimum will be the customerId (e.g., "customer-somename"), but how do we specify a maximum of "customer-somename" + 1? The issue here is that the maximum is exclusive, so it can't be the same as the minimum.
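One workaround that is often used for single-value zones (a sketch, not the only option; it assumes the shard key can be made compound, e.g. { customerId: 1, _id: 1 }, and the namespace and zone name below are hypothetical): pad the customerId with MinKey/MaxKey on the second field, so the inclusive minimum and exclusive maximum still cover exactly one customer:
sh.updateZoneKeyRange(
    "mydb.records",                                     // hypothetical namespace
    { customerId: "customer-somename", _id: MinKey },   // inclusive lower bound
    { customerId: "customer-somename", _id: MaxKey },   // exclusive upper bound
    "EU"
)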
Zack Newsham (135 rep)
Apr 16, 2019, 05:11 PM • Last activity: Jul 15, 2025, 06:07 AM
0 votes
1 answers
153 views
MySQL Sharding with org id
I have a multi-tenant application and each tenant will have an orgId, so all tables will have an org_id column. Each org will have clients, projects, ...
**Tables:** org, client, project
Now, I am trying to understand sharding here. I will have a metadata database where the org-to-DB mapping is present, and multiple DBs, each hosting the data of a few orgs. My question is, say:
DB A --> the metadata DB
DB B --> hosting org data for orgs 1 & 2
DB C --> hosting org data for orgs 3 & 4
DB D --> hosting org data for orgs 4 & 5 & 6
**DB A**, the metadata DB, will have a table for the org-to-DB mapping: **Table:** org_db, **Columns:** org_id, db.
In DB B, we have a project table whose id starts from 1, and this table has org_id as well. In DB C, we have a project table whose id also starts from 1, with org_id as well. Now, I feel that the org 3 & 4 data is small, so DB C can be merged with DB B. If that is the case, how can we merge the project ids, given that the project id starts from 1 in both DBs?
user1578872 (79 rep)
Jun 11, 2021, 04:33 PM • Last activity: Jul 11, 2025, 04:10 PM
1 votes
1 answers
1997 views
Mongodb: Determining shard key strategy on compound index
I have a collection with 170 million+ documents and it is only going to increase. The size of the collection is not that huge, currently around 70 GB. The collection has two fields **indexed on: {AgentId:1, PropertyId:1}**. Generally one imports a huge file (millions of documents) belonging to a particular **AgentId, but the PropertyId (non-numeric, nullable) is mostly a random unique value.** Currently I have **two shards with a shard key based on {_id: hashed}**. But I am planning to **change the shard key to the compound index {AgentId:1, PropertyId:1}** because I think it will improve query performance (most of the queries filter on AgentId). I'm not sure whether one can have a nullable field in the shard key. If that is the case, the app will make sure that PropertyId is a random number. **So I'm looking to get a picture of:** 1. How will the data be distributed to the shards during insertion, and how are the chunk ranges calculated during insertion? 2. Since PropertyId is a random value, does the compound key fit the definition of a monotonically increasing value? I am a newbie to MongoDB and wanted to know if I am on the right path.
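For reference, a minimal sketch of the change being considered (the namespace is hypothetical, the key fields come from the question; note that every document must carry the full shard key, so a nullable PropertyId would need to be populated first):
sh.shardCollection("mydb.properties", { AgentId: 1, PropertyId: 1 })   // ranged compound shard key
sh.status()   // once data is inserted, shows the chunk ranges and which shard holds each of them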
Sameer (123 rep)
Dec 26, 2017, 05:41 PM • Last activity: Jul 8, 2025, 11:08 PM
1 votes
2 answers
591 views
MongoDB specifying shard key and sharding strategy
What is the syntax to state that a collection is to be sharded as location-based, while using the range-based sharding strategy? I know you have to use the sh.shardCollection() command, but what arguments do I use in it?
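A minimal sketch of the syntax (database, collection and field names are assumptions): listing the key fields with 1 or -1, rather than "hashed", gives the range-based strategy, and a location-bearing field can then also be used for zone ranges:
sh.enableSharding("geo")                                                          // enable sharding for the database
sh.shardCollection("geo.places", { "address.country": 1, "address.city": 1 })     // range-based shard key on location fields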
tom dinh (79 rep)
Mar 20, 2019, 09:38 PM • Last activity: Jul 8, 2025, 08:04 PM
0 votes
1 answers
200 views
Temporal service failed with "Persistent store operation failure", "Failed to update shard"
Hi, I'm using the Temporal service framework to orchestrate a workflow process, with a Cassandra database as the datastore recording the stateful workflow process. My Temporal service is getting an error when trying to update a shard, with the following details:
{
    "level": "error",
    "ts": "2024-02-02T02:54:57.744Z",
    "msg": "Persistent store operation failure",
    "service": "history",
    "shard-id": 32,
    "address": "10.59.68.74:7234",
    "shard-item": "0xc00277ea00",
    "store-operation": "update-shard",
    "error": "Failed to update shard.  previous_range_id: 30415, columns: (range_id=30414)",
    "shard-range-id": 30416,
    "previous-shard-range-id": 30415,
    "logging-call-at": "context_impl.go:732"
}
Do you have any experience on how to deal with this?
Ahmad Ricky Nazarrudin (1 rep)
Mar 26, 2024, 03:29 PM • Last activity: Jun 27, 2025, 02:04 PM
2 votes
1 answers
790 views
Chunks not splitting on MongoDB sharded clusters
I have created 2 shard servers named rs1 and rs2. Now in the database collection there are multiple records that should be split across these 2 servers in chunks. The shard key is TxnMonth. It is not unique. I have set the chunk size to 1 MB. When I inserted records corresponding to "TxnMonth" : 1, "TxnMonth" : 2 and "TxnMonth" : 3, the data was split onto the 2 shards: rs1 had 2 chunks and rs2 had 1 chunk. On checking, data for TxnMonth 1 and 2 was on rs1 and data for TxnMonth 3 was on rs2. Running sh.status() gave this output:
{ "_id" : "shardPOC", "primary" : "rs1", "partitioned" : true }
shardPOC.machines
shard key: { "TxnMonth" : 1 }
unique: false
balancing: true
chunks:
rs1 2
rs2 1
{ "TxnMonth" : { "$minKey" : 1 } } -->> { "TxnMonth" : 1 } on : rs1 Timestamp(2, 1)
{ "TxnMonth" : 1 } -->> { "TxnMonth" : 3 } on : rs1 Timestamp(1, 2)
{ "TxnMonth" : 3 } -->> { "TxnMonth" : { "$maxKey" : 1 } } on : rs2 Timestamp(2, 0)
{ "_id" : "NewTestDB", "primary" : "rs1", "partitioned" : false }
After this, I started inserting records for TxnMonth 4, 5, 6, 7 and 8, to check when the chunk on rs2 would split, but even when the chunk exceeded the 1 MB size, it didn't split. Below is the output of db.collection.getShardDistribution():
Shard rs1 at rs1/JCB-DB1:27012
data : 57KiB docs : 80 chunks : 2
estimated data per chunk : 28KiB
estimated docs per chunk : 40
Shard rs2 at rs2/JCB-DB1:27013
data : 1.1MiB docs : 1560 chunks : 1
estimated data per chunk : 1.1MiB
estimated docs per chunk : 1560
Totals
data : 1.15MiB docs : 1640 chunks : 3
Shard rs1 contains 4.87% data, 4.87% docs in cluster, avg obj size on shard : 740B
Shard rs2 contains 95.12% data, 95.12% docs in cluster, avg obj size on shard : 740B
I expected the chunk to split after it exceeded 1 MB, but it did not. Can someone explain the reason? Thanks in advance.
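A couple of checks that are often useful in this situation (a sketch; the collection name and key values are taken from the question). Splits are only attempted by the mongos that routes the inserts, and a chunk that has already grown past the limit can also be split manually:
db.getSiblingDB("config").settings.find({ _id: "chunksize" })   // confirm the configured chunk size (in MB)
sh.splitFind("shardPOC.machines", { TxnMonth: 5 })              // split the chunk containing TxnMonth 5 at its median
sh.status()                                                     // then re-check the chunk counts per shard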
Venus Garg (21 rep)
Jul 5, 2017, 11:02 AM • Last activity: Jun 24, 2025, 02:01 PM
0 votes
1 answers
202 views
mongos is not listening when configured for sharded architecture (version 3.4)
Dear all, I am trying to set up a sharded MongoDB server. I configured a replica set and a config server replica set. Both replica sets are composed of 3 servers:
root 23792 1 0 14:43 ? 00:01:55 mongod --replSet rs01 --journal --port 27102 --shardsvr --dbpath /mongo/rs01i01/ --logpath /mongo/rs01i01/mongo_102.log --fork
root 23820 1 0 14:44 ? 00:01:56 mongod --replSet rs01 --journal --port 27103 --shardsvr --dbpath /mongo/rs01i02/ --logpath /mongo/rs01i02/mongo_103.log --fork
root 23852 1 0 14:45 ? 00:01:21 mongod --replSet rs01 --journal --port 27101 --shardsvr --dbpath /mongo/rs01iar/ --logpath /mongo/rs01iar/mongo_101.log --fork
root 25240 1 0 16:18 ? 00:01:37 mongod --replSet conf --journal --port 27201 --dbpath /mongo/config --logpath /mongo/config/mongo_201.log --fork
root 25269 1 0 16:19 ? 00:01:37 mongod --replSet conf --journal --port 27202 --dbpath /mongo/config2 --logpath /mongo/config2/mongo_202.log --fork
root 25297 1 0 16:19 ? 00:01:38 mongod --replSet conf --journal --port 27203 --dbpath /mongo/config3 --logpath /mongo/config3/mongo_203.log --fork
However, in the last step, when I try to start mongos, the process is launched (3 times :O ) but is not listening on the corresponding port:
mongos --port 27301 --configdb conf/127.0.0.1:27201,127.0.0.1:27202,127.0.0.1:27203 --logpath /mongo/mongos/mongos_27301.log --fork
about to fork child process, waiting until server is ready for connections.
forked process: 9803
but mongo does not connect on that port:
mongo --port 27301
MongoDB shell version v3.4.10
connecting to: mongodb://127.0.0.1:27301/
2017-11-01T22:56:00.898+0000 W NETWORK [thread1] Failed to connect to 127.0.0.1:27301, in(checking socket for error after poll), reason: Connection refused
2017-11-01T22:56:00.899+0000 E QUERY [thread1] Error: couldn't connect to server 127.0.0.1:27301, connection attempt failed : connect@src/mongo/shell/mongo.js:237:13 @(connect):1:6 exception: connect failed
Please help
lemahdois (1 rep)
Nov 1, 2017, 10:58 PM • Last activity: Jun 24, 2025, 11:09 AM
0 votes
2 answers
189 views
MongoDB Shard Zone Range Overlapping
I am configuring a test mongo setup with 2 shards, and I'm trying to add zone ranges to each shard:
sh.updateZoneKeyRange('mydb.test', {id:MinKey,ts:MinKey}, {id:MaxKey,ts:1548787704000}, 'cold')
sh.updateZoneKeyRange('mydb.test', {id:MinKey,ts:1548787704000}, {id:MaxKey,ts:MaxKey}, 'hot')
The first command runs fine, but the second tells me:
Zone range: { id: MinKey, ts: 1548787704000.0 } -->> { id: MaxKey, ts: MaxKey } on hot is overlapping with existing: { id: MinKey, ts: MinKey } -->> { id: MaxKey, ts: 1548787704000.0 } on cold
I thought the maximum bounds were exclusive and minimum bounds inclusive?
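A sketch of why the two ranges collide, plus one alternative that assumes a different shard key than the one in the question: bounds on a compound key compare field by field, so { id: MinKey } .. { id: MaxKey } already spans every id regardless of ts, and any second range must overlap it. Splitting hot/cold by time only works when the time field leads the key and the bounds, e.g. with a { ts: 1, id: 1 } shard key:
// assumes a { ts: 1, id: 1 } shard key, which is not the key used in the question
sh.updateZoneKeyRange('mydb.test', { ts: MinKey, id: MinKey }, { ts: 1548787704000, id: MinKey }, 'cold')
sh.updateZoneKeyRange('mydb.test', { ts: 1548787704000, id: MinKey }, { ts: MaxKey, id: MaxKey }, 'hot')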
Thiatt (123 rep)
Jan 29, 2020, 07:54 PM • Last activity: Jun 23, 2025, 09:04 PM
0 votes
1 answers
214 views
Sharding AWS MySQL data
I am working on a SaaS application using AWS MySQL. The application serves multiple organizations. We use the same database and tables to store the data of multiple orgs. We store org_id in most of the tables and use it in select, update and delete queries as where org_id = ?. ### DB Model * **Organization**
org_id
* **User**
user_id
  org_id
* **client**
client_id
  org_id
* **client_contact**
client_id
  phone_number
  phone_type
* **project**
project_id
  org_id
* **user_project**
project_id
  user_id
The data is growing rapidly and we need to shard the data. We have customers from all over the world. If I use a different sharding id for different tables, then a join might happen across nodes. For example, for user, sharding could be on user_id; for client, sharding could be on client_id; and for project, sharding could be on project_id. Here, the problem would be joining project and user: the user_id and project_id might be on different nodes, and the join would happen across nodes. What is the best approach here? I am thinking of sharding based on org_id, as I store org_id in most of the tables. I see two problems here: 1. A few child tables don't store org_id the way the parent table does. Do I need to store org_id in all the tables? 2. Some orgs might have more data and load, which might lead to a hot spot and more storage on a particular node. Is it possible to scale a particular node alone with AWS RDS? Please suggest the best approach. **Note** Each org might have up to 500,000 records in any table, with around 10 columns. We could reach more than 5,000 organizations in 6 months. 500,000 multiplied by 5,000 orgs will reach 2.5 billion records.
user1578872 (79 rep)
May 16, 2021, 03:37 PM • Last activity: Jun 21, 2025, 05:04 AM
0 votes
1 answers
34 views
MongoDB refineShardKey unbalanced chunks
As a start: - Shard key is compounded to: group > user > document_name, was only for group - The cluster is composed of 2 shards - There is about a billion entries - Mongo 6.0 is being used - The database itself seems to work fine After initiating the admin command for [refineCollectionShardKey](https://www.mongodb.com/docs/manual/reference/command/refineCollectionShardKey/#mongodb-dbcommand-dbcmd.refineCollectionShardKey) , the balancer kept only increasing one shard which resulted in the following status: 'work.FILE': { shardKey: { 'user.group': 1, 'user.name': 1, document_name: 1 }, unique: false, balancing: true, chunkMetadata: [ { shard: 'RepSet1', nChunks: 26 }, { shard: 'RepSet2', nChunks: 4248 } ], With this: { 'Currently enabled': 'yes', 'Currently running': 'no', 'Failed balancer rounds in last 5 attempts': 0, 'Migration Results for the last 24 hours': { '13': "Failed with error 'aborted', from RepSet1 to RepSet2", '4211': 'Success' } } Also: Shard RepSet1 { data: '1179.56GiB', docs: Long("2369155472"), chunks: 26, 'estimated data per chunk': '45.36GiB', 'estimated docs per chunk': 91121364 } --- Shard RepSet2 { data: '1179.78GiB', docs: 2063496305, chunks: 4248, 'estimated data per chunk': '284.39MiB', 'estimated docs per chunk': 485757 } --- Totals { data: '2359.35GiB', docs: 4432651777, chunks: 4274, 'Shard RepSet1': [ '49.99 % data', '53.44 % docs in cluster', '534B avg obj size on shard' ], 'Shard RepSet2': [ '50 % data', '46.55 % docs in cluster', '613B avg obj size on shard' ] } Example of document: { "_id" : ObjectId("5a68e11aedd1713655361323"), "size" : 524288000, "file" : { "archive_time" : "2018-01-24T19:40:10.717646", "checksum" : null, "creation_time" : "2017-12-22T14:44:12.643444", "part" : ".part000000", "tags" : null, "size" : 524288000, "group" : "root", "filename" : "data/docker_compose/prod_test/data_set/file500", "owner" : "root", "inode" : 1310730 }, "document_name" : "file500", "folder" : "priv/s002/_1", "user" : { "group" : "priv", "rgrp" : "global_n", "name" : "s002" } } Looking into the logs 12 out of these 13 failed errors where from moveChunk: "msg":"Error while doing moveChunk","attr":{"error":"ChunkTooBig: Cannot move chunk: the maximum number of documents for a chunk is 497102, the maximum chunk size is 134217728, average document size is 540. Found 501324 documents in chunk The highest chunk document count is about 600k which is not so far from the threshold. The other error was: "msg":"Error while doing moveChunk","attr":{"error":"Interrupted: Failed to contact recipient shard to monitor data transfer :: caused by :: operation was interrupted"}} I have a plenty of storage space, enough cores too. Memory might be low: htop My questions: - Is there an issue? I believe the balancing is wrong considering the big difference between nChunks. If so why? Obviously this seems to be the issue: chunks: 26, 'estimated data per chunk': '45.36GiB', 'estimated docs per chunk': 91121364 - If the balancer blocked because of the chunks being too big, what option do I have? - How to confirm that the sharding has been properly done? - Suggestions? Thanks Edit: To further detail about monotonality, the only thing that could cause this would the config replica set: database: { _id: 'config', primary: 'config', partitioned: true }, collections: { 'config.system.sessions': { shardKey: { _id: 1 }, unique: false, balancing: true, chunkMetadata: [ { shard: 'RepSet1', nChunks: 512 }, { shard: 'RepSet2', nChunks: 512 } ],
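One point that may be relevant here, offered as a hedge rather than a diagnosis: refineCollectionShardKey does not re-split existing chunks, it only extends their bounds with MinKey for the new fields, so the 26 chunks on RepSet1 keep their old, very large ranges and the balancer refuses to move them (the ChunkTooBig errors above). A minimal sketch of splitting one of those oversized chunks so it becomes movable; the key values are taken from the example document and are only illustrative:
// splitFind locates the chunk containing a document matching the query and splits it at its median
sh.splitFind("work.FILE", { "user.group": "priv", "user.name": "s002" })
sh.status()   // nChunks on RepSet1 should grow, after which the balancer can migrate the smaller pieces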
Pobe (101 rep)
Jun 5, 2025, 12:34 PM • Last activity: Jun 7, 2025, 06:43 AM
1 votes
1 answers
636 views
Why findOne() hangs on a sharded collection?
A MongoDB v3.6.4 database with a large sharded collection and WiredTiger engine hangs when issuing a findOne() or find() command without any parameters. The collection has only the default index on the _id key. If findOne() or find() is invoked on any of the shards then it works immediately. If it is invoked on a mongos then it hangs. Why this query hangs? By looking with db.currentOp() MongoDB uses COLLSCAN. The funny thing is that this worked and then suddenly stopped yesterday. As a note, there is a sharding migration process which moves chunks, but that doesn't seem to influence as even when chunk balancing is stopped or when it is finished, still findOne() hangs (or doesn't hang but scans all the collection). Here is the output of db.currentOp() taken from one primary: { "host" : "db1:27017", "desc" : "conn277", "connectionId" : 277, "client" : "10.240.137.10:57294", "appName" : "MongoDB Shell", "clientMetadata" : { "application" : { "name" : "MongoDB Shell" }, "driver" : { "name" : "MongoDB Internal Client", "version" : "3.6.4" }, "os" : { "type" : "Linux", "name" : "PRETTY_NAME=\"Debian GNU/Linux 9 (stretch)\"", "architecture" : "x86_64", "version" : "Kernel 4.9.0-6-amd64" }, "mongos" : { "host" : "m1:27017", "client" : "10.240.0.0:38190", "version" : "3.6.4" } }, "active" : true, "currentOpTime" : "2018-04-28T08:13:26.371+0200", "opid" : 2148656, "secs_running" : NumberLong(4), "microsecs_running" : NumberLong(4344989), "op" : "query", "ns" : "somedb.somecol", "command" : { "find" : "somecol", "limit" : NumberLong(1), "shardVersion" : [ Timestamp(188696, 1), ObjectId("5ac6b2abbd8bbc9f42f34a39") ], "$clusterTime" : { "clusterTime" : Timestamp(1524896001, 1), "signature" : { "hash" : BinData(0,"RuV/v6Qm7H9AvPMVRNH0jkdIwRM="), "keyId" : NumberLong("6540622458888650772") } }, "$client" : { "application" : { "name" : "MongoDB Shell" }, "driver" : { "name" : "MongoDB Internal Client", "version" : "3.6.4" }, "os" : { "type" : "Linux", "name" : "PRETTY_NAME=\"Debian GNU/Linux 9 (stretch)\"", "architecture" : "x86_64", "version" : "Kernel 4.9.0-6-amd64" }, "mongos" : { "host" : "m1:27017", "client" : "10.240.0.0:38190", "version" : "3.6.4" } }, "$configServerState" : { "opTime" : { "ts" : Timestamp(1524896001, 1), "t" : NumberLong(3) } }, "$db" : "feedback" }, "planSummary" : "COLLSCAN", "numYields" : 803, "locks" : { "Global" : "r", "Database" : "r", "Collection" : "r" }, "waitingForLock" : false, "lockStats" : { "Global" : { "acquireCount" : { "r" : NumberLong(1608) } }, "Database" : { "acquireCount" : { "r" : NumberLong(804) } }, "Collection" : { "acquireCount" : { "r" : NumberLong(804) } } } },
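A small diagnostic sketch that may help narrow this down (the collection name is taken from the currentOp output above): compare the plan mongos picks with and without an _id hint, and check whether a chunk migration is in flight while the query runs:
db.somecol.find().limit(1).explain("executionStats")                    // plan chosen when going through mongos
db.somecol.find().hint({ _id: 1 }).limit(1).explain("executionStats")   // force the _id index instead of a COLLSCAN
sh.isBalancerRunning()                                                  // whether a migration is active right now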
ssasa (261 rep)
Apr 28, 2018, 06:49 AM • Last activity: Jun 5, 2025, 11:09 AM
0 votes
1 answers
231 views
MongoDB: slow removeShard
I have a MongoDB cluster with 9 nodes (3 shards, 3 nodes each). I'm now removing one shard, but the process itself is running extremely slowly. Each node of the shard being removed stores ~400 GB of data, which isn't too much, I suppose. But an approximation shows that the draining process will finish in 200+ days. I was wondering if there is a way to speed up this process. I have enough free resources (CPU, memory, IO), I mean 3x more than the nodes consume now. I've already looked at balancer settings like _secondaryThrottle or _waitForDelete without much success. MongoDB 4.4.13
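For reference, a sketch of the knobs the question mentions and of checking drain progress (the shard name is a placeholder, and relaxing these throttles trades safety margins during migration for speed, so treat it as illustrative only):
db.adminCommand({ removeShard: "shardToDrain" })   // re-running this reports "remaining.chunks", i.e. drain progress
db.getSiblingDB("config").settings.updateOne(
    { _id: "balancer" },
    { $set: { "_secondaryThrottle": false, "_waitForDelete": false } },
    { upsert: true }
)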
Pavel Sapezhko (183 rep)
Mar 21, 2022, 10:14 AM • Last activity: Jun 3, 2025, 03:01 PM
0 votes
1 answers
250 views
MongoDB Tag-Aware Sharding Workflow
I wonder about the workflow of a sharded cluster with tag-aware sharding.
--- Sharding Status ---
sharding version: { "_id" : 1, "version" : 3, "minCompatibleVersion" : 3, "currentVersion" : 4, "clusterId" : ObjectId("546df06d63a15917a8356f4e") }
shards:
{ "_id" : "shard0000", "host" : "hadoop4:27017", "tags" : [ "TR" ] }
{ "_id" : "shard0001", "host" : "hadoop5:27018", "tags" : [ "US" ] }
{ "_id" : "shard0002", "host" : "hadoop6:27017", "tags" : [ "OTHER" ] }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "inventory", "partitioned" : true, "primary" : "shard0000" }
With tag-aware sharding, when restoring with the mongorestore command via mongos, is all the data first written to the primary shard and then migrated as the appropriate chunks, or can it fill the appropriate chunk according to the tag range from the beginning of the insert? For example, when I insert the document {country:"US",keyword:"abc"}, does it go to shard0001 at first, or does it go to the primary database shard0000 and then the chunk migrates to shard0001?
Şeref Acet (121 rep)
Nov 20, 2014, 08:18 PM • Last activity: May 28, 2025, 11:01 AM
1 votes
1 answers
237 views
MongoDB as a log storage. Choosing shard key
I'm designing a log storage system based on MongoDB. I want to shard a log collection to increase ingestion and capacity (distribute writes to several machines) while allowing fast search. I should be able to increase ingestion by adding more nodes to the cluster. My collection has the following fields:
**Subsystem** - string, name of the application, e.g. "SystemA", "SystemB". ~100 unique values.
**Tenant** - string, the name of the deployment, used to separate logs from different application deployments / environments, e.g. "South TEST", "North DEV", "South PROD", "North PROD". ~20 unique values.
**Date** - timestamp.
**User** - string.
**SessionId** - guid, logically groups several related log records.
**Data** - BLOB, contains zipped data. Average size = 2Kb, maximum = 8Mb.
**Context** - array of key/value pairs. Both key and value are strings. Used to store additional metadata associated with an event.
The search could be performed by any combination of the fields Subsystem, Date, User and Context. Tenant will almost always be specified. The question is - **what shard key and sharding strategy would be better in this case?** My suggestions: the simplest option is to shard by Tenant, but it will cause highly uneven data distribution, because PROD environments generate much more log volume than DEV. "Tenant + Subsystem" seems better, but there are still subsystems that generate much more logs than other subsystems. Also, Subsystem is not mandatory - a user can omit it during search, and the query will be broadcast. "SessionId" will give even data distribution, but search requests will be broadcast to all nodes.
Philipp Bocharov (21 rep)
Jul 3, 2018, 12:13 PM • Last activity: May 27, 2025, 12:11 PM
1 votes
0 answers
24 views
Suggestion regarding Migrating and Sharding 1TB standalone PostgreSQL Cluster with Partitions and Multiple Databases, using Citus
We're currently in the **evaluation/planning phase** evaluating Citus (open-source) as a sharding solution (multi tenant) for our existing PostgreSQL database cluster. We anticipate significant data growth with a new application feature and need to scale beyond our current single-node setup. Current Setup: Standalone single node PostgreSQL (open source). Approximately 1 TB of data. pgpool-II is used for load balancing with a primary and one standby for basic high availability. The database structure involves multiple databases within the cluster. Each database contains multiple schemas. Within schemas, we have numerous tables, functions, and other SQL objects. Many tables currently utilize standard PostgreSQL declarative partitioning (both range and hash methods) for performance management. Goal: We aim to migrate to a Citus cluster and implement database sharding to handle the expected large volume of new data, while also effectively incorporating our existing 1 TB of historical data. We have reviewed some of the basic Citus documentation but have several key questions regarding the migration and ongoing management of our specific setup: 1. Handling Multiple Databases: Our current structure has data spread across multiple logical databases within the single PostgreSQL instance. How should this multi-database structure be handled when migrating to Citus? Is there a recommended approach for managing multiple logical databases within a Citus cluster from a migration and ongoing data distribution perspective? 2. Existing and Future Partitioning: We heavily use standard PostgreSQL range and hash partitioning on our tables. How are these existing partitioned tables and their data migrated into a Citus distributed table setup? Does Citus automatically handle the data from partitions? How is partitioning typically handled for new and ongoing data within a Citus distributed table? Can we still use time-based partitioning, for example, effectively within Citus shards? 3. Load Balancing and High Availability (pgpool/Replication): Can our existing pgpool-II setup be repurposed or used in conjunction with a Citus cluster for load balancing client connections? What are the recommended strategies for high availability and replication within a Citus cluster itself (for both coordinator and worker nodes), and load balancing? 4. Schema Distribution to Workers: When we distribute a table (e.g., using create_distributed_table), how does Citus handle the schema definition on the worker nodes? Does the full schema structure (including other non-distributed objects or schemas) get replicated to workers, or only the necessary table definitions for the shards? 5. Monitoring in a Distributed Environment: On our single node, we rely on standard PostgreSQL system views and functions like pg_stat_activity, pg_stat_statements, pg_control_checkpoint(), pg_buffercache, pg_stat_user_tables, pg_stat_bgwriter, etc., for monitoring and performance tuning. How do these tools work in a distributed Citus cluster, where data and activity are spread across multiple worker nodes? How do we get a comprehensive, cluster-wide view of performance and activity? We would appreciate guidance, insights, or experiences from the community on these points as we plan our migration to Citus. **Also, please advise if there is another sharding solution that can be tried for our current setup.**
Ramzy (11 rep)
May 16, 2025, 07:48 AM
1 votes
1 answers
550 views
Thousands foreign keys referenced to one table and other postgres scaling philosophy
I'm inspecting one legacy and quite big database, which is growing faster nowadays than before. The following query
select count(*) from information_schema.tables where table_schema = 'public'
returns a bit more than 120k. Despite this, most of them have same structure which is pretty simple, furthermore data in these tables are independent of each other. I reckon that database design extremely needs some modifications to be applied as soon as it possible, but would like to share my thoughts. There is one very regular table Users(id, name, ...) and many(~100k) others have foreign keys referenced to Users.id. The idea is very straightforward: each *user-story* has own tables to keep their data isolated, each table has its increment sequence and in general this looks like manual sharding. - As far as I know, Postgres have no limits for tables, but on the other hand - performance degradation is quite possible in this case? - pg_dump is a nightmare even now because needs lots of locks which makes OOM is quite possible as well. Moreover, Postgre have a limit for relations. I'm thinking about declarative partitioning, but under the hood server organizes storage in different files, so it will be same OS limits, but most important what will I win eventually? I do not need to SELECT data across thousand tables with same structure - all operations are logically separated to each other. In other words, everything boils down that I do not really need reorganize tables, but thousands foreign keys referenced to same column looks to me a huge issue. The one idea I have is simply get rid of these constraints and check id in a business logic or a trigger before each INSERT, but this is ugly in my opinion. What is best practices in such use cases? Please, share you thoughts. **Update 1** Thank you both for joining the discussion and sorry for the late reply due to unexpected personal circumstances :\ Give me one more try, please. Within our business logic we have a *Case* class where some telemetry is logging and discussion happens. I would say that this is sort of a chat(very specific) and an issue tracker. For each independent *Case* we have 3 tables in the public scheme and each table has a common prefix in the name and postfix with case id. Simplified data structure: Participants(user_id, role) Messages(id, user_id, body, date_time) Logs(id, user_id, incident) - ids are bigserial (each table has private auto sequence, so I cannot create one general table with one bigserial field) - user_ids are foreign keys referenced to Users.id There are 60k tables like these in the database for now. Other "half" of the db consists of other tables for different types of journals, but all of them, as far as I understand, have same primitive motivation: isolated auto int sequences and avoiding locks on inserts within independent *Cases* or *Journals*. What I came up with, but would be very grateful for the criticism. Just three common tables for participants, messages and logs to get everything together in just 3 tables, but: (obviously) - create a simple bigint field instead of a bigserial ones for ids - add case_id column to distinguish cases in each table - make a primary key = (case_id, id) (questionably) - create sequence for *each* case and use manual nextval('case_12345_id_seq') on insertions - create partitions by case_id Sound like a simple plan with additional benefits like a sharding, but here's what I'm not sure about: 1. As far as I know I have to create foreign key referenced to Users.id in each partition table - means same limits and it is not worth it even at first glance 2. How about locks? What will be locked on insertion? (whole table, just partition table for particular case_id) 3. 
Points 1 and 2 boil down to max_locks_per_transaction needing an endlessly huge value; otherwise even pg_dump is still a nightmare, or a DELETE from Users could lock each partition table. 4. If it still works, what should I prefer: declarative partitioning, or using inheritance to achieve partitioning? **Update 2** I'm trying to compare 2 approaches: separate tables (like now in our db) and partitioned tables of one. What will happen if I drop a partition? As far as I understand, this DROP takes an exclusive lock on the main table as well. Besides this, pg_dump is practically impossible on such a huge database (with lots of relations) in prod because it acquires share locks; from what I know, one technique is dumping from an asynchronously replicated, read-only database, making a backup from a backup, to run tests on real data in a sandbox. The other important thing to me is the limit on relations: how do experts deal with it? My guess is sharding, but then which instance across all the databases is the owner of a relation? Do I need to care about that? I strongly believe that I can't merge everything into just one regular collection of 3 tables distinguished by case_id, as @LaurenzAlbe says, and I need to be prepared for rapid data growth. On the other hand, I am ready to accept that all of this is OCD, but that's why I'm thinking about a global database redesign and asking for help. Thank you again.
danilabagroff (131 rep)
Jan 11, 2021, 04:16 PM • Last activity: May 12, 2025, 10:02 PM
Showing page 1 of 20 total questions