Database Administrators
Q&A for database professionals who wish to improve their database skills
Latest Questions
0 votes, 1 answer, 425 views
Load Balancing in MySQL InnoDB Cluster
I have set up an InnoDB cluster with one primary (R/W) and two secondaries (R/O) and bootstrapped a MySQL Router. The router has two ports: 6646 (R/W) and 6647 (R/O). My application is currently connected to the cluster through the R/W port. Now I am looking for a way to implement load balancing in the cluster. Suppose there are two read requests; how can I route the traffic to different available databases?
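For what it's worth, MySQL Router already balances reads for you: connections made to the R/O port are spread across the available secondaries (round-robin by default in a bootstrapped configuration), so the application only needs to open its read-only connections against 6647 and keep writes on 6646. A rough sketch of the read-only section a bootstrap typically generates in mysqlrouter.conf (the cluster name, bind address and strategy are assumptions to verify against your own file):

[routing:myCluster_ro]
bind_address = 0.0.0.0
bind_port = 6647
destinations = metadata-cache://myCluster/?role=SECONDARY
routing_strategy = round-robin-with-fallback
protocol = classic

So "routing two read requests to different databases" is mostly a matter of sending SELECT-only work to the 6647 endpoint; note the router picks a (possibly different) secondary per connection, not per query.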
Wong Chung sze
(1 rep)
Nov 25, 2022, 08:56 AM
• Last activity: May 1, 2025, 09:02 PM
1 vote, 2 answers, 953 views
Load Balancing multiple databases in a multi-tenant setup
We have a multi-tenant setup where each client of ours runs a DB that is identical to every other client's. We were using MySQL so far and, with CentOS 7.x, switched to MariaDB.
If we want to distribute the load so that requests for databases 1-500 go to Node-A and requests for databases 501-1000 go to Node-B, would that be possible through some out-of-the-box solution like HAProxy? I looked into their documentation, but the "balance" options seem to be about distributing load based on pre-defined headers/values rather than my use case above. Would it be possible to add a custom "balance" algorithm that sends the requests to the corresponding nodes?
The other approach I was considering was to use a local key-value store that contains the information about which clients are in which Node and switching to those based on the request. I also want to have master-slave replication for these, but have already handled that through a proxy in my application.
Our application runs on PHP(Symfony Framework) with Mysql, if that matters.
How would you design a system that will have a few thousand databases and be able to spread the load?
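HAProxy balances at the TCP level and never sees which schema a MySQL connection is going to USE, so the tenant-to-node decision usually has to live either in a protocol-aware proxy or in the application, i.e. the key-value approach mentioned above. A minimal illustration of that lookup (Python, with purely hypothetical node names and ranges):

# Hypothetical tenant -> node map; in practice this would come from a
# small key-value store or config table rather than being hard-coded.
NODE_A = "node-a.internal:3306"
NODE_B = "node-b.internal:3306"

def node_for_database(db_number: int) -> str:
    """Databases 1-500 live on Node-A, 501-1000 on Node-B."""
    return NODE_A if db_number <= 500 else NODE_B

The application (or a thin wrapper around the Symfony connection factory) resolves the node first and then opens the connection against it, which also leaves room to move individual tenants between nodes later by just updating the map.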
rajasaur
(111 rep)
Nov 2, 2015, 06:12 PM
• Last activity: Mar 26, 2025, 05:07 PM
2 votes, 1 answer, 1325 views
Load balancing between 2 servers using MySQL Router
I want to set up simple master-slave MySQL replication. I also want to balance the load between the two servers. Can I use MySQL Router 2.1 for that purpose? As I read the official documentation, it seems that MySQL Router now only works with InnoDB Cluster. I don't have a need for InnoDB Cluster; I just want to load balance two servers. Is the Router a good way to go, or should I look elsewhere?
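MySQL Router can also be used without an InnoDB cluster by listing the destinations statically instead of pointing it at cluster metadata. A sketch of what such a read-only section might look like (hostnames, ports and the exact option names should be checked against the Router version in use; this is an assumption-laden example, not a copy of a working file):

[routing:read_only]
bind_address = 0.0.0.0
bind_port = 7002
destinations = master1:3306,slave1:3306
mode = read-only

Note that in this static mode the Router only spreads connections; it has no idea which server is the master and does no failover of the replication topology, so write traffic still needs its own port whose destination list contains only the master.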
Zirui Wang
(131 rep)
Apr 10, 2018, 03:21 AM
• Last activity: Mar 3, 2025, 10:05 AM
0 votes, 1 answer, 871 views
Load Balancing SSRS inside an AlwaysOn Availability Group
We have 3 nodes in an Availability Group in SQL 2014:
**Primary Site:**
- Server1 (synchronous primary)
- Server2 (synchronous, readable secondary)
DR:
- Server3 (async, nonreadable secondary)
We use Reporting Services and have it installed and licensed on all 3 servers.
When processing reports we use the availability group name to connect to the server, so that when we fail over, reports continue to work (e.g. we connect to http://sqlAAGListener/reports). This has been working great for us, but we are getting to the stage where we'd like to balance the reports between the 2 servers on the primary site.
We have tested connecting directly to http://server1/reports and http://server2/reports and all of our reports work just fine on both servers.
MSDN suggests that the right way to do this is to have the report server database as part of the Availability Group (as we currently do), run 2 or more Reporting Services instances on separate servers, and then put a load balancer (e.g. the Windows Load Balancing service) in front of them. I see the advantages of doing this, however we don't have the budget for 2 more Enterprise Edition servers, so it's not an option.
Instead, can we keep our SQL connections pointing to the availability group listener and install Windows Load Balancing on Server1 and Server2 to balance SSRS, or will this cause problems?
Greg
(3292 rep)
Dec 5, 2016, 04:17 AM
• Last activity: Jan 14, 2025, 08:04 AM
3 votes, 1 answer, 381 views
Postgres 9.6 AWS RDS load balancing
I have a legacy application that has a number of very complex queries that take significant time and resources to execute. Rather than rewrite the application, I am looking at the possibility of doing some form of load balancing.
I first looked at writing a script with `pg_isready` to determine whether the database is able to respond and, if not, potentially switch to another replica. However, this approach gives no indication of the current load on the database.
I've read up on using HAProxy to do some load balancing, but it seems as if it would suffer from the same problem. My question is: has anyone faced a similar problem and found a neat solution? I would like to switch to a different replica if the database load is above a certain threshold. Is this at all possible?
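One way to fold load into the decision is to replace the pg_isready check with a small query against each candidate, since pg_stat_activity (or replication lag) says far more than "the server answers". A rough sketch, assuming Python with psycopg2 and placeholder DSNs:

# Pick the replica with the fewest active backends; the DSNs and the choice
# of "load" metric (active backends, CPU, lag) are assumptions to adapt.
import psycopg2

REPLICAS = [
    "host=replica1.example.com dbname=app user=monitor",
    "host=replica2.example.com dbname=app user=monitor",
]

def active_backends(dsn):
    conn = psycopg2.connect(dsn, connect_timeout=3)
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT count(*) FROM pg_stat_activity WHERE state = 'active'")
            return cur.fetchone()[0]
    finally:
        conn.close()

def least_loaded_replica():
    return min(REPLICAS, key=active_backends)

On RDS specifically, the same idea can be driven from CloudWatch metrics (CPUUtilization, DatabaseConnections) instead of an in-database query, which avoids adding monitoring connections to an already busy instance.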
avrono
(131 rep)
Jan 26, 2018, 10:01 PM
• Last activity: Nov 16, 2023, 06:01 PM
0 votes, 1 answer, 57 views
Doubt on Always On ApplicationIntent=ReadOnly and ApplicationIntent=ReadWrite
I have deployed Always On and routed read queries to the secondary server, but my question is: how can I build the connection string on the application side so that it diverts INSERT, UPDATE and DELETE to the primary server and SELECT queries to the secondary server? Do I need to create two connection strings, or will one connection be okay? I am confused; any insight will be highly appreciated.
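Typically this is done with two connection strings against the same listener: the default one for anything that writes, and a second one with ApplicationIntent=ReadOnly for the SELECT-only paths, which read-only routing then redirects to the secondary. A sketch with placeholder listener and database names:

Server=tcp:agListener,1433;Database=MyDb;Integrated Security=SSPI;MultiSubnetFailover=True
Server=tcp:agListener,1433;Database=MyDb;Integrated Security=SSPI;MultiSubnetFailover=True;ApplicationIntent=ReadOnly

One connection string alone cannot split the work, because the intent is declared when the connection is opened, not per statement; the application has to choose the read-only connection for its read workload.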
kedar
(23 rep)
Aug 1, 2023, 12:31 PM
• Last activity: Aug 1, 2023, 12:48 PM
1 vote, 2 answers, 5928 views
Pgpool2 doesn't use the slave for load balancing
I've enabled streaming replication in my Postgres 9.5.10, installed from the Ubuntu Xenial repo. I want to enable load balancing for pgpool-II (3.7.0 amefuriboshi), installed from source. So I have this pgpool.conf: https://pastebin.com/qWWgejQN
As you can see, I set both nodes, turned off replication, turned on master/slave mode and set it to stream. And also I enabled memcached caching.
Now I have a very big problem: pgpool doesn't want to use the slave for balancing:
postgres=# show pool_nodes;
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay
---------+----------------+------+--------+-----------+---------+------------+-------------------+-------------------
0 | localhost | 5433 | up | 0.500000 | primary | 0 | true | 0
1 | host2 | 5433 | unused | 0.500000 | standby | 0 | false | 0
(2 rows)
What am I doing wrong?
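The "unused" status usually means pgpool never successfully attached that backend (or it cached an old down state in its pgpool_status file), rather than a load-balancing setting being wrong. Two things worth trying, with hypothetical host/port/user values and pcp syntax to double-check against the pgpool 3.7 documentation:

# re-attach backend node 1 via the pcp interface
pcp_attach_node -h localhost -p 9898 -U pgpool -n 1
# or restart pgpool with -D so the cached pgpool_status file is discarded
pgpool -D

After that, `show pool_nodes` should report the standby as "up", and its select_cnt should start climbing once load balancing kicks in. It is also worth confirming that the sr_check and health_check users can actually log in to host2, since a failing check keeps the node out of rotation.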
abr_stackoverflow
(204 rep)
Nov 25, 2017, 11:59 AM
• Last activity: Dec 23, 2021, 01:14 PM
0 votes, 1 answer, 625 views
HADR Availability Group long sync time
I have the following problem / challenge:
We operate an HA SQL cluster with several Availability Groups.
In order to take full advantage of the load balancing, I redirect the SQL selects to the secondary host so that the primary is not additionally loaded.
This also works flawlessly and super fast.
BUT...
If I do an INSERT, UPDATE or other DML from an application and immediately afterwards read the result with a SELECT, the newly updated entry is not displayed correctly.
My suspicion is that the replication on the secondary is simply too slow.
Can it be because of the latency for hardening the log?
How can I speed it up?
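This is expected behaviour with readable secondaries: even in synchronous-commit mode the primary only waits for the secondary to harden the log, not to redo it, so a SELECT routed to the secondary can briefly see the pre-update state. The usual fix is to keep read-your-own-writes queries on the primary and route only latency-tolerant reporting reads to the secondary. To see how far behind the redo actually is, something like this on the primary (a sketch; adjust names and filters):

SELECT ar.replica_server_name,
       drs.log_send_queue_size,  -- KB not yet sent to the secondary
       drs.redo_queue_size       -- KB received but not yet redone, i.e. not yet visible to reads
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_replicas AS ar
  ON ar.replica_id = drs.replica_id;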
Thomas Seidl
(1 rep)
Oct 14, 2021, 10:25 AM
• Last activity: Oct 15, 2021, 12:38 AM
1 vote, 2 answers, 3794 views
Load balancing reads SQL Server 2016 AG
I'm trying to balance read traffic between two SQL Server 2016 Enterprise Edition instances (Primary and Secondary) using availability groups and read-only routing.
Here is my current setup:
* The servers are in AWS in two different AZ
* The servers are in a failover cluster without shared storage
* AG is enabled on both instances
* The databases in an Availability Group have the "Readable Secondary" option set to Yes
* I've set up the AG with read-only routing on both the current primary and secondary instances
* Adding the read-only switch to my application connection string allows me to route traffic to the secondary
Is there a way to have the AG Listener load balance read only traffic between the primary and secondary?
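SQL Server 2016 did add load-balanced read-only routing: replicas grouped inside one set of nested parentheses in READ_ONLY_ROUTING_LIST receive read-intent connections round-robin. With only one readable secondary there is nothing for the listener itself to spread between the primary and the secondary, so balancing across both nodes still ends up in the application or an external proxy, but for reference the 2016 syntax looks roughly like this (hypothetical three-replica AG and placeholder names):

ALTER AVAILABILITY GROUP [MyAg] MODIFY REPLICA ON N'Server1'
WITH (PRIMARY_ROLE (READ_ONLY_ROUTING_LIST = ((N'Server2', N'Server3'))));

Here Server2 and Server3 share read-intent connections round-robin while Server1 holds the primary role.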
Aaron
(1802 rep)
Apr 19, 2017, 06:21 PM
• Last activity: May 18, 2021, 01:29 PM
7 votes, 1 answer, 5129 views
Confused about load balancing and horizontal scaling in postgresql
Please correct me if I am wrong, but I guess handling more requests and load by adding more machines, or balancing the load between multiple servers, is horizontal scaling. So, if I add more servers, how do I distribute the database? Do I create one database to hold the user records with multiple servers? Or do I split the database too? What about database integrity? How do I synchronize it? Or else what do I do? I am a newbie and really confused but eager to learn. I would like to use Postgres for my project and would like to know some basic things before I start. I was thinking of using two small EC2 instances, but I got confused about the database. How do I go about creating the database? Do I need to go through sharding for this? What would be the best approach to horizontal scaling with Postgres? I would really appreciate it if you could explain it to me. Thank you!
**Edit:**
How do I load balance using multiple machines and manage the database?
I have an app where users can upload videos and they will be converted to MP4 using Elastic Transcoder. There are about 10k users. So, how do I load balance using multiple machines and manage the database? What I want to do is load balancing for performance, and I read in many posts that adding more machines can help with that, so I thought of horizontal scaling. But since horizontal scaling is scary, how do I load balance and manage my database?
Benjamin Smith Max
(213 rep)
Feb 22, 2014, 11:24 PM
• Last activity: Mar 28, 2021, 03:34 PM
22 votes, 3 answers, 37684 views
PostgreSQL High Availability/Scalability using HAProxy and PGBouncer
I have multiple PostgreSQL servers for a web application.
Typically one master and multiple slaves in hot standby mode (asynchronous streaming replication).
I use PGBouncer for connection pooling: one instance installed on each PG server (port 6432) connecting to database on localhost. I use transaction pool mode.
In order to load-balance my read-only connections on slaves, I use HAProxy (v1.5) with a conf more or less like this:
listen pgsql_pool 0.0.0.0:10001
mode tcp
option pgsql-check user ha
balance roundrobin
server master 10.0.0.1:6432 check backup
server slave1 10.0.0.2:6432 check
server slave2 10.0.0.3:6432 check
server slave3 10.0.0.4:6432 check
So, my web application connects to haproxy (port 10001), that load-balance connections on multiple pgbouncer configured on each PG slave.
Here is a representation graph of my current architecture:
This works quite well like this, but I realize that some implement this quite differently: the web application connects to a single PGBouncer instance that connects to HAProxy, which load-balances over multiple PG servers:
What's the best approach? The first one (my current one) or the second one? Are there any advantages of one solution over the other?
Thanks
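For completeness, the second layout usually keeps the pooler next to the application rather than next to the databases, so the web host talks to one local PgBouncer which forwards to the HAProxy listener from the config above. A sketch of that pgbouncer.ini (database name, auth settings and paths are placeholders):

[databases]
appdb = host=127.0.0.1 port=10001 dbname=appdb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
pool_mode = transaction
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt

The trade-off is roughly: per-server PgBouncer (the current layout) scales out with the database nodes and lets the HAProxy health checks hit the pooler actually in use, while a single front PgBouncer concentrates pooling state in one place and becomes an extra hop and a single point of failure unless it is itself duplicated.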


Nicolas Payart
(2508 rep)
Jan 10, 2014, 04:55 PM
• Last activity: Oct 27, 2020, 11:25 AM
5 votes, 2 answers, 2628 views
PGPool memory requirements
What amount of physical memory is recommended for a dedicated PGPool machine running connection pooling & load balancing, but no query cache?
I see `num_init_children(96) * max_pool(2) * number_of_backends(2) = 384` lines in `SHOW pool_pools`; and the modal average seems to be about 99M per PID, with a couple of 1G outliers:
# top for 20 pgpool processes
$ top -p $(pgrep pgpool | head -20 | tr "\\n" "," | sed 's/,$//')
Tasks: 20 total, 0 running, 20 sleeping, 0 stopped, 0 zombie
%Cpu(s): 3.1 us, 4.0 sy, 0.0 ni, 92.2 id, 0.0 wa, 0.0 hi, 0.7 si, 0.0 st
KiB Mem : 1784080 total, 22068 free, 1629960 used, 132052 buff/cache
KiB Swap: 4194300 total, 437328 free, 3756972 used. 71276 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15407 root 20 0 97828 872 708 S 0.3 0.0 5:52.95 pgpool
8076 root 20 0 101664 2800 1708 S 0.0 0.2 0:06.51 pgpool
9597 root 20 0 101656 4508 1264 S 0.0 0.3 0:18.85 pgpool
13603 root 20 0 149936 35548 984 S 0.0 2.0 1:13.56 pgpool
13634 root 20 0 151848 39364 984 S 0.0 2.2 1:00.25 pgpool
13677 root 20 0 149424 37812 1180 S 0.0 2.1 0:28.03 pgpool
13680 root 20 0 153416 34948 1180 S 0.0 2.0 0:32.97 pgpool
15397 root 20 0 97820 828 668 S 0.0 0.0 0:33.05 pgpool
15399 root 20 0 97884 240 132 S 0.0 0.0 1:41.48 pgpool
15402 root 20 0 93636 72 0 S 0.0 0.0 0:07.12 pgpool
15405 root 20 0 93636 280 172 S 0.0 0.0 0:42.39 pgpool
17121 root 20 0 101676 2016 1648 S 0.0 0.1 0:03.39 pgpool
17206 root 20 0 97824 72 0 S 0.0 0.0 0:00.00 pgpool
17207 root 20 0 97820 164 56 S 0.0 0.0 0:00.27 pgpool
21348 root 20 0 3871428 1.090g 1536 S 0.0 64.1 429:48.53 pgpool
21917 root 20 0 102672 2832 1696 S 0.0 0.2 0:49.46 pgpool
22117 root 20 0 101692 2868 1752 S 0.0 0.2 0:16.57 pgpool
22436 root 20 0 101692 4644 3464 S 0.0 0.3 0:31.12 pgpool
23037 root 20 0 101692 3776 1780 S 0.0 0.2 0:20.52 pgpool
23142 root 20 0 101664 980 936 S 0.0 0.1 0:08.60 pgpool
gregn
(303 rep)
Jul 3, 2017, 02:21 PM
• Last activity: Oct 15, 2020, 12:54 PM
0 votes, 1 answer, 78 views
Skewed Read Load on Mongo Replica Set
I have set up a mongo replica set with one primary and two secondaries. The problem I am facing is that reads from application servers connecting with the replica-set connection URL invariably go to only one secondary, causing a huge skew in read load between the two secondaries.
Due to this skew, I am constrained for resources on one server while the resources on the other are getting wasted.
rs.status()
{
"set" : "rs0",
"date" : ISODate("2020-09-08T19:39:20.394Z"),
"myState" : 1,
"term" : NumberLong(16),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"heartbeatIntervalMillis" : NumberLong(2000),
"majorityVoteCount" : 2,
"writeMajorityCount" : 2,
"optimes" : {
"lastCommittedOpTime" : {
"ts" : Timestamp(1599593958, 2042),
"t" : NumberLong(16)
},
"lastCommittedWallTime" : ISODate("2020-09-08T19:39:18.908Z"),
"readConcernMajorityOpTime" : {
"ts" : Timestamp(1599593958, 2042),
"t" : NumberLong(16)
},
"readConcernMajorityWallTime" : ISODate("2020-09-08T19:39:18.908Z"),
"appliedOpTime" : {
"ts" : Timestamp(1599593959, 1176),
"t" : NumberLong(16)
},
"durableOpTime" : {
"ts" : Timestamp(1599593958, 2042),
"t" : NumberLong(16)
},
"lastAppliedWallTime" : ISODate("2020-09-08T19:39:19.138Z"),
"lastDurableWallTime" : ISODate("2020-09-08T19:39:18.908Z")
},
"lastStableRecoveryTimestamp" : Timestamp(1599593936, 300),
"lastStableCheckpointTimestamp" : Timestamp(1599593936, 300),
"electionCandidateMetrics" : {
"lastElectionReason" : "priorityTakeover",
"lastElectionDate" : ISODate("2020-08-11T17:18:08.040Z"),
"electionTerm" : NumberLong(16),
"lastCommittedOpTimeAtElection" : {
"ts" : Timestamp(1597166288, 246),
"t" : NumberLong(15)
},
"lastSeenOpTimeAtElection" : {
"ts" : Timestamp(1597166288, 246),
"t" : NumberLong(15)
},
"numVotesNeeded" : 2,
"priorityAtElection" : 2,
"electionTimeoutMillis" : NumberLong(10000),
"priorPrimaryMemberId" : 5,
"targetCatchupOpTime" : {
"ts" : Timestamp(1597166288, 394),
"t" : NumberLong(15)
},
"numCatchUpOps" : NumberLong(148),
"newTermStartDate" : ISODate("2020-08-11T17:18:08.074Z"),
"wMajorityWriteAvailabilityDate" : ISODate("2020-08-11T17:18:10.782Z")
},
"members" : [
{
"_id" : 3,
"name" : "1.1.1.1:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 2427845,
"optime" : {
"ts" : Timestamp(1599593959, 1176),
"t" : NumberLong(16)
},
"optimeDate" : ISODate("2020-09-08T19:39:19Z"),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"infoMessage" : "",
"electionTime" : Timestamp(1597166288, 383),
"electionDate" : ISODate("2020-08-11T17:18:08Z"),
"configVersion" : 32,
"self" : true,
"lastHeartbeatMessage" : ""
},
{
"_id" : 5,
"name" : "3.3.3.3:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 3672,
"optime" : {
"ts" : Timestamp(1599593954, 3378),
"t" : NumberLong(16)
},
"optimeDurable" : {
"ts" : Timestamp(1599593954, 3378),
"t" : NumberLong(16)
},
"optimeDate" : ISODate("2020-09-08T19:39:14Z"),
"optimeDurableDate" : ISODate("2020-09-08T19:39:14Z"),
"lastHeartbeat" : ISODate("2020-09-08T19:39:19.238Z"),
"lastHeartbeatRecv" : ISODate("2020-09-08T19:39:20.261Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "",
"syncingTo" : "1.1.1.1:27017",
"syncSourceHost" : "1.1.1.1:27017",
"syncSourceId" : 3,
"infoMessage" : "",
"configVersion" : 32
},
{
"_id" : 6,
"name" : "2.2.2.2:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 3341,
"optime" : {
"ts" : Timestamp(1599593957, 2190),
"t" : NumberLong(16)
},
"optimeDurable" : {
"ts" : Timestamp(1599593957, 2190),
"t" : NumberLong(16)
},
"optimeDate" : ISODate("2020-09-08T19:39:17Z"),
"optimeDurableDate" : ISODate("2020-09-08T19:39:17Z"),
"lastHeartbeat" : ISODate("2020-09-08T19:39:18.751Z"),
"lastHeartbeatRecv" : ISODate("2020-09-08T19:39:20.078Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "",
"syncingTo" : "1.1.1.1:27017",
"syncSourceHost" : "1.1.1.1:27017",
"syncSourceId" : 3,
"infoMessage" : "",
"configVersion" : 32
}
],
"ok" : 1,
"$clusterTime" : {
"clusterTime" : Timestamp(1599593959, 1329),
"signature" : {
"hash" : BinData(0,"dfdfdggjhkljoj+mvY8="),
"keyId" : NumberLong("897987897897987")
}
},
"operationTime" : Timestamp(1599593959, 1176)
}
Please help me here. Is this something which is normally expected from a mongo replica-set cluster?
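Yes, this is normal driver behaviour rather than a broken replica set: with readPreference=secondary or secondaryPreferred, the driver first filters secondaries to those whose round-trip time is within localThresholdMS (15 ms by default) of the closest one, and only then picks among them, so a slightly closer secondary can absorb nearly all reads. Widening that window (or using nearest) lets both secondaries into the candidate set. A sketch of a connection string with placeholder hosts and values:

mongodb://1.1.1.1:27017,2.2.2.2:27017,3.3.3.3:27017/mydb?replicaSet=rs0&readPreference=secondaryPreferred&localThresholdMS=200

Note that selection happens per operation within each client, so a single application instance with few connections can still look skewed over short windows.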

The-Proton-Resurgence
(101 rep)
Sep 8, 2020, 07:32 PM
• Last activity: Sep 10, 2020, 08:27 AM
0 votes, 1 answer, 1066 views
Managing failover for MySQL nodes using HA Proxy
We have MySQL nodes replicated as S1 <- M1 -> M2 -> S2. These are now to be brought behind an HAProxy server to split reads from writes. We also intend to achieve automatic failover with this. However, write requests should be routed to M2 only when M1 fails; in the usual scenario, we are fine with not touching M2 at all. There seems to be no "balance" option in HAProxy that switches to M2 only when M1 fails.
Please suggest how this can be achieved using HAProxy.
Writing round-robin across M1 and M2 is a time-consuming solution to be taken up at this point in time.
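HAProxy handles the "use M2 only when M1 fails" case with the backup keyword rather than a balance algorithm: non-backup servers take all traffic while healthy, and backup servers are only used once every non-backup server's health check fails. A sketch of the write listener (IPs, port and check user are placeholders; the check user must exist in MySQL for the health check to pass):

listen mysql_write 0.0.0.0:3307
    mode tcp
    option mysql-check user haproxy_check
    server m1 10.0.0.1:3306 check
    server m2 10.0.0.2:3306 check backup

Reads can go through a separate listener that round-robins across S1 and S2.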
gnyanendra
(1 rep)
Oct 10, 2017, 07:41 AM
• Last activity: Sep 2, 2020, 02:05 PM
2 votes, 1 answer, 273 views
How do RDBMS solve the C10K problem? Is it also load balancing?
I know that an application server hitting the C10K problem can be solved with load balancing.
But in my experience an RDBMS server is very difficult to load balance, especially for **writes**, yet it must also face the 10,000-client problem.
Therefore my question is: how do RDBMS solve the C10K problem? Is it also load balancing?
kdm.J
(41 rep)
Aug 28, 2020, 07:32 AM
• Last activity: Aug 28, 2020, 08:00 AM
0 votes, 0 answers, 184 views
MySQL cluster with HAProxy balancer outside of local network
I am building a MySQL cluster on a local network. I have 3 nodes and would like to load balance them using HAProxy. It works fine when I place HAProxy on one of my nodes inside the network. I bind its IP (let's say 192.168.0.1) to the IPs of the SQL nodes (xxx.xxx.x.2, ...3, ...4).
My question is: how do I bind HAProxy to my nodes if I want to place it outside of my local network, let's say on an AWS micro instance? I want to direct MySQL traffic through AWS to the 3 nodes that I have at home.
I have already forwarded port 3306 and can access it from outside no problem. But I only have one public IP. How can I point HAProxy at my public IP so that it connects to all 3 nodes? That is the part I can't wrap my head around.
Thank you for your help in advance.
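Since the backend servers in haproxy.cfg are just host:port pairs, the usual trick with a single public IP is to forward a distinct external port to each node's 3306 and list the same public IP three times with different ports. A sketch with documentation addresses and arbitrary ports (the actual port forwarding happens on the home router, not in HAProxy):

listen mysql_cluster 0.0.0.0:3306
    mode tcp
    balance roundrobin
    server node1 203.0.113.10:3307 check
    server node2 203.0.113.10:3308 check
    server node3 203.0.113.10:3309 check

Exposing MySQL ports publicly like this is risky, so in practice a VPN or SSH tunnel between the AWS instance and the home network (with HAProxy pointing at the tunnel endpoints) is the safer variant of the same idea.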
Nikita Voevodin
(1 rep)
Apr 9, 2020, 06:45 PM
2 votes, 1 answer, 2243 views
Troubleshooting PgPool Query Lags
I have been using pgpool-II for quite a while but just noticed an intermittent lag (a slow request taking about 20s to execute) which I am able to replicate consistently. Before I explain, here is a quick rundown of the underlying architecture:
- 2 Different Database (1 master, 2 replicas) using streaming replication
- Each of which interact with a respective pgpool server
- 2 client applications balanced across 5 servers
- Pgpool Load balance mode turned on
I was able to rule out slow queries by pointing the connections directly to the master server. I am able to consistently replicate the lag only when subsequently querying pgpool from a different client application in repetition; the lag usually happens at about the second iteration. I noticed that by turning `connection_cache = off` the lag is less frequent and not as bad, but it still happens nonetheless. I tried turning on pgpool logging to figure out the issue, but after tailing pgpool.log there is just so much information that I have no idea what to look for, and using grep specifically for the string ERROR yields nothing while the lag happens.
Here is the configuration:
listen_addresses = '*'
port = 5432
socket_dir = '/var/run/postgresql/'
listen_backlog_multiplier = 2
pcp_listen_addresses = '*'
pcp_port = 9898
pcp_socket_dir = '/var/run/postgresql/'
backend_hostname0 = 'database3-master'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/var/lib/pgsql/data'
backend_flag0 = 'DISALLOW_TO_FAILOVER'
backend_hostname1 = 'database3-replica'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/var/lib/pgsql/data'
backend_flag1 = 'DISALLOW_TO_FAILOVER'
backend_hostname2 = 'database3-replica2'
backend_port2 = 5432
backend_weight2 = 1
backend_data_directory2 = '/var/lib/pgsql/data'
backend_flag2 = 'DISALLOW_TO_FAILOVER'
enable_pool_hba = on
pool_passwd = 'pool_passwd'
authentication_timeout = 60
ssl = off
num_init_children = 32
max_pool = 4
child_life_time = 300
child_max_connections = 0
connection_life_time = 60
client_idle_limit = 0
log_destination = 'stderr'
log_connections = on
log_hostname = off
log_statement = off
log_per_node_statement = off
log_standby_delay = 'none'
syslog_facility = 'LOCAL0'
syslog_ident = 'pgpool'
debug_level = 1
pid_file_name = '/var/run/pgpool/pgpool.pid'
logdir = '/var/log/pgpool/'
connection_cache = off
replication_mode = off
replicate_select = off
insert_lock = on
replication_stop_on_mismatch = off
failover_if_affected_tuples_mismatch = off
load_balance_mode = on
ignore_leading_white_space = on
black_function_list = 'nextval,setval,nextval,setval'
allow_sql_comments = off
master_slave_mode = on
master_slave_sub_mode = 'stream'
sr_check_period = 0
sr_check_user = 'pgpool'
sr_check_password = 'password'
delay_threshold = 0
follow_master_command = '/bin/echo %M > /tmp/postgres_master'
health_check_period = 30
health_check_timeout = 20
health_check_user = 'pg_produser'
health_check_password = '9password'
health_check_max_retries = 0
health_check_retry_delay = 1
connect_timeout = 10000
failover_command = '/etc/pgpool-II/failover.sh %d %H %P /tmp/postgresql.trigger.failover startup-pgpool4'
fail_over_on_backend_error = on
search_primary_node_timeout = 10
recovery_user = 'pgpool'
recovery_password = 'password'
recovery_timeout = 90
client_idle_limit_in_recovery = 0
use_watchdog = on
wd_hostname = 'pgpool3'
wd_port = 9000
wd_authkey = ''
wd_escalation_command = '/bin/bash /etc/pgpool-II/pgpool-failover.sh'
wd_lifecheck_method = 'heartbeat'
wd_interval = 10
wd_heartbeat_port = 9694
heartbeat_destination0 = 'pgpool4'
heartbeat_destination_port0 = 9694
other_pgpool_hostname0 = 'pgpool4'
other_pgpool_port0 = 5432
other_wd_port0 = 9000
relcache_expire = 0
relcache_size = 256
check_temp_table = on
check_unlogged_table = on
memory_cache_enabled = off
I have been seriously considering another solution but am not finding much. Ideally what I want is some middleware that will split read/write requests between the master and slave respectively. HAProxy doesn't have the ability to do this because it does not parse queries.
At the application level, is there an ideal solution for splitting read write queries? I would assume any query with an UPDATE/INSERT would go to the master server but I feel like I am missing something.
So to recap:
1. Is there an efficient way to debug pgpool lags and, if there is, what should I be looking for?
2. If not, is there an ideal solution besides pgpool for splitting read/write queries between replica and master respectively?
3. If not, what is the most reliable way to split read/write queries at the application level, in a high-level, language-agnostic explanation? (See the sketch below.)
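On the third point, a common language-agnostic pattern is: keep two pools (primary and replica), route a statement to the replica only if it is a plain read and you are not inside an explicit transaction, and force reads that must observe a just-committed write back to the primary. A bare-bones sketch in Python with psycopg2 and placeholder DSNs (SELECTs that call data-modifying functions would need to be special-cased too):

import psycopg2

PRIMARY_DSN = "host=database3-master dbname=app"
REPLICA_DSN = "host=database3-replica dbname=app"

WRITE_PREFIXES = ("insert", "update", "delete", "create", "alter", "drop", "copy", "begin")

def connection_for(sql):
    """Very rough routing: writes and transactions go to the primary."""
    stmt = sql.lstrip().lower()
    dsn = PRIMARY_DSN if stmt.startswith(WRITE_PREFIXES) else REPLICA_DSN
    return psycopg2.connect(dsn)

Frameworks and ORMs usually expose this as named "read" and "write" connections, so the routing rule lives in one place instead of being re-derived per query.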
Joseph Persie III
(141 rep)
Feb 22, 2017, 11:52 PM
• Last activity: Dec 7, 2019, 12:01 AM
2 votes, 0 answers, 387 views
How I can find out which postgresql I am connected into when behind a load balancer?
I have the following database replication setup:
As you can see, I do master-slave replication and use a load balancer in order to connect to the read-only replicas.
So I want to test this setup, hence I connect to the PostgreSQL read-only replicas through the load balancer via the `psql` command.
But how can I find out which replica I am connected to? I need that in order to perform a stress test.
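One way to answer this from inside the session is to ask the server itself, since these functions report the backend you actually landed on rather than the load balancer's address:

SELECT inet_server_addr()  AS server_ip,
       inet_server_port()  AS server_port,
       pg_is_in_recovery() AS is_standby;

inet_server_addr() returns the IP the backend accepted the connection on (NULL for Unix-socket connections), and pg_is_in_recovery() confirms whether you reached a read-only standby rather than the master.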
Dimitrios Desyllas
(873 rep)
Sep 26, 2019, 12:10 PM
0 votes, 0 answers, 137 views
Is it a good idea to have a read connection point to both write database and read replica?
In my Laravel app I have a write connection to a write database, and a read connection to a DNS server that is connected to a single read replica (as instructed here). In the past the DNS server was connected to two read replicas, but I removed one to cut costs.
It happened that the only read replica I had suffered a physical failure and had to restart; during that time my whole system went down. I was wondering if it's a good idea to make the read connection point to the DNS server, which in turn points to both the read replica that I have and the write database, as depicted here:

abbood
(503 rep)
Aug 28, 2019, 08:46 AM
• Last activity: Aug 28, 2019, 10:09 AM
0 votes, 1 answer, 169 views
Recommended replication strategy for fast global access to MySQL DB
The app I'm working on has users from all over the world and, due to the nature of the app, I need to minimize all delays. My application servers are located in different data centers across the world and load balancing is in place. The load on the DB isn't that high, so that's not an issue; the issue is simply that I need to minimize delays across long distances somehow.
Currently there's a single DB server located in Europe, and the issue is that when my app server in South-East Asia requests data from the DB in Europe, the complete duration of the query easily increases tenfold (from 50ms to 500ms).
My initial idea is to add another DB server in Asia. What kind of replication should I set up in that case? Or is this even a good way to go?
Any guidelines and suggestions for this scenario are welcome.
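A common pattern for this is an ordinary asynchronous read replica in the Asian region: the local app servers read from it with near-local latency, while writes (and any read that must be strictly current) still go to the European master and pay the round trip once. A sketch of the replica-side settings, assuming standard MySQL asynchronous replication with GTIDs (values are placeholders):

[mysqld]
server_id = 2
read_only = ON
gtid_mode = ON
enforce_gtid_consistency = ON
relay_log = relay-bin

Multi-master setups remove the write round trip but introduce conflict handling, so they are usually only worth it if writes are both frequent and latency-critical.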
IanDess
(103 rep)
May 11, 2019, 03:54 PM
• Last activity: May 11, 2019, 04:49 PM