
Unix & Linux Stack Exchange

Q&A for users of Linux, FreeBSD and other Unix-like operating systems

Latest Questions

0 votes
2 answers
2793 views
How do I mount a disk on /var/log directory even if I have process writing on it?
I would like to mount a disk on /var/log. The thing is, there are some processes/services writing into it, such as openvpn or the system logs. Is there a way to mount the filesystem without having to restart the machine or stop the services? Many thanks
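A minimal sketch of the usual approach, assuming the new disk is /dev/sdb1 and that rsyslog and openvpn are the writers that need to reopen their logs (device, mountpoint and service names are assumptions, not taken from the question):

    # see which processes currently hold files open under /var/log
    lsof +D /var/log            # or: fuser -vm /var/log

    # stage the new disk elsewhere and copy the existing logs across
    mount /dev/sdb1 /mnt/newlog
    rsync -aX /var/log/ /mnt/newlog/

    # swap the mount in; writers keep appending to the old, now hidden,
    # files through their open descriptors until told to reopen them
    umount /mnt/newlog
    mount /dev/sdb1 /var/log
    systemctl kill -s HUP rsyslog.service
    killall -HUP openvpn

Without the final reopen step the services keep writing to the old files on the root filesystem, so no restart is needed, but a log-reopen (HUP or a logrotate run) is.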
LinuxEnthusiast (1 rep)
Aug 10, 2020, 10:10 AM • Last activity: Aug 1, 2025, 11:02 PM
0 votes
1 answer
2844 views
Apache resource failed to start in Pacemaker
I am using Pacemaker with Corosync to set up a basic Apache HA cluster with 3 nodes running CentOS7. For some reasons, I cannot get the apache resource started in pcs. Cluster IP: 192.168.200.40 # pcs resource show ClusterIP Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2) Attributes: cidr_netmask=24 ip=192.168.200.40 Operations: monitor interval=20s (ClusterIP-monitor-interval-20s) start interval=0s timeout=20s (ClusterIP-start-interval-0s) stop interval=0s timeout=20s (ClusterIP-stop-interval-0s) # pcs resource show WebServer Resource: WebServer (class=ocf provider=heartbeat type=apache) Attributes: configfile=/etc/httpd/conf/httpd.conf statusurl=http://localhost/server-status Operations: monitor interval=1min (WebServer-monitor-interval-1min) start interval=0s timeout=40s (WebServer-start-interval-0s) stop interval=0s timeout=60s (WebServer-stop-interval-0s) # pcs status Cluster name: WARNING: corosync and pacemaker node names do not match (IPs used in setup?) Stack: corosync Current DC: server3.example.com (version 1.1.18-11.el7_5.2-2b07d5c5a9) - partition with quorum Last updated: Thu Jun 7 21:59:09 2018 Last change: Thu Jun 7 21:45:23 2018 by root via cibadmin on server1.example.com 3 nodes configured 2 resources configured Online: [ server1.example.com server2.example.com server3.example.com ] Full list of resources: ClusterIP (ocf::heartbeat:IPaddr2): Started server2.example.com WebServer (ocf::heartbeat:apache): Stopped Failed Actions: * WebServer_start_0 on server3.example.com 'unknown error' (1): call=49, status=Timed Out, exitreason='', last-rc-change='Thu Jun 7 21:46:03 2018', queued=0ms, exec=40002ms * WebServer_start_0 on server1.example.com 'unknown error' (1): call=53, status=Timed Out, exitreason='', last-rc-change='Thu Jun 7 21:45:23 2018', queued=0ms, exec=40003ms * WebServer_start_0 on server2.example.com 'unknown error' (1): call=47, status=Timed Out, exitreason='', last-rc-change='Thu Jun 7 21:46:43 2018', queued=1ms, exec=40002ms Daemon Status: corosync: active/enabled pacemaker: active/enabled pcsd: active/enabled The httpd instance is **enabled** and **running** on all three nodes. The cluster IP and individual node IPs are able to access the web page. The ClusterIP resource also works well for failover. What may go wrong for the apache resource in this case? Thank you very much! Update: Here is more information from the debug output. It seems the Apache is unable to bind to the port, but there is no error from the apache log, and systemctl status httpd gave all green on all nodes. I can open web pages via the cluster IP and each every node IP. The ClusterIP resource failover works fine, too. Any idea on why Apache resource doesn't work with pacemaker? # pcs resource debug-start WebServer --full Operation start for WebServer (ocf:heartbeat:apache) failed: 'Timed Out' (2) > stderr: ERROR: (98)Address already in use: AH00072: make_sock: could not bind to address [::]:80 (98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:80 no listening sockets available, shutting down AH00015: Unable to open logs > stderr: INFO: apache not running > stderr: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up > stderr: INFO: apache not running > stderr: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up > stderr: INFO: apache not running > stderr: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up > stderr: INFO: apache not running
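The debug output ("Address already in use" on port 80) suggests httpd is already running under systemd on each node, so the OCF agent cannot start its own instance. A hedged sketch of the usual remedy, run on every node:

    # let Pacemaker, not systemd, own httpd
    systemctl disable --now httpd
    # clear the recorded start failures so the cluster tries again
    pcs resource cleanup WebServer

The ocf:heartbeat:apache agent expects to start and stop httpd itself; an enabled systemd unit already holding port 80 makes every cluster-initiated start time out exactly as shown.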
cody (67 rep)
Jun 8, 2018, 04:16 PM • Last activity: Jul 15, 2025, 02:03 AM
1 vote
1 answer
2618 views
Pacemaker Virtual IP cannot be routed outside of its network
I have a server cluster consisted of following setup: 2 Virtual Servers with 2 NIC's. eth0 (private network 10.0.0.0/16) and eth1 (public network 77.1.2.0/24 with gateway as 77.1.2.1) For HA-01 VPS i have Private IP on eth0 set as 10.0.0.1 For HA-02 VPS i have Private IP set on eth0 as 10.0.0.2 Pacemaker/Corosync Cluster has been established between private IP addresses and Virtual IP (77.1.2.4) defined as clone Resource (IPAddr2) so it can float between two nodes. pcs resource create VirtualIP1 ocf:heartbeat:IPaddr2 ip="77.1.2.4" cidr_netmask="24" nic="eth1" clusterip_hash="sourceip-sourceport" op start interval="0s" timeout="60s" op monitor interval="1s" timeout="20s" op stop interval="0s" timeout="60s" clone interleave=true ordered=true Problem is, i cannot reach that IP address from world. I noticed that there is a route missing, so i add the static route ip r add default via 77.1.2.1 dev eth1 But i still cannot ping google.com from those servers nor world can see them on that IP. I also tried adding IP addresses from same subnet on eth1 like this: HA-01 eth1: 77.1.2.2 HA-02 eth1: 77.1.2.3 Servers can be seen on those IPs by world but if i add VirtualIP resource i cannot reach them on Virtual IP address. I also tried adding a source ip in routing table ip r add default via 77.1.2.1 src 77.1.2.4 to no avail. I don't know what am i supposed to do to get this VirtualIP working. I can reach 77.1.2.4 (Virtual IP Address) from other servers on that network, but not outside that network. Firewall is established and high availability ports are passed via command firewall-cmd --add-service="high availability"; firewall-cmd --add-service="high availability" --permanent Is there anything here that i am missing? If i add that address (77.1.2.4 - Virtual IP) alone on the interface of only one of those servers, it will work.... So is there an issue with ARP table perhaps or maybe router blocking some traffic?
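One way to narrow this down is to watch ARP for the VIP from another machine in 77.1.2.0/24 (ideally the gateway) and to compare against a plain, non-cloned IPaddr2 resource; interface and resource names below are only illustrative:

    # who answers ARP for the VIP, and does the MAC look sane?
    arping -c 3 -I eth1 77.1.2.4
    ip neigh show 77.1.2.4

    # for comparison, a plain (non-cloned) VIP that runs on one node only
    pcs resource create TestVIP ocf:heartbeat:IPaddr2 \
        ip=77.1.2.4 cidr_netmask=24 nic=eth1 op monitor interval=10s

A cloned IPaddr2 with clusterip_hash uses the iptables CLUSTERIP mechanism and answers ARP with a multicast MAC, which many upstream routers refuse to learn; that would match the symptom of the VIP being reachable inside the subnet but not from outside.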
Marko Todoric (437 rep)
Jul 19, 2019, 02:54 PM • Last activity: Apr 15, 2025, 03:08 AM
0 votes
2 answers
2897 views
RHEL High-Availability Cluster using pcs, configuring service as a resource
I have a 2 node cluster on RHEL 6.9. Everything is configured except I'm having difficulty with an application launched via shell script that created into a service (in /etc/init.d/myApplication), which I'll just call "myApp". From that application, I did a pcs resource create myApp lsb:myApp op monitor interval=30s op start on-fail=standby. I am new to using this suite of software but it's for work. What I need is for this application to be launched on both nodes simultaneously as it has to be started manually so if the first node fails, it would need intervention if it were not already active on the passive node. I have two other services: -VirtIP (ocf:heartbeat:IPaddr2) for providing a service IP for the application server -Cron (lsb:crond) to synchronize the application files (we are not using shared storage) I have the VirtIP and Cron as dependents via colocation to myApp. I've tried master/slave as well as cloning but I must be missing something regarding their config. If I take the application offline, pacemaker does not detect the service has gone down and pcs status outputs that myApp is still running on the node (or nodes depending on my config). I'm also sometimes getting the issue that the service running the app is stopped by pacemaker on the passive node. Which is the way I need to configure this? I've gone through the RHEL documentation but I'm still stuck. How do I get pacemaker to initiate failover if myApp service goes down? I don't know why it's not detecting the service has stopped in some cases. EDIT: So for testing purposes, I removed the password requirement for starting/restarting and the service starts/restarts fine as expected and the colocation dependent resources stop/start as expected. But stopping the myApp service does not reflect as a stopped resource but simply stays at Started node1. Likewise, simulating a failover via putting node1 into standby simply stops all resources on node1.
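Pacemaker's monitor for an lsb: resource simply calls the init script's status action, so failure detection only works if that action is LSB-compliant: exit 0 while running, exit 3 when stopped. A hedged skeleton of the relevant part (the pgrep pattern is an assumption):

    #!/bin/sh
    # /etc/init.d/myApp -- the status handling Pacemaker relies on
    case "$1" in
      status)
        if pgrep -f myApp >/dev/null 2>&1; then
          echo "myApp is running"
          exit 0      # LSB: running
        else
          echo "myApp is stopped"
          exit 3      # LSB: not running -> lets Pacemaker recover
        fi
        ;;
    esac

To run the application on both nodes at once, a cloned resource (pcs resource clone myApp) is the usual shape rather than master/slave.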
Greg (187 rep)
Sep 29, 2017, 07:52 AM • Last activity: Sep 6, 2023, 09:56 PM
-1 votes
1 answer
717 views
IBM AIX - Method to identify Cluster or HA services
I would like to find out whether existing IBM AIX servers in different locations have clustering/HA features installed. Kindly let me know the steps to check. Thanks.
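A few commands that usually reveal whether PowerHA (HACMP) or a CAA cluster is configured on an AIX system; exact fileset names vary by version:

    # PowerHA / HACMP filesets installed?
    lslpp -l | grep -i cluster
    # is the PowerHA cluster manager subsystem defined and active?
    lssrc -ls clstrmgrES
    # CAA cluster membership (AIX 6.1 TL6 and later)
    lscluster -m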
Nick eric adelee (49 rep)
Dec 5, 2022, 09:01 AM • Last activity: Dec 5, 2022, 06:36 PM
0 votes
0 answers
137 views
Options for high-availability, high-throughput bonding in Linux
When trying to configure high-availability (HA) bonding in Linux that should also use the available bandwidth, I wonder what the options are. The solution should ensure HA and optimal throughput (when all links are up) in a (simplified) scenario like this: [Example Scenario for HA-Bonding] So for example host **H1** has two interfaces **1** and **2**, also denoted as **H1.1** and **H1.2**. Starting with a standard configuration like active-backup with miimon link monitoring, there are these problems: only one interface is used at a time, and if **S1.3** fails, both **H1.1** and **H1.2** will still see a valid link, but **H1.1** can no longer reach **H2**. So the first step was to use arp_ip_target for link monitoring to detect a possible inter-switch link (ISL) failure. But the problem remains that only one of the two host interfaces can be used at a time. So I tried to use balance-tlb instead of active-backup; however, it seems balance-tlb does not support arp_ip_target for link monitoring. So I wonder: is there a solution that provides both high availability in case of any link failure *and* high bandwidth? Final note: conceptually **S1** and **S3** would be connected, too (just as **S2** and **S4**), but for illustration the example is simple enough, I hope. Also, I can configure the hosts, but not the switches.
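For reference, a minimal sketch of the arp_ip_target variant described above, expressed with iproute2 (interface names and target addresses are placeholders):

    ip link add bond0 type bond mode active-backup \
        arp_interval 1000 arp_ip_target 192.0.2.10,192.0.2.11 arp_validate all
    ip link set eth0 down && ip link set eth0 master bond0
    ip link set eth1 down && ip link set eth1 master bond0
    ip link set bond0 up

The modes that also spread load (balance-tlb/alb, 802.3ad) support only miimon, so they cannot see an inter-switch-link failure; getting both switch-spanning HA and aggregate bandwidth generally requires switch cooperation (802.3ad on an MLAG/stacked switch pair).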
U. Windl (1715 rep)
Nov 13, 2020, 08:04 AM • Last activity: Sep 8, 2022, 09:57 AM
0 votes
0 answers
1408 views
High CPU utilization in system (sys) time
We are running Two Node cluster using Redhat Pacemaker running on RHEL 7. Last thursday (3/2/2022) i updated kernel to latest version. And on Friday at 3:49 First node rebooted(Reason unknow) and then rejoined but at time resources were running on Node2. Today i noticed that are is high cpu utilization and top command shows %Cpu(s): 2.9 us, 89.8 sy, 0.2 ni, 7.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st I dont know what process is using 89.8% of cpu top command =========== top - 07:10:55 up 4 days, 14:17, 2 users, load average: 8.08, 8.13, 7.98 Tasks: 483 total, 8 running, 475 sleeping, 0 stopped, 0 zombie %Cpu(s): 2.7 us, 89.7 sy, 0.2 ni, 7.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st KiB Mem : 39464316+total, 1881036 free, 21074576+used, 18201638+buff/cache KiB Swap: 93749248 total, 93749248 free, 0 used. 18109798+avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 183327 oracle 20 0 195.5g 56260 42096 R 99.3 0.0 1133:15 oracle_183327_s 183552 oracle 20 0 195.5g 58704 42032 R 99.3 0.0 5626:11 oracle_183552_s 183443 oracle 20 0 195.5g 54488 40728 R 98.8 0.0 1626:54 oracle_183443_s 183554 oracle 20 0 195.5g 57076 41912 R 98.6 0.0 5304:10 oracle_183554_s 183354 oracle 20 0 195.5g 47248 39176 R 97.8 0.0 4847:28 oracle_183354_s 183556 oracle 20 0 195.5g 60040 43456 R 97.8 0.0 2486:30 oracle_183556_s 104734 oracle 20 0 195.5g 48516 39564 R 97.1 0.0 1583:06 oracle_104734_s 142910 root 20 0 162524 2704 1588 R 27.5 0.0 1:43.11 top 4612 root 39 19 13172 9268 480 S 3.8 0.0 255:12.73 apps.plugin 3918 netdata 39 19 251412 137752 2760 S 1.0 0.0 92:52.38 netdata 175736 root 20 0 755216 74556 13932 S 1.0 0.0 64:27.23 guard_stap 183545 oracle 20 0 195.5g 61516 42944 S 1.0 0.0 50:46.33 oracle_183545_s 165271 oracle -2 0 195.4g 18936 15872 S 0.7 0.0 44:31.74 ora_vktm_ssys 183352 oracle 20 0 195.5g 45884 38572 S 0.7 0.0 35:28.20 oracle_183352_s 183550 oracle 20 0 195.5g 52640 42520 S 0.7 0.0 47:01.94 oracle_183550_s 189069 oracle 20 0 195.5g 58344 41844 S 0.7 0.0 38:45.02 oracle_189069_s 3695 root 20 0 916256 131244 18368 S 0.5 0.0 42:22.64 ds_agent 3721 root rt 0 196440 98180 70968 S 0.5 0.0 42:39.31 corosync 69846 oracle 20 0 195.5g 49116 39316 S 0.5 0.0 10:22.26 oracle_69846_ss 183350 oracle 20 0 195.5g 45672 38332 S 0.5 0.0 36:46.71 oracle_183350_s 183356 oracle 20 0 195.5g 45992 38452 S 0.5 0.0 36:24.67 oracle_183356_s 183787 oracle 20 0 195.5g 45428 37976 S 0.5 0.0 2:10.28 oracle_183787_s 198328 oracle 20 0 195.5g 52616 42012 S 0.5 0.0 38:30.80 oracle_198328_s 1471 root 20 0 0 0 0 S 0.2 0.0 0:14.07 MpxPeriodicCall 3822 root 20 0 138468 9392 5696 S 0.2 0.0 4:15.32 stonithd 3962 swiagent 20 0 2342756 14444 6624 S 0.2 0.0 5:07.94 swiagent 4607 netdata 39 19 161488 21948 4312 S 0.2 0.0 16:25.79 python 22089 oracle 20 0 195.4g 26088 21472 S 0.2 0.0 0:01.55 ora_m006_ssys 114147 oracle 20 0 195.4g 41528 34804 S 0.2 0.0 0:00.43 oracle_114147_s 117437 oracle 20 0 195.5g 45332 38108 S 0.2 0.0 5:33.05 oracle_117437_s 135186 root 20 0 3706316 163948 31820 S 0.2 0.0 18:33.37 ds_am 148697 netdata 39 19 1648 1008 616 S 0.2 0.0 0:00.20 bash 152754 root 20 0 477760 4984 3960 S 0.2 0.0 0:00.01 SolarWinds.ADM. 
165327 oracle 20 0 195.5g 81356 51292 S 0.2 0.0 1:49.24 ora_mmon_ssys 183783 oracle 20 0 195.5g 44960 37616 S 0.2 0.0 2:12.17 oracle_183783_s 1 root 20 0 191832 4924 2660 S 0.0 0.0 44:03.90 systemd 2 root 20 0 0 0 0 S 0.0 0.0 0:00.13 kthreadd 4 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H 6 root 20 0 0 0 0 S 0.0 0.0 0:26.87 ksoftirqd/0 7 root rt 0 0 0 0 S 0.0 0.0 0:03.91 migration/0 8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh 9 root 20 0 0 0 0 S 0.0 0.0 10:48.66 rcu_sched 10 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 lru-add-drain 11 root rt 0 0 0 0 S 0.0 0.0 0:00.62 watchdog/0 This CPU utilization is increasing since Friday 9AM and was gradually increasing SAR command (Friday) ==================== sudo sar -u ALL -f /var/log/sa/sa04 Linux 3.10.0-1160.53.1.el7.x86_64 (prod-db2-node2) 02/04/2022 _x86_64_ (8 CPU) 12:00:01 AM CPU %usr %nice %sys %iowait %steal %irq %soft %guest %gnice %idle 12:10:01 AM all 3.54 0.31 2.99 0.04 0.00 0.00 0.02 0.00 0.00 93.10 12:20:01 AM all 3.56 0.31 3.00 0.03 0.00 0.00 0.02 0.00 0.00 93.08 12:30:01 AM all 3.55 0.31 3.04 0.03 0.00 0.00 0.02 0.00 0.00 93.04 12:40:01 AM all 3.62 0.31 3.06 0.03 0.00 0.00 0.02 0.00 0.00 92.96 12:50:01 AM all 3.53 0.31 3.34 0.04 0.00 0.00 0.02 0.00 0.00 92.76 01:00:01 AM all 3.74 0.31 3.08 0.04 0.00 0.00 0.02 0.00 0.00 92.81 01:10:01 AM all 3.88 0.31 3.07 0.08 0.00 0.00 0.03 0.00 0.00 92.64 01:20:01 AM all 3.54 0.31 3.02 0.03 0.00 0.00 0.02 0.00 0.00 93.08 01:30:01 AM all 3.56 0.31 3.04 0.03 0.00 0.00 0.03 0.00 0.00 93.03 01:40:01 AM all 3.55 0.30 3.03 0.03 0.00 0.00 0.02 0.00 0.00 93.07 01:50:01 AM all 3.61 0.31 3.04 0.03 0.00 0.00 0.02 0.00 0.00 92.99 02:00:01 AM all 3.55 0.31 3.05 0.03 0.00 0.00 0.02 0.00 0.00 93.03 02:10:01 AM all 3.60 0.31 3.04 0.04 0.00 0.00 0.02 0.00 0.00 92.99 02:20:01 AM all 3.52 0.31 3.01 0.03 0.00 0.00 0.02 0.00 0.00 93.10 02:30:01 AM all 3.75 0.31 3.05 0.03 0.00 0.00 0.02 0.00 0.00 92.83 02:40:01 AM all 3.52 0.31 3.01 0.03 0.00 0.00 0.02 0.00 0.00 93.11 02:50:01 AM all 3.57 0.31 3.02 0.03 0.00 0.00 0.02 0.00 0.00 93.04 03:00:01 AM all 3.55 0.30 3.02 0.03 0.00 0.00 0.02 0.00 0.00 93.08 03:10:01 AM all 3.59 0.31 3.03 0.04 0.00 0.00 0.02 0.00 0.00 93.00 03:20:01 AM all 3.58 0.31 3.04 0.04 0.00 0.00 0.02 0.00 0.00 93.02 03:30:01 AM all 3.51 0.31 2.99 0.03 0.00 0.00 0.02 0.00 0.00 93.13 03:40:01 AM all 3.57 0.31 3.02 0.03 0.00 0.00 0.02 0.00 0.00 93.05 03:50:01 AM all 3.55 0.34 3.10 0.20 0.00 0.00 0.02 0.00 0.00 92.79 04:00:01 AM all 3.71 0.31 3.04 0.03 0.00 0.00 0.02 0.00 0.00 92.89 04:10:01 AM all 3.54 0.31 3.01 0.03 0.00 0.00 0.02 0.00 0.00 93.08 04:20:01 AM all 3.53 0.31 3.02 0.03 0.00 0.00 0.02 0.00 0.00 93.08 04:30:01 AM all 3.51 0.31 3.01 0.03 0.00 0.00 0.02 0.00 0.00 93.12 04:40:01 AM all 3.57 0.31 3.03 0.03 0.00 0.00 0.03 0.00 0.00 93.03 04:50:01 AM all 3.45 0.31 3.19 0.03 0.00 0.00 0.03 0.00 0.00 93.00 05:00:01 AM all 3.57 0.31 3.05 0.03 0.00 0.00 0.02 0.00 0.00 93.01 05:10:02 AM all 3.56 0.31 3.07 0.03 0.00 0.00 0.02 0.00 0.00 93.00 05:20:01 AM all 3.54 0.31 3.09 0.03 0.00 0.00 0.02 0.00 0.00 93.01 05:30:01 AM all 3.72 0.31 3.08 0.03 0.00 0.00 0.02 0.00 0.00 92.83 05:40:01 AM all 3.54 0.31 3.05 0.03 0.00 0.00 0.02 0.00 0.00 93.05 05:50:01 AM all 3.53 0.31 3.03 0.03 0.00 0.00 0.02 0.00 0.00 93.08 06:00:01 AM all 3.53 0.31 3.03 0.03 0.00 0.00 0.03 0.00 0.00 93.08 06:10:01 AM all 3.61 0.31 3.06 0.03 0.00 0.00 0.02 0.00 0.00 92.97 06:20:01 AM all 3.50 0.31 3.01 0.03 0.00 0.00 0.02 0.00 0.00 93.13 06:20:01 AM CPU %usr %nice %sys %iowait %steal %irq %soft %guest %gnice %idle 06:30:01 AM all 3.58 0.31 3.09 0.03 
0.00 0.00 0.03 0.00 0.00 92.97 06:40:01 AM all 3.56 0.31 3.06 0.03 0.00 0.00 0.03 0.00 0.00 93.01 06:50:01 AM all 3.56 0.31 3.07 0.03 0.00 0.00 0.03 0.00 0.00 93.00 07:00:02 AM all 3.70 0.31 3.09 0.03 0.00 0.00 0.03 0.00 0.00 92.85 07:10:01 AM all 3.61 0.31 3.09 0.03 0.00 0.00 0.03 0.00 0.00 92.93 07:20:02 AM all 3.50 0.31 3.07 0.03 0.00 0.00 0.02 0.00 0.00 93.07 07:30:01 AM all 3.59 0.30 3.08 0.03 0.00 0.00 0.03 0.00 0.00 92.97 07:40:01 AM all 3.58 0.31 3.09 0.03 0.00 0.00 0.03 0.00 0.00 92.96 07:50:01 AM all 3.54 0.31 3.06 0.03 0.00 0.00 0.02 0.00 0.00 93.04 08:00:01 AM all 3.55 0.31 3.26 0.03 0.00 0.00 0.02 0.00 0.00 92.82 08:10:01 AM all 3.57 0.31 3.07 0.03 0.00 0.00 0.02 0.00 0.00 93.00 08:20:01 AM all 3.55 0.31 3.08 0.03 0.00 0.00 0.02 0.00 0.00 93.01 08:30:01 AM all 3.69 0.31 3.11 0.03 0.00 0.00 0.02 0.00 0.00 92.84 08:40:01 AM all 3.62 0.31 3.11 0.03 0.00 0.00 0.02 0.00 0.00 92.91 08:50:01 AM all 3.52 0.31 3.06 0.03 0.00 0.00 0.02 0.00 0.00 93.06 09:00:01 AM all 3.28 0.29 15.20 0.03 0.00 0.00 0.04 0.00 0.00 81.16 09:10:01 AM all 3.28 0.29 15.30 0.03 0.00 0.00 0.04 0.00 0.00 81.07 09:20:01 AM all 3.26 0.29 15.33 0.03 0.00 0.00 0.04 0.00 0.00 81.06 09:30:01 AM all 3.23 0.29 15.30 0.03 0.00 0.00 0.04 0.00 0.00 81.12 09:40:01 AM all 3.30 0.28 15.32 0.03 0.00 0.00 0.04 0.00 0.00 81.03 09:50:01 AM all 3.26 0.28 15.29 0.03 0.00 0.00 0.04 0.00 0.00 81.10 10:00:01 AM all 3.38 0.28 15.37 0.03 0.00 0.00 0.04 0.00 0.00 80.90 10:10:01 AM all 3.31 0.28 15.33 0.04 0.00 0.00 0.04 0.00 0.00 81.01 10:20:01 AM all 3.23 0.29 15.33 0.03 0.00 0.00 0.04 0.00 0.00 81.09 10:30:01 AM all 3.28 0.28 15.33 0.03 0.00 0.00 0.04 0.00 0.00 81.04 10:40:01 AM all 3.25 0.29 15.31 0.03 0.00 0.00 0.04 0.00 0.00 81.09 10:50:01 AM all 3.27 0.28 15.33 0.03 0.00 0.00 0.04 0.00 0.00 81.05 11:00:01 AM all 3.21 0.28 15.32 0.03 0.00 0.00 0.04 0.00 0.00 81.12 11:10:01 AM all 3.33 0.29 15.35 0.03 0.00 0.00 0.04 0.00 0.00 80.96 11:20:01 AM all 3.26 0.28 15.32 0.03 0.00 0.00 0.04 0.00 0.00 81.06 11:30:01 AM all 3.44 0.28 15.36 0.03 0.00 0.00 0.04 0.00 0.00 80.85 11:40:01 AM all 3.26 0.29 15.32 0.03 0.00 0.00 0.03 0.00 0.00 81.07 11:50:01 AM all 3.29 0.29 15.33 0.03 0.00 0.00 0.04 0.00 0.00 81.02 12:00:01 PM all 3.29 0.28 15.33 0.03 0.00 0.00 0.04 0.00 0.00 81.02 12:10:01 PM all 3.29 0.29 15.35 0.03 0.00 0.00 0.04 0.00 0.00 81.01 12:20:01 PM all 3.27 0.28 15.35 0.03 0.00 0.00 0.04 0.00 0.00 81.02 12:30:01 PM all 3.25 0.29 15.34 0.03 0.00 0.00 0.04 0.00 0.00 81.06 12:40:01 PM all 3.30 0.28 15.35 0.03 0.00 0.00 0.03 0.00 0.00 80.99 12:40:01 PM CPU %usr %nice %sys %iowait %steal %irq %soft %guest %gnice %idle 12:50:01 PM all 3.25 0.28 15.34 0.03 0.00 0.00 0.04 0.00 0.00 81.06 01:00:01 PM all 3.46 0.29 15.40 0.03 0.00 0.00 0.04 0.00 0.00 80.79 01:10:01 PM all 3.25 0.29 15.34 0.03 0.00 0.00 0.04 0.00 0.00 81.05 01:20:01 PM all 3.30 0.28 15.38 0.03 0.00 0.00 0.04 0.00 0.00 80.98 01:30:01 PM all 3.26 0.28 15.36 0.04 0.00 0.00 0.04 0.00 0.00 81.03 01:40:01 PM all 3.61 0.29 15.41 0.18 0.00 0.00 0.04 0.00 0.00 80.47 01:50:01 PM all 3.24 0.28 15.38 0.03 0.00 0.00 0.04 0.00 0.00 81.03 02:00:01 PM all 3.29 0.28 15.39 0.03 0.00 0.00 0.04 0.00 0.00 80.97 02:10:01 PM all 3.30 0.28 15.38 0.04 0.00 0.00 0.04 0.00 0.00 80.96 02:20:01 PM all 3.14 0.28 20.19 0.03 0.00 0.00 0.04 0.00 0.00 76.32 02:30:02 PM all 3.22 0.28 27.71 0.03 0.00 0.00 0.04 0.00 0.00 68.73 02:40:01 PM all 3.00 0.28 27.66 0.03 0.00 0.00 0.04 0.00 0.00 68.99 02:50:01 PM all 3.06 0.28 27.65 0.03 0.00 0.00 0.04 0.00 0.00 68.94 03:00:01 PM all 3.00 0.28 27.68 0.03 0.00 0.00 0.03 0.00 
0.00 68.97 03:10:02 PM all 3.28 0.27 27.70 0.05 0.00 0.00 0.04 0.00 0.00 68.66 03:20:01 PM all 2.99 0.28 27.66 0.03 0.00 0.00 0.04 0.00 0.00 69.00 03:30:01 PM all 3.07 0.28 27.68 0.03 0.00 0.00 0.04 0.00 0.00 68.90 03:40:01 PM all 3.04 0.28 27.67 0.03 0.00 0.00 0.04 0.00 0.00 68.94 03:50:01 PM all 3.04 0.27 27.69 0.03 0.00 0.00 0.04 0.00 0.00 68.93 04:00:01 PM all 3.19 0.28 27.71 0.03 0.00 0.00 0.04 0.00 0.00 68.76 04:10:01 PM all 3.09 0.28 28.14 0.03 0.00 0.00 0.04 0.00 0.00 68.42 04:20:01 PM all 3.04 0.28 27.69 0.03 0.00 0.00 0.03 0.00 0.00 68.92 04:30:01 PM all 3.04 0.28 27.68 0.03 0.00 0.00 0.04 0.00 0.00 68.94 04:40:01 PM all 3.08 0.28 27.72 0.03 0.00 0.00 0.03 0.00 0.00 68.85 04:50:01 PM all 3.01 0.28 27.70 0.03 0.00 0.00 0.04 0.00 0.00 68.95 05:00:01 PM all 3.05 0.28 27.68 0.03 0.00 0.00 0.04 0.00 0.00 68.92 05:10:01 PM all 5.55 0.26 32.05 6.84 0.00 0.00 0.12 0.00 0.00 55.17 05:20:01 PM all 3.05 0.28 27.71 0.03 0.00 0.00 0.03 0.00 0.00 68.89 05:30:01 PM all 3.19 0.28 27.73 0.03 0.00 0.00 0.03 0.00 0.00 68.73 05:40:01 PM all 3.05 0.28 27.70 0.03 0.00 0.00 0.04 0.00 0.00 68.91 05:50:01 PM all 3.03 0.28 27.69 0.03 0.00 0.00 0.04 0.00 0.00 68.93 06:00:01 PM all 3.03 0.28 27.72 0.03 0.00 0.00 0.04 0.00 0.00 68.91 06:10:02 PM all 3.06 0.28 27.72 0.03 0.00 0.00 0.04 0.00 0.00 68.88 06:20:01 PM all 3.07 0.28 27.72 0.03 0.00 0.00 0.03 0.00 0.00 68.87 06:30:01 PM all 3.09 0.28 27.77 0.56 0.00 0.00 0.04 0.00 0.00 68.26 06:40:01 PM all 3.05 0.28 27.74 0.03 0.00 0.00 0.04 0.00 0.00 68.86 06:50:01 PM all 3.07 0.28 27.71 0.03 0.00 0.00 0.04 0.00 0.00 68.87 07:00:01 PM all 3.19 0.28 27.75 0.03 0.00 0.00 0.04 0.00 0.00 68.71 07:00:01 PM CPU %usr %nice %sys %iowait %steal %irq %soft %guest %gnice %idle 07:10:01 PM all 3.14 0.27 27.76 0.03 0.00 0.00 0.03 0.00 0.00 68.76 07:20:01 PM all 3.03 0.28 27.72 0.03 0.00 0.00 0.04 0.00 0.00 68.90 07:30:01 PM all 3.08 0.28 27.73 0.03 0.00 0.00 0.04 0.00 0.00 68.84 07:40:01 PM all 3.06 0.28 27.73 0.03 0.00 0.00 0.04 0.00 0.00 68.87 07:50:01 PM all 3.05 0.28 27.73 0.03 0.00 0.00 0.04 0.00 0.00 68.87 08:00:01 PM all 3.03 0.27 27.74 0.03 0.00 0.00 0.03 0.00 0.00 68.89 08:10:01 PM all 3.10 0.27 27.76 0.03 0.00 0.00 0.04 0.00 0.00 68.79 08:20:01 PM all 3.03 0.28 27.73 0.03 0.00 0.00 0.04 0.00 0.00 68.90 08:30:01 PM all 3.23 0.28 27.77 0.03 0.00 0.00 0.03 0.00 0.00 68.66 08:40:01 PM all 3.08 0.28 27.75 0.03 0.00 0.00 0.03 0.00 0.00 68.82 08:50:01 PM all 3.04 0.28 27.74 0.03 0.00 0.00 0.04 0.00 0.00 68.88 09:00:01 PM all 3.08 0.28 27.76 0.03 0.00 0.00 0.04 0.00 0.00 68.81 09:10:01 PM all 3.07 0.28 27.77 0.03 0.00 0.00 0.04 0.00 0.00 68.81 09:20:01 PM all 3.07 0.28 27.76 0.03 0.00 0.00 0.04 0.00 0.00 68.81 09:30:01 PM all 3.04 0.28 27.74 0.03 0.00 0.00 0.04 0.00 0.00 68.87 09:40:01 PM all 3.09 0.28 27.77 0.03 0.00 0.00 0.04 0.00 0.00 68.79 09:50:01 PM all 3.04 0.28 27.77 0.03 0.00 0.00 0.03 0.00 0.00 68.85 10:00:01 PM all 3.21 0.26 36.38 0.04 0.00 0.00 0.03 0.00 0.00 60.08 10:10:01 PM all 7.59 0.25 40.00 0.15 0.00 0.00 0.04 0.00 0.00 51.98 10:20:01 PM all 2.98 0.26 40.02 0.03 0.00 0.00 0.03 0.00 0.00 56.68 10:30:01 PM all 2.98 0.25 40.02 0.04 0.00 0.00 0.03 0.00 0.00 56.67 10:40:01 PM all 3.00 0.25 40.03 0.03 0.00 0.00 0.03 0.00 0.00 56.65 10:50:01 PM all 2.97 0.26 40.05 0.03 0.00 0.00 0.03 0.00 0.00 56.65 11:00:01 PM all 2.92 0.26 40.03 0.03 0.00 0.00 0.04 0.00 0.00 56.72 11:10:01 PM all 3.03 0.25 40.08 0.03 0.00 0.00 0.03 0.00 0.00 56.57 11:20:01 PM all 2.95 0.26 40.03 0.03 0.00 0.00 0.03 0.00 0.00 56.70 11:30:01 PM all 3.14 0.26 40.06 0.03 0.00 0.00 0.03 0.00 0.00 
56.47 11:40:01 PM all 2.97 0.26 40.05 0.03 0.00 0.00 0.03 0.00 0.00 56.67 11:50:01 PM all 2.99 0.26 40.06 0.03 0.00 0.00 0.03 0.00 0.00 56.63 Average: all 3.36 0.29 16.80 0.09 0.00 0.00 0.03 0.00 0.00 79.43 It started increase exactly at nine AM on Friday SAR command (Latest) ==================== sudo sar -u Linux 3.10.0-1160.53.1.el7.x86_64 (prod-db2-node2) 02/08/2022 _x86_64_ (8 CPU) 12:00:01 AM CPU %user %nice %system %iowait %steal %idle 12:10:02 AM all 1.54 0.21 88.00 0.02 0.00 10.23 12:20:01 AM all 1.50 0.22 87.99 0.01 0.00 10.28 12:30:01 AM all 1.47 0.21 87.97 0.01 0.00 10.34 12:40:01 AM all 1.48 0.22 87.98 0.01 0.00 10.31 12:50:01 AM all 1.47 0.21 88.00 0.01 0.00 10.30 01:00:01 AM all 1.70 0.22 87.98 0.01 0.00 10.10 01:10:01 AM all 1.93 0.21 87.94 0.02 0.00 9.90 01:20:01 AM all 1.50 0.22 88.00 0.01 0.00 10.27 01:30:02 AM all 1.51 0.21 87.97 0.01 0.00 10.29 01:40:01 AM all 1.51 0.21 87.95 0.01 0.00 10.32 01:50:01 AM all 1.46 0.21 87.96 0.02 0.00 10.35 02:00:02 AM all 1.49 0.22 87.95 0.01 0.00 10.32 02:10:02 AM all 1.53 0.22 87.93 0.01 0.00 10.31 02:20:02 AM all 1.44 0.22 87.95 0.01 0.00 10.38 02:30:01 AM all 1.70 0.21 87.94 0.02 0.00 10.13 02:40:01 AM all 1.44 0.21 87.95 0.02 0.00 10.38 02:50:01 AM all 1.47 0.21 87.97 0.01 0.00 10.34 03:00:02 AM all 1.43 0.21 87.94 0.01 0.00 10.40 03:10:01 AM all 1.50 0.21 87.96 0.01 0.00 10.31 03:20:01 AM all 1.51 0.23 87.97 0.01 0.00 10.28 03:30:02 AM all 1.48 0.21 87.93 0.01 0.00 10.36 03:40:02 AM all 1.47 0.22 87.94 0.02 0.00 10.35 03:50:01 AM all 1.44 0.22 87.95 0.01 0.00 10.38 04:00:01 AM all 1.64 0.21 87.94 0.02 0.00 10.19 04:10:01 AM all 1.52 0.22 87.92 0.02 0.00 10.33 04:20:01 AM all 1.45 0.22 87.92 0.02 0.00 10.40 04:30:02 AM all 1.43 0.21 87.95 0.02 0.00 10.39 04:40:02 AM all 1.48 0.22 87.95 0.02 0.00 10.33 04:50:01 AM all 1.41 0.22 87.97 0.02 0.00 10.39 05:00:01 AM all 1.48 0.22 87.94 0.02 0.00 10.35 05:10:01 AM all 1.53 0.21 87.95 0.02 0.00 10.29 05:20:01 AM all 1.45 0.22 87.96 0.01 0.00 10.36 05:30:02 AM all 1.65 0.21 87.92 0.01 0.00 10.20 05:40:01 AM all 1.49 0.21 87.94 0.01 0.00 10.35 05:50:01 AM all 1.43 0.21 87.95 0.01 0.00 10.40 06:00:01 AM all 1.47 0.21 87.93 0.01 0.00 10.38 06:10:01 AM all 1.50 0.22 87.94 0.01 0.00 10.34 06:20:01 AM all 1.44 0.22 87.96 0.01 0.00 10.38 06:30:01 AM all 1.47 0.21 87.93 0.01 0.00 10.37 06:40:01 AM all 1.43 0.21 87.94 0.01 0.00 10.40 06:50:01 AM all 1.44 0.22 87.94 0.01 0.00 10.39 07:00:01 AM all 1.75 0.22 88.04 0.01 0.00 9.98 07:10:01 AM all 2.27 0.21 88.86 0.01 0.00 8.65 Average: all 1.53 0.21 87.98 0.01 0.00 10.27 VMSTAT Command ============== vmstat 1 -w procs -----------------------memory---------------------- ---swap-- -----io---- -system-- --------cpu-------- r b swpd free buff cache si so bi bo in cs us sy id wa st 7 0 0 4380564 827180 179070080 0 0 231 41 11 9 3 48 48 0 0 7 0 0 4364796 827180 179070080 0 0 0 160 9274 3727 1 88 10 0 0 7 0 0 4359876 827184 179070080 0 0 0 176 9180 3915 1 88 11 0 0 7 0 0 4359372 827184 179070096 0 0 1664 36 9201 3607 1 88 11 0 0 7 0 0 4351796 827184 179070112 0 0 6656 156 9392 4170 1 89 10 0 0 7 0 0 4361172 827184 179070096 0 0 1664 208 9352 4380 1 89 10 0 0 7 0 0 4360752 827184 179070096 0 0 0 48 9179 3496 0 88 12 0 0 7 0 0 4362452 827184 179070096 0 0 0 12 9281 4572 1 89 9 0 0 7 0 0 4363568 827184 179070096 0 0 0 124 9197 3497 0 88 12 0 0 8 0 0 4364952 827184 179070096 0 0 0 140 9189 3682 0 88 11 0 0 7 0 0 4364640 827184 179070096 0 0 0 88 9195 3556 0 88 12 0 0 I checked for the logs and there nothing i could find that could cause the high cpu utilization Now 
i know that TOP command is showing processes are related to Oracle DB. But Category shows that 89.8 in SystemSpace and not in UserSpace. Any advice on how to get what caused this spike Thanks
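Since the time is being spent in kernel space rather than in the Oracle processes themselves, profiling the kernel is the quickest way to see where it goes; a hedged sketch (perf and sysstat packages assumed to be installed):

    # sample kernel and user stacks live; look at the top kernel symbols
    perf top -g
    # or record for a minute and inspect offline
    perf record -a -g -- sleep 60 && perf report
    # per-process %system split
    pidstat -u 5 1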
OmiPenguin (4398 rep)
Feb 8, 2022, 04:52 AM • Last activity: Feb 9, 2022, 05:12 AM
0 votes
0 answers
459 views
HA-Cluster / corosync / pacemaker: Active-Active cluster with service ip / service ip is not switching
How to configure crm to migrate the ServiceIP if one Service is failed? node 1: web01a \ attributes standby=off node 2: web01b \ attributes standby=off primitive Apache2 systemd:apache2 \ operations $id=Apache2-operations \ op start interval=0 timeout=100 \ op stop interval=0 timeout=100 \ op monitor interval=15 timeout=100 start-delay=15 \ meta primitive PHP-FPM systemd:php7.4-fpm \ operations $id=PHP-FPM-operations \ op start interval=0 timeout=100 \ op stop interval=0 timeout=100 \ op monitor interval=15 timeout=100 start-delay=15 \ meta primitive Redis systemd:redis-server \ operations $id=Redis-operations \ op start interval=0 timeout=100 \ op stop interval=0 timeout=100 \ op monitor interval=15 timeout=100 start-delay=15 \ meta primitive ServiceIP IPaddr2 \ params ip=1.2.3.4 \ operations $id=ServiceIP-operations \ op monitor interval=10 timeout=20 start-delay=0 \ op_params migration-threshold=1 \ meta primitive lsyncd systemd:lsyncd \ op start interval=0 timeout=100 \ op stop interval=0 timeout=100 \ op monitor interval=15 timeout=100 start-delay=15 \ meta target-role=Started group ActiveNode ServiceIP lsyncd group WebServer Apache2 PHP-FPM Redis clone cl_WS WebServer \ meta clone-max=2 notify=true interleave=true colocation col_cl_WS_ActiveNode 100: cl_WS ActiveNode property cib-bootstrap-options: \ have-watchdog=false \ dc-version=2.0.3-4b1f869f0f \ cluster-infrastructure=corosync \ cluster-name=debian \ stonith-enabled=false \ no-quorum-policy=ignore \ startup-fencing=false \ maintenance-mode=false \ last-lrm-refresh=1622628525 \ start-failure-is-fatal=true These services should always be started - Apache2 - PHP-FPM - Redis If one of these services is not running, the node is unhelthy. The **ServiceIP** and **lsyncd** should switch to an healthy node. When I killed the apache2 process, the IP is not switched.
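With the configuration above the colocation score is only 100 (advisory), so a node that has lost Apache2 is not forced to give up ActiveNode. A hedged sketch of a mandatory constraint pair in crm shell syntax, using the resource names from the question:

    # place the ServiceIP group only where the WebServer clone is healthy
    crm configure colocation col_ActiveNode_with_WS inf: ActiveNode cl_WS
    crm configure order ord_WS_before_ActiveNode Mandatory: cl_WS ActiveNode

For a single Apache2 failure to push the group away, migration-threshold=1 also needs to be a meta attribute on the WebServer primitives; it is a resource meta attribute, so setting it under op_params on ServiceIP likely does not do what was intended.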
FaxMax (726 rep)
Jun 2, 2021, 12:29 PM
1 vote
0 answers
143 views
Stop a pacemaker node when local shell script returns an error?
Is it possible to make Pacemaker stop a node when a local test script fails, and start the node again once the script returns true? This seems like a very simple problem, but as I can't find ANY way to do this within Pacemaker, I'm about to run the following shell script on all my nodes:

    while true; do
        pcs status 2>/dev/null >/dev/null && node_running=true
        /is_node_healthy.sh && node_healthy=true
        [[ -v node_running ]] && ! [[ -v node_healthy ]] && pcs cluster stop
        [[ -v node_healthy ]] && ! [[ -v node_running ]] && pcs cluster start
        unset node_running node_healthy
        sleep 10
    done

This does exactly what I want, but it looks like a very dirty hack. Is there a more elegant way to get the same thing done by Pacemaker itself? BTW, the overall task I want to solve seems quite simple: create an HA cluster that has a public IP address assigned to a healthy host, where health can be checked with /is_node_healthy.sh
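Pacemaker has this built in via node-health attributes, which avoids the polling loop entirely; a hedged sketch (the attribute name '#health-custom' and the scheduling via cron are arbitrary choices):

    # run periodically (cron / systemd timer) on every node
    if /is_node_healthy.sh; then state=green; else state=red; fi
    attrd_updater --name '#health-custom' --update "$state"

    # once, cluster-wide: keep resources off any node reporting red
    pcs property set node-health-strategy=only-green

With only-green, a red health attribute acts like a -INFINITY location score for all resources on that node, which is effectively the "stop this node's resources" behaviour the loop emulates, without taking the node out of the cluster.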
psicolor (11 rep)
Feb 22, 2021, 11:54 AM
1 vote
1 answer
393 views
fence_virtualbox failed to reboot
I'm learning how to fence Pacemaker nodes using fence_virtualbox from [\[ClusterLabs\] Fence agent for VirtualBox][1], but I can't get it working. When I try to run stonith_admin --reboot it fails. Currently, my setup is:

    Node ID:     VM name:
    orcllinux1   OL7
    orcllinux2   OL7_2

I set it up using:

    pcs stonith create fence_vbox fence_virtualbox pcmk_host_map="orcllinux1:OL7,orcllinux2:OL7_2" pcmk_host_list="orcllinux1,orcllinux2" pcmk_host_check=static_list ipaddr="192.168.57.1" login="root"

But stonith_admin --reboot results in this error: [error screenshot] I tried to use fence_virtualbox manually using:

    fence_virtualbox -s 192.168.57.1 -p OL7 -o=reboot

and it succeeded. Is my stonith create syntax wrong? If so, what's the right syntax?
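Since the agent works when called by hand, a first thing to check is whether the cluster is passing parameter names the installed agent actually accepts; a hedged pair of commands:

    # list the exact parameter names the packaged agent advertises
    pcs stonith describe fence_virtualbox
    # ask the cluster (not the agent) to fence, with verbose logging
    stonith_admin --reboot orcllinux2 --verbose

Comparing the advertised parameter names against the ipaddr/login used in the pcs stonith create line should show whether the device definition itself is the problem or whether the failure happens later; the verbose stonith_admin output will say which node tried and why it gave up.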
Christophorus Reyhan (33 rep)
Jan 8, 2021, 11:16 AM • Last activity: Feb 16, 2021, 03:51 AM
2 votes
1 answer
5355 views
Pacemaker - Corosync - HA - Simple Custom Resource Testing - Status flapping - Started - Failed - Stopped - Started
I am testing using the OCF:Heartbeat:Dummy script and I want to make a very basic setup just to know it works and build on that. The only information I can find was this web blog here. https://raymii.org/s/tutorials/Corosync_Pacemaker_-_Execute_a_script_on_failover.html It has some typos but basically worked for me. The script currently just contains the following : sudo nano /usr/local/bin/failover.sh && sudo chmod +x /usr/local/bin/failover.sh #!/bin/sh touch /tmp/testfailover.sh Here is my setup : cp /usr/lib/ocf/resource.d/heartbeat/Dummy /usr/lib/ocf/resource.d/heartbeat/FailOverScript sudo nano /usr/lib/ocf/resource.d/heartbeat/FailOverScript dummy_start() { dummy_monitor /usr/local/bin/failover.sh if [ $? = $OCF_SUCCESS ]; then return $OCF_SUCCESS fi touch ${OCF_RESKEY_state} } sed -i 's/Dummy/FailOverScript/g' /usr/lib/ocf/resource.d/heartbeat/FailOverScript sed -i 's/dummy/FailOverScript/g' /usr/lib/ocf/resource.d/heartbeat/FailOverScript pcs resource create FailOverScript ocf:heartbeat:FailOverScript op monitor interval="30" The only testing I can really do : [root@node2 ~]# /usr/lib/ocf/resource.d/heartbeat/FailOverScript start ; echo $? DEBUG: default start : 0 0 ocf-tester doesn't seem to exist in the latest HA Software Suite, not really sure how to manually install it, but the script "half works". **The script doesn't need monitoring, its supposed to be very basic, but it seems to be flapping and giving me the following error code. Any idea's what to do?** FailOverScript (ocf::heartbeat:FailOverScript): Started node2 Failed Actions: * FailOverScript_monitor_30000 on node2 'not running' (7): call= 24423, status=complete, exitreason='none', last-rc-change='Tue Aug 16 15:53:50 2016', queued=0ms, exec= 9ms **Example of what I want to do:** Cluster start Script runs "start.sh" Cluster fails over to node2. On node1 script runs "fail.sh" On node2 script runs "start.sh" and vis versa if it fails the other direction. Note: The script does work, I get /tmp/testfailover.sh. I even tried putting another script under dummy_stop to remove the file and that worked, but it just keeps flapping along removing/adding/removing/adding file and starting/failing/stoping/starting etc etc. Thanks for reading!
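A likely cause of the flapping: in the modified dummy_start above, $? is the exit status of failover.sh, so the function returns success before ever touching ${OCF_RESKEY_state}; the next monitor then finds no state file and reports "not running". A hedged sketch of the minimal shape such an agent needs (failover.sh is the script from the question; a complete agent must also answer meta-data with the agent XML):

    #!/bin/sh
    # minimal sketch: run a script on start, remember state for monitor
    : ${OCF_FUNCTIONS_DIR:=${OCF_ROOT}/lib/heartbeat}
    . ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs

    STATE="${HA_RSCTMP}/FailOverScript.state"

    case "$1" in
      start)
        /usr/local/bin/failover.sh || exit $OCF_ERR_GENERIC
        touch "$STATE"; exit $OCF_SUCCESS ;;
      stop)
        # a "fail.sh"-style script for the losing node could be run here
        rm -f "$STATE"; exit $OCF_SUCCESS ;;
      monitor)
        [ -f "$STATE" ] && exit $OCF_SUCCESS || exit $OCF_NOT_RUNNING ;;
      *)
        exit $OCF_ERR_UNIMPLEMENTED ;;
    esac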
FreeSoftwareServers (2682 rep)
Aug 16, 2016, 07:56 PM • Last activity: Dec 21, 2020, 06:56 AM
0 votes
1 answer
1741 views
Pacemaker apache resource is Failed to access httpd status page after change to HTTPS
I get this error from pacemaker after i change apache from http to https. now my ocf::heartbeat:apache resource is not find status page. I generate SSL certificate separately for 3 servers. Everything was working fine when running on http but as soon as I added the (self-signed) SSL certificate pacemaker Apache (ocf::heartbeat:apache): Stopped And error shows Failed Actions: * Apache_start_0 on server3 'unknown error' (1): call=315, status=complete, exitreason='Failed to access httpd status page.', last-rc-change='Mon Sep 21 16:22:37 2020', queued=0ms, exec=3456ms * Apache_start_0 on server1 'unknown error' (1): call=59, status=complete, exitreason='Failed to access httpd status page.', last-rc-change='Mon Sep 21 16:22:41 2020', queued=0ms, exec=3421ms * Apache_start_0 on server2 'unknown error' (1): call=197, status=complete, exitreason='Failed to access httpd status page.', last-rc-change='Mon Sep 21 16:22:33 2020', queued=0ms, exec=3451ms /etc/apache2/sites-available/000-default.conf ServerAdmin webmaster@localhost DocumentRoot /var/www/html Redirect "/" "https://10.226.***.***/ " SetHandler server-status ServerAdmin webmaster@localhost DocumentRoot /var/www/html Redirect "/" "https://10.226.179.205/ " Order deny,allow Deny from all Allow from 127.0.0.1 *pcs resource debug-monitor --full Apache* Operation monitor for Apache (ocf:heartbeat:apache) returned 1 > stderr: + echo > stderr: + printenv > stderr: + sort > stderr: + env= > stderr: AONIX_LM_DIR=/home/TeleUSE/etc > stderr: BXwidgets=/home/BXwidgets > stderr: HA_logfacility=none > stderr: HOME=/root > stderr: LC_ALL=C > stderr: LOGNAME=root > stderr: MAIL=/var/mail/root > stderr: OCF_EXIT_REASON_PREFIX=ocf-exit-reason: > stderr: OCF_RA_VERSION_MAJOR=1 > stderr: OCF_RA_VERSION_MINOR=0 > stderr: OCF_RESKEY_CRM_meta_class=ocf > stderr: OCF_RESKEY_CRM_meta_id=Apache > stderr: OCF_RESKEY_CRM_meta_migration_threshold=5 > stderr: OCF_RESKEY_CRM_meta_provider=heartbeat > stderr: OCF_RESKEY_CRM_meta_resource_stickiness=10 > stderr: OCF_RESKEY_CRM_meta_type=apache > stderr: OCF_RESKEY_configfile=/etc/apache2/apache2.conf > stderr: OCF_RESKEY_statusurl=http://localhost/server-status > stderr: OCF_RESOURCE_INSTANCE=Apache > stderr: OCF_RESOURCE_PROVIDER=heartbeat > stderr: OCF_RESOURCE_TYPE=apache > stderr: OCF_ROOT=/usr/lib/ocf > stderr: OCF_TRACE_RA=1 > stderr: PATH=/root/.rbenv/shims:/root/.rbenv/bin:/root/.rbenv/shims:/root/.rbenv/bin:/usr/local/bin:/home/TeleUSE/bin:/home/xrt/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/ucb > stderr: PCMK_logfacility=none > stderr: PCMK_service=crm_resource > stderr: PWD=/root > stderr: RBENV_SHELL=bash > stderr: SHELL=/bin/bash > stderr: SHLVL=1 > stderr: SSH_CLIENT=10.12.116.46 63097 22 > stderr: SSH_CONNECTION=10.12.116.46 63097 10.226.179.205 22 > stderr: SSH_TTY=/dev/pts/0 > stderr: TERM=xterm > stderr: TeleUSE=/home/TeleUSE > stderr: USER=root > stderr: _=/usr/sbin/pcs > stderr: __OCF_TRC_DEST= > stderr: __OCF_TRC_MANAGE= > stderr: + ocf_is_true > stderr: + false > stderr: + . /usr/lib/ocf/lib/heartbeat/apache-conf.sh > stderr: + . 
/usr/lib/ocf/lib/heartbeat/http-mon.sh > stderr: + bind_address=127.0.0.1 > stderr: + curl_ipv6_opts= > stderr: + ocf_is_true > stderr: + false > stderr: + echo > stderr: + grep -qs :: > stderr: + WGETOPTS=-O- -q -L --no-proxy --bind-address=127.0.0.1 > stderr: + CURLOPTS=-o - -Ss -L --interface lo > stderr: + HA_VARRUNDIR=/var/run > stderr: + IBMHTTPD=/opt/IBMHTTPServer/bin/httpd > stderr: + HTTPDLIST=/sbin/httpd2 /usr/sbin/httpd2 /usr/sbin/apache2 /sbin/httpd /usr/sbin/httpd /usr/sbin/apache /opt/IBMHTTPServer/bin/httpd > stderr: + MPM=/usr/share/apache2/find_mpm > stderr: + [ -x /usr/share/apache2/find_mpm ] > stderr: + LOCALHOST=http://localhost > stderr: + HTTPDOPTS=-DSTATUS > stderr: + DEFAULT_IBMCONFIG=/opt/IBMHTTPServer/conf/httpd.conf > stderr: + DEFAULT_SUSECONFIG=/etc/apache2/httpd.conf > stderr: + DEFAULT_RHELCONFIG=/etc/httpd/conf/httpd.conf > stderr: + DEFAULT_DEBIANCONFIG=/etc/apache2/apache2.conf > stderr: + basename /usr/lib/ocf/resource.d/heartbeat/apache > stderr: + CMD=apache > stderr: + OCF_REQUIRED_PARAMS= > stderr: + OCF_REQUIRED_BINARIES= > stderr: + ocf_rarun monitor > stderr: + mk_action_func > stderr: + echo apache_monitor > stderr: + tr - _ > stderr: + ACTION_FUNC=apache_monitor > stderr: + validate_args > stderr: + is_function apache_monitor > stderr: + command -v apache_monitor > stderr: + test zapache_monitor = zapache_monitor > stderr: + simple_actions > stderr: + check_required_params > stderr: + local v > stderr: + run_function apache_getconfig > stderr: + is_function apache_getconfig > stderr: + command -v apache_getconfig > stderr: + test zapache_getconfig = zapache_getconfig > stderr: + apache_getconfig > stderr: + HTTPD= > stderr: + PORT= > stderr: + STATUSURL=http://localhost/server-status > stderr: + CONFIGFILE=/etc/apache2/apache2.conf > stderr: + OPTIONS= > stderr: + CLIENT= > stderr: + TESTREGEX= > stderr: + TESTURL= > stderr: + TESTREGEX10= > stderr: + TESTCONFFILE= > stderr: + TESTNAME= > stderr: + : /etc/apache2/envvars > stderr: + source_envfiles /etc/apache2/envvars > stderr: + [ -f /etc/apache2/envvars -a -r /etc/apache2/envvars ] > stderr: + . /etc/apache2/envvars > stderr: + unset HOME > stderr: + [ != ] > stderr: + SUFFIX= > stderr: + export APACHE_RUN_USER=www-data > stderr: + export APACHE_RUN_GROUP=www-data > stderr: + export APACHE_PID_FILE=/var/run/apache2/apache2.pid > stderr: + export APACHE_RUN_DIR=/var/run/apache2 > stderr: + export APACHE_LOCK_DIR=/var/lock/apache2 > stderr: + export APACHE_LOG_DIR=/var/log/apache2 > stderr: + export LANG=C > stderr: + export LANG > stderr: + [ X = X -o ! -f -o ! -x ] > stderr: + find_httpd_prog > stderr: + HTTPD= > stderr: + [ -f /sbin/httpd2 -a -x /sbin/httpd2 ] > stderr: + [ -f /usr/sbin/httpd2 -a -x /usr/sbin/httpd2 ] > stderr: + [ -f /usr/sbin/apache2 -a -x /usr/sbin/apache2 ] > stderr: + HTTPD=/usr/sbin/apache2 > stderr: + break > stderr: + [ X != X -a X/usr/sbin/apache2 != X ] > stderr: + detect_default_config > stderr: + [ -f /etc/apache2/httpd.conf ] > stderr: + [ -f /etc/apache2/apache2.conf ] > stderr: + echo /etc/apache2/apache2.conf > stderr: + DefaultConfig=/etc/apache2/apache2.conf > stderr: + CONFIGFILE=/etc/apache2/apache2.conf > stderr: + [ -n /usr/sbin/apache2 ] > stderr: + basename /usr/sbin/apache2 > stderr: + httpd_basename=apache2 > stderr: + GetParams /etc/apache2/apache2.conf > stderr: + ConfigFile=/etc/apache2/apache2.conf > stderr: + [ ! 
-f /etc/apache2/apache2.conf ] > stderr: + get_apache_params /etc/apache2/apache2.conf ServerRoot PidFile Port Listen > stderr: + configfile=/etc/apache2/apache2.conf > stderr: + shift 1 > stderr: + echo ServerRoot PidFile Port Listen > stderr: + sed s/ /,/g > stderr: + vars=ServerRoot,PidFile,Port,Listen > stderr: + apachecat /etc/apache2/apache2.conf > stderr: + awk -v vars=ServerRoot,PidFile,Port,Listen > stderr: BEGIN{ > stderr: split(vars,v,","); > stderr: for( i in v ) > stderr: vl[i]=tolower(v[i]); > stderr: } > stderr: { > stderr: for( i in v ) > stderr: if( tolower($1)==vl[i] ) { > stderr: print v[i]"="$2 > stderr: delete vl[i] > stderr: break > stderr: } > stderr: } > stderr: > stderr: + awk > stderr: function procline() { > stderr: split($0,a); > stderr: if( a~/^[Ii]nclude$/ ) { > stderr: includedir=a; > stderr: gsub("\"","",includedir); > stderr: procinclude(includedir); > stderr: } else { > stderr: if( a=="ServerRoot" ) { > stderr: rootdir=a; > stderr: gsub("\"","",rootdir); > stderr: } > stderr: print; > stderr: } > stderr: } > stderr: function printfile(infile, a) { > stderr: while( (getline 0 ) { > stderr: procline(); > stderr: } > stderr: close(infile); > stderr: } > stderr: function allfiles(dir, cmd,f) { > stderr: cmd="find -L "dir" -type f"; > stderr: while( ( cmd | getline f ) > 0 ) { > stderr: printfile(f); > stderr: } > stderr: close(cmd); > stderr: } > stderr: function listfiles(pattern, cmd,f) { > stderr: cmd="ls "pattern" 2>/dev/null"; > stderr: while( ( cmd | getline f ) > 0 ) { > stderr: printfile(f); > stderr: } > stderr: close(cmd); > stderr: } > stderr: function procinclude(spec) { > stderr: if( rootdir!="" && spec!~/^\// ) { > stderr: spec=rootdir"/"spec; > stderr: } > stderr: if( isdir(spec) ) { > stderr: allfiles(spec); # read all files in a directory (and subdirs) > stderr: } else { > stderr: listfiles(spec); # there could be jokers > stderr: } > stderr: } > stderr: function isdir(s) { > stderr: return !system("test -d \""s"\""); > stderr: } > stderr: { procline(); } > stderr: /etc/apache2/apache2.conf > stderr: + sed s/#.*//;s/[[:blank:]]*$//;s/^[[:blank:]]*// > stderr: + grep -v ^$ > stderr: + eval PidFile=${APACHE_PID_FILE} > stderr: + PidFile=/var/run/apache2/apache2.pid > stderr: + CheckPort > stderr: + ocf_is_decimal > stderr: + false > stderr: + CheckPort > stderr: + ocfError performing operation: Operation not permitted _is_decimal > stderr: + false > stderr: + CheckPort 80 > stderr: + ocf_is_decimal 80 > stderr: + true > stderr: + [ 80 -gt 0 ] > stderr: + PORT=80 > stderr: + break > stderr: + echo > stderr: + grep : > stderr: + Listen=localhost: > stderr: + [ Xhttp://localhost/server-status = X ] > stderr: + test /var/run/apache2/apache2.pid > stderr: + return 0 > stderr: + validate_env > stderr: + check_required_binaries > stderr: + local v > stderr: + is_function apache_validate_all > stderr: + command -v apache_validate_all > stderr: + test zapache_validate_all = zapache_validate_all > stderr: + local rc > stderr: + LSB_STATUS_STOPPED=3 > stderr: + apache_validate_all > stderr: + [ -z /usr/sbin/apache2 ] > stderr: + [ ! -x /usr/sbin/apache2 ] > stderr: + [ ! 
-f /etc/apache2/apache2.conf ] > stderr: + [ -n ] > stderr: + [ -n ] > stderr: + dirname /var/run/apache2/apache2.pid > stderr: + local a > stderr: + local b > stderr: + [ 1 = 1 ] > stderr: + a=/var/run/apache2/apache2.pid > stderr: + [ 1 ] > stderr: + b=/var/run/apache2/apache2.pid > stderr: + [ /var/run/apache2/apache2.pid = /var/run/apache2/apache2.pid ] > stderr: + break > stderr: + b=/var/run/apache2 > stderr: + [ -z /var/run/apache2 -o /var/run/apache2/apache2.pid = /var/run/apache2 ] > stderr: + echo /var/run/apache2 > stderr: + return 0 > stderr: + ocf_mkstatedir root 755 /var/run/apache2 > stderr: + local owner > stderr: + local perms > stderr: + local path > stderr: + owner=root > stderr: + perms=755 > stderr: + path=/var/run/apache2 > stderr: + test -d /var/run/apache2 > stderr: + return 0 > stderr: + return 0 > stderr: + rc=0 > stderr: + [ 0 -ne 0 ] > stderr: + ocf_is_probe > stderr: + [ monitor = monitor -a 0 = 0 ] > stderr: + run_probe > stderr: + is_function apache_probe > stderr: + command -v apache_probe > stderr: + test z = zapache_probe > stderr: + shift 1 > stderr: + apache_monitor > stderr: + silent_status > stderr: + local pid > stderr: + get_pid > stderr: + [ -f /var/run/apache2/apache2.pid ] > stderr: + cat /var/run/apache2/apache2.pid > stderr: + pid=17552 > stderr: + [ -n 17552 ] > stderr: + ProcessRunning 17552 > stderr: + local pid=17552 > stderr: + [ -d /proc -a -d /proc/1 ] > stderr: + [ -d /proc/17552 ] > stderr: + [ 0 -ne 0 ] > stderr: + findhttpclient > stderr: + [ x != x ] > stderr: + which wget > stderr: + echo wget > stderr: + ourhttpclient=wget > stderr: + [ -z wget ] > stderr: + ocf_check_level 10 > stderr: + local lvl prev > stderr: + lvl=0 > stderr: + prev=0 > stderr: + ocf_is_decimal 0 > stderr: + true > stderr: + [ 10 -eq 0 ] > stderr: + [ 10 -gt 0 ] > stderr: + lvl=0 > stderr: + break > stderr: + echo 0 > stderr: + apache_monitor_basic > stderr: + wget_func http://localhost/server-status > stderr: + auth= > stderr: + cl_opts=-O- -q -L --no-proxy --bind-address=127.0.0.1 > stderr: + [ x !=+ x ] > stderr: grep+ wget -Ei -O- -q > stderr: -L --no-proxy --bind-address=127.0.0.1 http://localhost/server-status > stderr: + attempt_index_monitor_request > stderr: + local indexpage= > stderr: + [ -n ] > stderr: + [ -n ] > stderr: + [ -n ] > stderr: + [ -n http://localhost/server-status ] > stderr: + return 1 > stderr: + [ 1 -eq 0 ] > stderr: + ocf_is_probe > stderr: + [ monitor = monitor -a 0 = 0 ] > stderr: + return 1 **pcs config** Resource: MasterVip (class=ocf provider=heartbeat type=IPaddr2) Attributes: ip=10.226.***.*** nic=lo cidr_netmask=32 iflabel=pgrepvip Meta Attrs: target-role=Started Operations: start interval=0s timeout=20s (MasterVip-start-interval-0s) stop interval=0s timeout=20s (MasterVip-stop-interval-0s) monitor interval=90s (MasterVip-monitor-interval-90s) Resource: Apache (class=ocf provider=heartbeat type=apache) Attributes: configfile=/etc/apache2/apache2.conf statusurl=http://localhost/server-status Operations: start interval=0s timeout=40s (Apache-start-interval-0s) stop interval=0s timeout=60s (Apache-stop-interval-0s) monitor interval=1min (Apache-monitor-interval-1min) I don't know how to fix this. if anyone knows please help me.
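The Redirect "/" in the :80 vhost also redirects /server-status, so the agent's check of http://localhost/server-status gets bounced to the self-signed HTTPS site and fails. A hedged sketch of one fix: keep the status handler reachable over plain HTTP on localhost and exclude it from the redirect (mod_rewrite must be enabled, e.g. a2enmod rewrite; the masked IP is kept as in the question):

    <VirtualHost *:80>
        RewriteEngine On
        RewriteCond %{REQUEST_URI} !^/server-status
        RewriteRule ^ https://10.226.***.***%{REQUEST_URI} [R=301,L]

        <Location /server-status>
            SetHandler server-status
            Order deny,allow
            Deny from all
            Allow from 127.0.0.1
        </Location>
    </VirtualHost>

The statusurl=http://localhost/server-status in the resource definition can then stay as it is.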
Karippery (1 rep)
Sep 21, 2020, 03:04 PM • Last activity: Sep 22, 2020, 11:36 AM
1 vote
2 answers
4955 views
Keepalived not working?
I'm trying to create HA for HAProxy using keepalived on CentOS 8, here's what I have: Virtual IP: 10.10.10.14 HAProxy Server 1: 10.10.10.15 HAProxy Server 2: 10.10.10.18 and my keepalived configuration on **MASTER**: vrrp_script chk_haproxy { script "killall -0 haproxy" # check the haproxy process interval 2 # every 2 seconds weight 2 # add 2 points if OK } vrrp_instance VI_1 { interface ens190 state MASTER virtual_router_id 51 priority 101 virtual_ipaddress { 10.10.10.14 } track_script { chk_haproxy } } Keepalived config on **BACKUP**: vrrp_script chk_haproxy { script "killall -0 haproxy" # check the haproxy process interval 2 # every 2 seconds weight 2 # add 2 points if OK } vrrp_instance VI_1 { interface ens165 state BACKUP virtual_router_id 51 priority 100 virtual_ipaddress { 10.10.10.14 } track_script { chk_haproxy } } But every time I try to stop my HAProxy process it won't connect to the backup server. Instead it only works on the server with the recent start of keepalived. My ip -a command would return like this for **Master**: inet 10.10.10.15/24 brd 10.10.10.255 scope global noprefixroute ens190 inet 10.10.10.14/32 scope global ens190 For **Backup**: inet 10.10.10.18/24 brd 10.10.10.255 scope global noprefixroute ens165 inet 10.10.10.14/32 scope global ens165 Anything wrong? I have also set net.ipv4.ip_nonlocal_bind = 1 on my sysctl configuration. My logs only show the start and stop of the service?
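Two things worth ruling out when the VIP only follows whichever node started keepalived most recently, and both nodes show 10.10.10.14 at the same time as above: VRRP advertisements being dropped between the nodes, and the track script never running. A hedged sketch (interface names as in the question):

    # on both nodes: allow VRRP (IP protocol 112) through firewalld
    firewall-cmd --permanent --add-rich-rule='rule protocol value="vrrp" accept'
    firewall-cmd --reload

    # watch advertisements while stopping haproxy on the MASTER
    tcpdump -ni ens190 vrrp

    # killall comes from psmisc and may be absent on a minimal CentOS 8
    dnf install -y psmisc

Both nodes holding the VIP simultaneously is the classic sign that they do not see each other's advertisements, which points at the firewall rather than at the vrrp_script.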
Gwynn (41 rep)
Jul 15, 2020, 05:35 AM • Last activity: Jul 15, 2020, 11:51 PM
2 votes
3 answers
1346 views
What is the best way to store a single counter persistently?
I have a simple bash script that increments a counter a few times per second, guaranteed less than 100 times per second. The script works fine, but I would like the counter to persist on machine crashes. What would be the best way to persist the counter on my SSD-only system? Should I just echo it out to /var// somewhere (i.e. store in a file) each time it updates? If so, is /var// the right place? Do I need to install a full database to keep track of this single value? Is there some cute little Linux feature built to do this effectively? To clarify, my problem isn't making sure that the counter is persistent between separate runs of the script, I have that solved already. My concern is in case the system unexpectedly and suddenly fails due to machine crash (I can therefore not rely on a trap in a shell script).
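A file under /var/lib is the conventional home for persistent state like this; the usual crash-safe pattern is write-to-temp, flush, then rename over the old file, since the rename is atomic on the same filesystem. A minimal sketch (the path is an assumption):

    counter_file=/var/lib/mycounter/count
    count=$(( $(cat "$counter_file" 2>/dev/null || echo 0) + 1 ))
    printf '%s\n' "$count" > "$counter_file.tmp"
    sync "$counter_file.tmp"                  # flush to disk (coreutils >= 8.24)
    mv "$counter_file.tmp" "$counter_file"    # atomic replace, same filesystem

After a crash you lose at most the increments since the last completed rename; at under 100 updates per second this is also gentle enough on an SSD, and the sync can be batched (say once per second) if write amplification becomes a concern.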
00prometheus (813 rep)
Jun 1, 2020, 05:49 PM • Last activity: Jun 3, 2020, 11:50 AM
0 votes
2 answers
1078 views
Linux Pacemaker: Resource showing as "unrunnable start (blocked)" has been created
We are using SLES 12 SP4. We observed a few things during today's testing. The steps were as follows: **Step 1**: When we trigger a kernel panic (on Node01) with the command **echo 'b' > /proc/sysrq-trigger** or **echo 'c' > /proc/sysrq-trigger** on the node where the resources are running, the cluster detects the change but is unable to start any resources (except SBD) on the other active node. **Step 2**: In the logs we find the following errors:
pengine:     info: LogActions:       Leave      stonith-sbd           (Started node02)
pengine:   notice: LogAction:      * Start      pri-javaiq            (node02 )   due to unrunnable nfs_filesystem start (blocked)
pengine:   notice: LogAction:      * Start      lb_health_probe       (node02 )   due to unrunnable nfs_filesystem start (blocked)
pengine:   notice: LogAction:      * Start      pri-ip_vip            (node02 )   due to unrunnable nfs_filesystem start (blocked)
pengine:   notice: LogAction:      * Start      nfs_filesystem        (node02 )   blocked
**Step 3**: But when we execute "init 6" on the node on which we created the kernel panic, surprisingly the resources on the other node start and run successfully.
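Resources usually show up as "unrunnable ... (blocked)" like this when the cluster is still waiting for fencing of the failed node to be confirmed; only once the STONITH completes (or the node returns cleanly, which is what the init 6 achieves) will dependent resources such as the NFS filesystem be started on the surviving node. A hedged sketch of where to look (the SBD device path is a placeholder):

    # was node01 actually fenced, and did the operation complete?
    stonith_admin --history node01 --verbose
    crm_mon -1 --show-detail

    # SBD clusters: is the shared device reachable and the watchdog armed?
    sbd -d /dev/disk/by-id/SBD_DEVICE list
    systemctl status sbd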
Ram Too (1 rep)
May 14, 2020, 02:25 PM • Last activity: May 15, 2020, 04:47 PM
0 votes
0 answers
35 views
high availability of file in Linux
I have a very specific scenario. I have a set of config files and I want to maintain them at two different paths, each path holding a copy of the same files. If one of the locations becomes unavailable for some reason, my process should still be able to access the files from the second location. How can I achieve this with something like a symbolic link that repoints the file path based on availability? Any thoughts or ideas are highly appreciated. Thanks.
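One lightweight way is to give the process a single stable path and swing a symbolic link to whichever copy is currently reachable; a small script run from cron or a watchdog can repoint it. A minimal sketch (all paths are placeholders):

    primary=/mnt/siteA/config
    fallback=/mnt/siteB/config
    stable=/etc/myapp/active        # the path the process actually opens

    if [ -r "$primary/app.conf" ]; then
        ln -sfn "$primary" "$stable"
    else
        ln -sfn "$fallback" "$stable"
    fi

ln -sfn replaces the link in place; for a strictly atomic swap, create a temporary link and move it over the old one with mv -T.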
Parthi (1 rep)
Apr 30, 2020, 06:06 AM
1 vote
0 answers
310 views
OpenLDAP Cluster
Trying to implement an OpenLDAP cluster, I have already managed to set up the two backend LDAP servers in mirror mode. The application (iRedMail) using the LDAP service runs on the same systems as the LDAP servers. This application needs the LDAP configuration in the legacy slapd.conf style and not in the CONFIG-DB (cn=config) way, so I added the mirroring parameters to the slapd.conf file. The file looks like this on the first backend node:
include     /etc/openldap/schema/core.schema
include     /etc/openldap/schema/corba.schema
include     /etc/openldap/schema/cosine.schema
include     /etc/openldap/schema/inetorgperson.schema
include     /etc/openldap/schema/nis.schema
include     /etc/openldap/schema/calentry.schema
include     /etc/openldap/schema/calresource.schema
include     /etc/openldap/schema/amavisd-new.schema
include     /etc/openldap/schema/iredmail.schema

pidfile     /var/run/openldap/slapd.pid
argsfile    /var/run/openldap/slapd.args

# The syncprov overlay
moduleload syncprov.la

disallow    bind_anon
require     LDAPv3
loglevel    0

access to attrs="userPassword,mailForwardingAddress,employeeNumber"
    by anonymous    auth
    by self         write
    by dn.exact="cn=vmail,dc=myCompany,dc=de"   read
    by dn.exact="cn=vmailadmin,dc=myCompany,dc=de"  write
    by users        none

access to attrs="cn,sn,gn,givenName,telephoneNumber"
    by anonymous    auth
    by self         write
    by dn.exact="cn=vmail,dc=myCompany,dc=de"   read
    by dn.exact="cn=vmailadmin,dc=myCompany,dc=de"  write
    by users        read

access to attrs="objectclass,domainName,mtaTransport,enabledService,domainSenderBccAddress,domainRecipientBccAddress,domainBackupMX,domainMaxQuotaSize,domainMaxUserNumber,domainPendingAliasName"
    by anonymous    auth
    by self         read
    by dn.exact="cn=vmail,dc=myCompany,dc=de"   read
    by dn.exact="cn=vmailadmin,dc=myCompany,dc=de"  write
    by users        read

access to attrs="domainAdmin,domainGlobalAdmin,domainSenderBccAddress,domainRecipientBccAddress"
    by anonymous    auth
    by self         read
    by dn.exact="cn=vmail,dc=myCompany,dc=de"   read
    by dn.exact="cn=vmailadmin,dc=myCompany,dc=de"  write
    by users        none

access to attrs="mail,accountStatus,domainStatus,userSenderBccAddress,userRecipientBccAddress,mailQuota,backupMailAddress,shadowAddress,memberOfGroup,member,uniqueMember,storageBaseDirectory,homeDirectory,mailMessageStore,mailingListID"
    by anonymous    auth
    by self         read
    by dn.exact="cn=vmail,dc=myCompany,dc=de"   read
    by dn.exact="cn=vmailadmin,dc=myCompany,dc=de"  write
    by users        read

access to dn="cn=vmail,dc=myCompany,dc=de"
    by anonymous                    auth
    by self                         write
    by users                        none

access to dn="cn=vmailadmin,dc=myCompany,dc=de"
    by anonymous                    auth
    by self                         write
    by users                        none

access to dn.regex="domainName=([^,]+),o=domains,dc=myCompany,dc=de$"
    by anonymous                    auth
    by self                         write
    by dn.exact="cn=vmail,dc=myCompany,dc=de"   read
    by dn.exact="cn=vmailadmin,dc=myCompany,dc=de"  write
    by dn.regex="mail=[^,]+@$1,o=domainAdmins,dc=myCompany,dc=de$" write
    by dn.regex="mail=[^,]+@$1,ou=Users,domainName=$1,o=domains,dc=myCompany,dc=de$" read
    by users                        none

access to dn.subtree="o=domains,dc=myCompany,dc=de"
    by anonymous                    auth
    by self                         write
    by dn.exact="cn=vmail,dc=myCompany,dc=de"    read
    by dn.exact="cn=vmailadmin,dc=myCompany,dc=de"  write
    by users                        read

access to dn.subtree="o=domainAdmins,dc=myCompany,dc=de"
    by anonymous                    auth
    by self                         write
    by dn.exact="cn=vmail,dc=myCompany,dc=de"    read
    by dn.exact="cn=vmailadmin,dc=myCompany,dc=de"  write
    by users                        none

access to dn.regex="cn=[^,]+,dc=myCompany,dc=de"
    by anonymous                    auth
    by self                         write
    by users                        none

access to *
    by anonymous                    auth
    by self                         write
    by users                        read

database monitor
access to dn="cn=monitor"
    by dn.exact="cn=Manager,dc=myCompany,dc=de" read
    by dn.exact="cn=vmail,dc=myCompany,dc=de" read
    by * none

database    mdb
suffix      dc=myCompany,dc=de
directory   /var/lib/ldap/myCompany.de
rootdn      cn=Manager,dc=myCompany,dc=de
rootpw      {SSHA}V5/UQXm9SmzRGjKK2zAKB79eFSaysc2wG9tPIg==
sizelimit   unlimited
maxsize     2147483648
checkpoint  128 3
mode        0700

index objectclass,entryCSN,entryUUID                eq
index uidNumber,gidNumber,uid,memberUid,loginShell  eq,pres
index homeDirectory,mailMessageStore                eq,pres
index ou,cn,mail,surname,givenname,telephoneNumber,displayName  eq,pres,sub
index nisMapName,nisMapEntry                        eq,pres,sub
index shadowLastChange                              eq,pres
index member,uniqueMember eq,pres

index domainName,mtaTransport,accountStatus,enabledService,disabledService  eq,pres,sub
index domainAliasName    eq,pres,sub
index domainMaxUserNumber eq,pres
index domainAdmin,domainGlobalAdmin,domainBackupMX    eq,pres,sub
index domainSenderBccAddress,domainRecipientBccAddress  eq,pres,sub

index accessPolicy,hasMember,listAllowedUser,mailingListID   eq,pres,sub

index mailForwardingAddress,shadowAddress   eq,pres,sub
index backupMailAddress,memberOfGroup   eq,pres,sub
index userRecipientBccAddress,userSenderBccAddress  eq,pres,sub
index mobile,departmentNumber eq,pres,sub

#Mirror Mode
serverID    001

# Consumer
syncrepl rid=001 \
provider=ldap://rm2.myCompany.de \
bindmethod=simple \
binddn="cn=vmail,dc=myCompany,dc=de" \
credentials="gtV9FwILIcp8Zw8YtGeB1AC9GbGfti" \
searchbase="dc=myCompany,dc=de" \
attrs="*,+" \
type=refreshAndPersist \
interval=00:00:01:00 \
retry="60 +"
# Provider
overlay syncprov
syncprov-checkpoint 50 1
syncprov-sessionlog 50

mirrormode on
There are only two differences in the second node's config file:
[...]
#Mirror Mode
serverID    002
[...]

# Consumer
[...]
provider=ldap://rm2.myCompany.de \
[...]
As mentioned before, the mirroring works perfectly. Now I need a single connection address for the LDAP clients, i.e. web applications that use LDAP as their authentication mechanism. I read that an OpenLDAP proxy can be used for this purpose: the LDAP client (here: the web application) connects to the LDAP proxy, and the proxy retrieves the authentication data from the backend LDAP servers. I set up an OpenLDAP proxy; it uses CONFIG-DB, not the legacy style. The slapd.conf file looks like this:
include         /etc/openldap/schema/corba.schema
include         /etc/openldap/schema/core.schema
include         /etc/openldap/schema/cosine.schema
include         /etc/openldap/schema/duaconf.schema
include         /etc/openldap/schema/dyngroup.schema
include         /etc/openldap/schema/inetorgperson.schema
include         /etc/openldap/schema/java.schema
include         /etc/openldap/schema/misc.schema
include         /etc/openldap/schema/nis.schema
include         /etc/openldap/schema/openldap.schema
include         /etc/openldap/schema/ppolicy.schema
 
pidfile         /var/run/openldap/slapd.pid
argsfile        /var/run/openldap/slapd.args

modulepath  /usr/lib/openldap
modulepath  /usr/lib64/openldap
moduleload  back_ldap.la       
loglevel	0

database		ldap
readonly		yes            
protocol-version	3
rebind-as-user
uri			"ldap://rm1.myCompany.de:389"
suffix		        "dc=myCompany,dc=de"
uri                     "ldap://rm2.myCompany.de:389"
suffix		        "dc=myCompany,dc=de"
First issue: when I create the CONFIG-DB from this file using slaptest, the command fails, claiming:
5dc44107 /etc/openldap/slapd.conf: line 48: suffix already served by this backend!.
slaptest: bad configuration directory!
The slaptest command looks like this:
slaptest -f /etc/openldap/slapd.conf -F /etc/openldap/slapd.d/
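For what it's worth, slapd-ldap(5) allows several space-separated LDAP URLs in a single uri directive, and back-ldap then fails over to the next URL when the current one is unreachable. If the goal is redundancy between two mirrors serving the same tree (rather than gluing distinct subtrees together), a single proxy database with one suffix may therefore be enough; a minimal sketch, reusing the hostnames from above:
database            ldap
readonly            yes
protocol-version    3
rebind-as-user
suffix              "dc=myCompany,dc=de"
# back-ldap tries the URIs left to right and fails over when one is unreachable
uri                 "ldap://rm1.myCompany.de:389/ ldap://rm2.myCompany.de:389/"
With only one suffix directive, the “suffix already served by this backend” complaint should no longer appear.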
It is possible that I haven't completely understood the concept, because all the guides I found give each backend LDAP server its own suffix prefix, i.e. instead of:
uri			"ldap://rm1.myCompany.de:389"
suffix		        "dc=myCompany,dc=de"
uri                     "ldap://rm2.myCompany.de:389"
suffix		        "dc=myCompany,dc=de"
they use:
uri            "ldap://rm1.myCompany.de:389"
suffix		   "dc=ou1,dc=myCompany,dc=de"
uri            "ldap://rm2.myCompany.de:389"
suffix		   "dc=ou2,dc=myCompany,dc=de"
What I don't understand: on the backend servers there is no ou1 or ou2. How can they expect to find anything in the backend LDAPs if the DNs do not match? I temporarily commented out the second uri in order to check whether, apart from this issue, LDAP queries to the LDAP proxy succeed, but ran into the second issue. Second issue: if I run an ldapsearch directly against the two backend LDAP servers (one after the other), all of the LDAP users are enumerated. If I run the same ldapsearch against the LDAP proxy, only the user "vmail" is enumerated. I would expect the same users to be listed as in the direct query. This is the ldapsearch command:
ldapsearch -D "cn=vmail,dc=myCompany,dc=de" -w gtV9FwILIcp8Zw8YtGeB1AC9GbGfti -p 389 -h 192.168.0.92 -b "dc=myCompany,dc=de" -s sub "(objectclass=person)"
Did I miss something? Thank you for your consideration! Best regards, Florian
arminV (11 rep)
Nov 8, 2019, 10:45 AM
1 vote
1 answers
2580 views
Pacemaker: Primary node is rebooted and comes back as primary instead of standby
We are using pacemaker, corosync to automate failovers. We noticed one behaviour- when primary node is rebooted, the standby node takes over as primary - which is fine. When the node comes back online and services are started on it, it takes back the role of Primary. It should ideally start as stand...
We are using Pacemaker and Corosync to automate failovers. We noticed one behaviour: when the primary node is rebooted, the standby node takes over as primary, which is fine. But when the rebooted node comes back online and its services are started, it takes back the Primary role. It should ideally come back as standby. Are we missing any configuration? > pcs resource defaults O/p: resource-stickiness: INFINITY migration-threshold: 0 Stickiness is set to INFINITY. Please suggest. Config details below: ======================
[root@Node1 heartbeat]# pcs config show –l
Cluster Name: cluster1
Corosync Nodes:
 Node1 Node2
Pacemaker Nodes:
 Node1 Node2

Resources:
 Master: msPostgresql
  Meta Attrs: master-node-max=1 clone-max=2 notify=true master-max=1 clone-node-max=1
  Resource: pgsql (class=ocf provider=heartbeat type=pgsql)
   Attributes: master_ip=10.70.10.1 node_list="Node1 Node2" pgctl=/usr/pgsql-9.6/bin/pg_ctl pgdata=/var/lib/pgsql/9.6/data/ primary_conninfo_opt="keepalives_idle=60 keepalives_interval=5 keepalives_count=5" psql=/usr/pgsql-9.6/bin/psql rep_mode=async restart_on_promote=true restore_command="cp /var/lib/pgsql/9.6/data/archivedir/%f %p"
   Meta Attrs: failure-timeout=60
   Operations: demote interval=0s on-fail=stop timeout=60s (pgsql-demote-interval-0s)
               methods interval=0s timeout=5s (pgsql-methods-interval-0s)
               monitor interval=4s on-fail=restart timeout=60s (pgsql-monitor-interval-4s)
               monitor interval=3s on-fail=restart role=Master timeout=60s (pgsql-monitor-interval-3s)
               notify interval=0s timeout=60s (pgsql-notify-interval-0s)
               promote interval=0s on-fail=restart timeout=60s (pgsql-promote-interval-0s)
               start interval=0s on-fail=restart timeout=60s (pgsql-start-interval-0s)
               stop interval=0s on-fail=block timeout=60s (pgsql-stop-interval-0s)
 Group: master-group
  Resource: vip-master (class=ocf provider=heartbeat type=IPaddr2)
   Attributes: cidr_netmask=24 ip=10.70.10.2
   Operations: monitor interval=10s on-fail=restart timeout=60s (vip-master-monitor-interval-10s)
               start interval=0s on-fail=restart timeout=60s (vip-master-start-interval-0s)
               stop interval=0s on-fail=block timeout=60s (vip-master-stop-interval-0s)
  Resource: vip-rep (class=ocf provider=heartbeat type=IPaddr2)
   Attributes: cidr_netmask=24 ip=10.70.10.1
   Meta Attrs: migration-threshold=0
   Operations: monitor interval=10s on-fail=restart timeout=60s (vip-rep-monitor-interval-10s)
               start interval=0s on-fail=stop timeout=60s (vip-rep-start-interval-0s)
               stop interval=0s on-fail=ignore timeout=60s (vip-rep-stop-interval-0s)

Stonith Devices:
Fencing Levels:

Location Constraints:
Ordering Constraints:
  promote msPostgresql then start master-group (score:INFINITY) (non-symmetrical)
  demote msPostgresql then stop master-group (score:0) (non-symmetrical)
Colocation Constraints:
  master-group with msPostgresql (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master)
Ticket Constraints:

Alerts:
 No alerts defined

Resources Defaults:
 resource-stickiness: INFINITY
 migration-threshold: 0
Operations Defaults:
 No defaults set

Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: cluster1
 cluster-recheck-interval: 60
 dc-version: 1.1.19-8.el7-c3c624ea3d
 have-watchdog: false
 no-quorum-policy: ignore
 start-failure-is-fatal: false
 stonith-enabled: false
Node Attributes:
 Node1: pgsql-data-status=STREAMING|ASYNC
 Node2: pgsql-data-status=LATEST

Quorum:
  Options:
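As a hedged diagnostic aside: resource-stickiness adds weight to the node where a resource is currently active, but promotion of a master/slave resource is also driven by the per-node master preference scores that the pgsql agent derives from replication state, so those scores are worth inspecting. A quick way to dump them, assuming the standard Pacemaker CLI:
# Replay the scheduler against the live CIB and print allocation and promotion scores;
# the per-node pgsql master scores show why the returning node is promoted.
crm_simulate --live-check --show-scores

# One-shot status including node attributes such as pgsql-data-status.
crm_mon -A1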

Thanks !
User2019 (11 rep)
Sep 12, 2019, 09:30 AM • Last activity: Sep 16, 2019, 06:18 PM
1 vote
1 answers
1268 views
how to unexport an NFS share on a VCS HA cluster
**see imp update at bottom of orig. question. not sure how to unexport only the 'world' mountable share? I have a NFS server which had a share with world-mountable permissions. To make it mountable only by the clients on a subnet i added the share to /etc/exports, which was empty before. I am not su...
**See important update at the bottom of the original question — I am not sure how to unexport only the 'world'-mountable share.** I have an NFS server which had a share with world-mountable permissions. To make it mountable only by clients on one subnet, I added the share to /etc/exports, which was empty before. I am not sure how the folder was shared before. I put the entry in /etc/exports and exported again, but the world-mountable share is still shown as available. Before: [root@nfsServer ~]# exportfs -v /export/home <world>(rw,wdelay,no_root_squash,no_subtree_check) # ls -l /var/lib/nfs/xtab -rw-r--r-- 1 root root 0 Dec 15 2009 /var/lib/nfs/xtab # ls -l /proc/fs/nfs -r--r--r-- 1 root root 0 May 2 00:41 exports Change: added the following line to /etc/exports (which was empty before) /export/home 192.168.253.0/24(rw,wdelay,no_root_squash,no_subtree_check) then re-exported the folders: # exportfs -ra After: [root@nfsServer ~]# exportfs -v /export/home 192.168.253.0/24(rw,wdelay,no_root_squash,no_subtree_check) /export/home <world>(rw,wdelay,no_root_squash,no_subtree_check) # cat /etc/exports /export/home 192.168.253.0/24(rw,wdelay,no_root_squash,no_subtree_check) # ls -l /var/lib/nfs/xtab -rw-r--r-- 1 root root 0 Dec 15 2009 /var/lib/nfs/xtab # ls -l /proc/fs/nfs -r--r--r-- 1 root root 0 May 2 00:41 exports [root@nfsServer ~]# ls -ltr /proc/fs/nfsd total 0 -rw------- 1 root root 0 Mar 1 2017 versions -rw------- 1 root root 0 Mar 1 2017 threads -rw------- 1 root root 0 Mar 1 2017 portlist -rw------- 1 root root 0 Mar 1 2017 nfsv4recoverydir -rw------- 1 root root 0 Mar 1 2017 nfsv4leasetime -rw------- 1 root root 0 Mar 1 2017 filehandle -r--r--r-- 1 root root 0 Mar 1 2017 exports [root@nfsServer ~]# cd /proc/fs/nfsd [root@nfsServer nfsd]# cat exports # Version 1.1 # Path Client(Flags) # IPs /export/home *,192.168.253.0/24(rw,no_root_squash,sync,wdelay,no_subtree_check) # cat versions +2 +3 -4 Note that it has * added in front of the /etc/exports entry. I want to know where the "*" entry is coming from and how to get rid of it. All help is appreciated. System: Red Hat Enterprise Linux Server release 5.5 (Tikanga) 2.6.18-194.el5 #1 SMP Tue Mar 16 21:52:39 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux *IMPORTANT: sorry, I forgot to clarify that this is NFS running under VCS HA on Red Hat 5.5, so when I restart nfs I get an error: # service nfs stop Shutting down NFS mountd: [ OK ] Shutting down NFS daemon: [ OK ] Shutting down NFS quotas: [ OK ] Shutting down NFS services: [ OK ] # service nfs start Starting NFS services: [ OK ] Starting NFS quotas: [ OK ] Starting NFS daemon: [FAILED] # service nfs start Starting NFS services: [ OK ] Starting NFS quotas: [ OK ] Starting NFS daemon: [FAILED] but when you check... # service nfs status rpc.mountd (pid 24103) is running... nfsd (pid 24052 24051 24050 24049 24048 24047 24046 24045) is running... rpc.rquotad (pid 22872 20490 19133) is running... I figured out that this block in the VCS main.cf sets up the NFS share, but I am not sure how to add a subnet restriction to it: Share share_home ( Options = "rw, no_root_squash" PathName = "/export/home" ) Thanks. Raj
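A sketch of how the stray wildcard entry can be dropped at runtime with exportfs, assuming the world-open export now exists only in the kernel export table; note that the VCS Share resource in main.cf would still need its attributes adjusted (check whether your Share agent version supports a client/host specification), or the agent may re-export it world-open the next time the resource is brought online:
# remove only the world-open export; the subnet-restricted entry from /etc/exports stays
exportfs -v                       # confirm both entries are currently active
exportfs -u '*:/export/home'      # unexport the wildcard (world-mountable) entry
exportfs -v                       # verify that only the 192.168.253.0/24(...) entry remains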
Rajeev (256 rep)
May 2, 2018, 12:55 AM • Last activity: May 2, 2018, 05:04 PM
2 votes
0 answers
434 views
Setting up a kerberized HA NFS share
I'm trying to set up a kerberized NFS share from an HA cluster. I've successfully set up a krb-aware NFS share from a single server, I'm using a mostly identical configuration on the cluster. Exports file from working single server: /nfs *(rw,sec=krb5:krb5i:krb5p) Cluster resource configuration: # p...
I'm trying to set up a kerberized NFS share from an HA cluster. I've successfully set up a krb-aware NFS share from a single server, and I'm using a mostly identical configuration on the cluster. Exports file from working single server: /nfs *(rw,sec=krb5:krb5i:krb5p) Cluster resource configuration: # pcs resource show nfs-export1 Resource: nfs-export1 (class=ocf provider=heartbeat type=exportfs) Attributes: clientspec=10.1.0.0/255.255.255.0 directory=/nfsshare/exports/export1 fsid=1 options=rw,sec=krb5:krb5i:krb5p,sync,no_root_squash Operations: monitor interval=10 timeout=20 (nfs-export1-monitor-interval-10) start interval=0s timeout=40 (nfs-export1-start-interval-0s) stop interval=0s timeout=120 (nfs-export1-stop-interval-0s) Client showmount to working single server: # showmount -e ceserv Export list for ceserv: /nfs * Client showmount to floating cluster name: # showmount -e hafloat Export list for hafloat: /nfsshare/exports/export1 10.1.0.0/255.255.255.0 /nfsshare/exports 10.1.0.0/255.255.255.0 Contents of client /etc/fstab: ceserv:/nfs /mnt/nfs nfs4 sec=krb5i,rw,proto=tcp,port=2049 hafloat.ncphotography.lan:export1 /nfsmount nfs4 sec=krb5i,rw,proto=tcp,port=2049 Results of mount -av command: # mount -av mount.nfs4: timeout set for Mon Dec 4 20:57:14 2017 mount.nfs4: trying text-based options 'sec=krb5i,proto=tcp,port=2049,vers=4.1,addr=10.1.0.24,clientaddr=10.1.0.23' /mnt/nfs : successfully mounted mount.nfs4: timeout set for Mon Dec 4 20:57:14 2017 mount.nfs4: trying text-based options 'sec=krb5i,proto=tcp,port=2049,vers=4.1,addr=10.1.0.29,clientaddr=10.1.0.23' mount.nfs4: mount(2): Operation not permitted mount.nfs4: Operation not permitted All firewalls have been disabled. All names resolve correctly to IP addresses within the 10.1.0.0/24 network, and all IP addresses reverse-resolve to the correct hostname.
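One frequent cause of “mount.nfs4: Operation not permitted” in this kind of setup, offered only as a hedged guess, is that the server keytab on the cluster nodes holds nfs/ principals for the individual node names but none for the floating name the client actually mounts (hafloat.ncphotography.lan), so the GSS context cannot be established. A quick check on whichever node currently holds the floating IP:
# list the NFS service principals present in the default keytab;
# a principal matching the name the client mounts should be present
klist -k /etc/krb5.keytab | grep -i 'nfs/'
If only the per-node principals show up, adding an nfs/ principal for hafloat.ncphotography.lan to both nodes' keytabs (and making sure forward and reverse DNS for the floating address agree with that name) is worth trying.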
John (17381 rep)
Dec 4, 2017, 09:05 PM • Last activity: Mar 6, 2018, 03:25 PM
Showing page 1 of 20 total questions