Unix & Linux Stack Exchange
Q&A for users of Linux, FreeBSD and other Unix-like operating systems
Latest Questions
0
votes
2
answers
2793
views
How do I mount a disk on the /var/log directory even if processes are writing to it?
I would like to mount a disk on /var/log. The thing is, there are some processes/services writing to it, such as openvpn or the system logs. Is there a way to mount the filesystem without having to restart the machine or stop the services?
Many thanks
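For reference, a hedged first step before attempting such a mount is to see which processes currently hold files open under /var/log (this assumes lsof and/or fuser are installed; neither command is from the original question):
lsof +D /var/log        # list processes with open files anywhere under /var/log
fuser -vm /var/log      # processes using the filesystem that /var/log lives on
Note that mounting over /var/log only hides the old directory: processes that already hold files open keep writing to the now-hidden files until they are restarted or told to reopen their logs.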
LinuxEnthusiast
(1 rep)
Aug 10, 2020, 10:10 AM
• Last activity: Aug 1, 2025, 11:02 PM
0
votes
1
answers
2844
views
Apache resource failed to start in Pacemaker
I am using Pacemaker with Corosync to set up a basic Apache HA cluster with 3 nodes running CentOS 7. For some reason, I cannot get the Apache resource to start in pcs.
Cluster IP: 192.168.200.40
# pcs resource show ClusterIP
Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2)
Attributes: cidr_netmask=24 ip=192.168.200.40
Operations: monitor interval=20s (ClusterIP-monitor-interval-20s)
start interval=0s timeout=20s (ClusterIP-start-interval-0s)
stop interval=0s timeout=20s (ClusterIP-stop-interval-0s)
# pcs resource show WebServer
Resource: WebServer (class=ocf provider=heartbeat type=apache)
Attributes: configfile=/etc/httpd/conf/httpd.conf statusurl=http://localhost/server-status
Operations: monitor interval=1min (WebServer-monitor-interval-1min)
start interval=0s timeout=40s (WebServer-start-interval-0s)
stop interval=0s timeout=60s (WebServer-stop-interval-0s)
# pcs status
Cluster name:
WARNING: corosync and pacemaker node names do not match (IPs used in setup?)
Stack: corosync
Current DC: server3.example.com (version 1.1.18-11.el7_5.2-2b07d5c5a9) - partition with quorum
Last updated: Thu Jun 7 21:59:09 2018
Last change: Thu Jun 7 21:45:23 2018 by root via cibadmin on server1.example.com
3 nodes configured
2 resources configured
Online: [ server1.example.com server2.example.com server3.example.com ]
Full list of resources:
ClusterIP (ocf::heartbeat:IPaddr2): Started server2.example.com
WebServer (ocf::heartbeat:apache): Stopped
Failed Actions:
* WebServer_start_0 on server3.example.com 'unknown error' (1): call=49, status=Timed Out, exitreason='',
last-rc-change='Thu Jun 7 21:46:03 2018', queued=0ms, exec=40002ms
* WebServer_start_0 on server1.example.com 'unknown error' (1): call=53, status=Timed Out, exitreason='',
last-rc-change='Thu Jun 7 21:45:23 2018', queued=0ms, exec=40003ms
* WebServer_start_0 on server2.example.com 'unknown error' (1): call=47, status=Timed Out, exitreason='',
last-rc-change='Thu Jun 7 21:46:43 2018', queued=1ms, exec=40002ms
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
The httpd instance is **enabled** and **running** on all three nodes. The cluster IP and the individual node IPs can all reach the web page. The ClusterIP resource also works well for failover. What could be going wrong with the apache resource in this case?
Thank you very much!
Update:
Here is more information from the debug output. It seems Apache is unable to bind to the port, but there is no error in the Apache log, and
systemctl status httpd
shows all green on all nodes. I can open web pages via the cluster IP and each node's IP. The ClusterIP resource failover works fine, too. Any idea why the Apache resource doesn't work with pacemaker?
# pcs resource debug-start WebServer --full
Operation start for WebServer (ocf:heartbeat:apache) failed: 'Timed Out' (2)
> stderr: ERROR: (98)Address already in use: AH00072: make_sock: could not bind to address [::]:80 (98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:80 no listening sockets available, shutting down AH00015: Unable to open logs
> stderr: INFO: apache not running
> stderr: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up
> stderr: INFO: apache not running
> stderr: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up
> stderr: INFO: apache not running
> stderr: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up
> stderr: INFO: apache not running
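A hedged way to pin down the "Address already in use" failure is to check, on each node, what already owns port 80 and whether systemd starts httpd outside the cluster (the resource agent has to start httpd itself, so an instance started by systemd would conflict):
ss -tlnp 'sport = :80'       # which process is already bound to port 80
systemctl is-enabled httpd   # "enabled" means systemd will also start it at boot
systemctl is-active httpd    # "active" means an instance is running outside pacemaker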
cody
(67 rep)
Jun 8, 2018, 04:16 PM
• Last activity: Jul 15, 2025, 02:03 AM
1
votes
1
answers
2618
views
Pacemaker Virtual IP cannot be routed outside of its network
I have a server cluster consisting of the following setup:
2 virtual servers with 2 NICs each: eth0 (private network 10.0.0.0/16) and eth1 (public network 77.1.2.0/24 with 77.1.2.1 as gateway)
For the HA-01 VPS I have the private IP on eth0 set to 10.0.0.1
For the HA-02 VPS I have the private IP on eth0 set to 10.0.0.2
A Pacemaker/Corosync cluster has been established between the private IP addresses, and the virtual IP (77.1.2.4) is defined as a clone resource (IPaddr2) so it can float between the two nodes.
pcs resource create VirtualIP1 ocf:heartbeat:IPaddr2 ip="77.1.2.4" cidr_netmask="24" nic="eth1" clusterip_hash="sourceip-sourceport" op start interval="0s" timeout="60s" op monitor interval="1s" timeout="20s" op stop interval="0s" timeout="60s" clone interleave=true ordered=true
The problem is, I cannot reach that IP address from the outside world. I noticed that a route was missing, so I added the static route
ip r add default via 77.1.2.1 dev eth1
But I still cannot ping google.com from those servers, nor can the outside world reach them on that IP.
I also tried adding IP addresses from the same subnet on eth1, like this:
HA-01 eth1: 77.1.2.2
HA-02 eth1: 77.1.2.3
The servers can be reached on those IPs from the outside world, but if I add the VirtualIP resource I cannot reach them on the virtual IP address.
I also tried adding a source IP to the routing table
ip r add default via 77.1.2.1 src 77.1.2.4
to no avail. I don't know what I am supposed to do to get this VirtualIP working.
I can reach 77.1.2.4 (Virtual IP Address) from other servers on that network, but not outside that network.
The firewall is up, and the high-availability services are allowed via
firewall-cmd --add-service="high availability"; firewall-cmd --add-service="high availability" --permanent
Is there anything here that I am missing?
If I add that address (77.1.2.4, the virtual IP) directly on the interface of only one of those servers, it works... So is there perhaps an issue with the ARP table, or is the router blocking some traffic?
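A hedged set of checks on the node currently holding the VIP, to see whether routing and ARP behave as expected (these commands are illustrative, not from the original post, and need root):
ip route get 8.8.8.8                  # confirm which route and source address the kernel picks
tcpdump -eni eth1 arp                 # watch whether ARP requests for 77.1.2.4 arrive and get answered
arping -c 3 -U -I eth1 77.1.2.4       # send gratuitous ARP for the VIP, as IPaddr2 normally does on start
If the router's ARP cache still maps 77.1.2.4 to an old MAC address, only the gratuitous ARP (or the cache entry timing out) will fix reachability from outside.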
Marko Todoric
(437 rep)
Jul 19, 2019, 02:54 PM
• Last activity: Apr 15, 2025, 03:08 AM
0
votes
2
answers
2897
views
RHEL High-Availability Cluster using pcs, configuring service as a resource
I have a 2-node cluster on RHEL 6.9. Everything is configured, except that I'm having difficulty with an application that is launched via a shell script turned into a service (in
/etc/init.d/myApplication
), which I'll just call "myApp". For that application, I did a pcs resource create myApp lsb:myApp op monitor interval=30s op start on-fail=standby
. I am new to this suite of software, but it's for work. What I need is for this application to be launched on both nodes simultaneously, because it has to be started manually; if the first node failed, intervention would be needed unless the app were already active on the passive node.
I have two other services:
-VirtIP (ocf:heartbeat:IPaddr2)
for providing a service IP for the application server
-Cron (lsb:crond)
to synchronize the application files (we are not using shared storage)
I have VirtIP and Cron colocated with myApp as dependents.
I've tried master/slave as well as cloning, but I must be missing something in their configuration. If I take the application offline, pacemaker does not detect that the service has gone down, and pcs status
reports that myApp is still running on the node (or nodes, depending on my config). I'm also sometimes seeing the service that runs the app get stopped by pacemaker on the passive node.
What is the right way to configure this? I've gone through the RHEL documentation but I'm still stuck. How do I get pacemaker to initiate failover if the myApp service goes down? I don't know why it's not detecting that the service has stopped in some cases.
EDIT: For testing purposes, I removed the password requirement for starting/restarting; the service starts/restarts fine as expected, and the colocation-dependent resources stop/start as expected. But stopping the myApp service is not reflected as a stopped resource; it simply stays at "Started node1". Likewise, simulating a failover by putting node1 into standby simply stops all resources on node1.
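One thing worth checking, since monitoring of lsb: resources depends entirely on the init script's exit codes: Pacemaker expects the "status" action to return 0 while the application runs and 3 once it has stopped. A hedged sanity check (the script path follows the resource name above):
/etc/init.d/myApp status; echo "exit code while running: $?"    # should be 0
/etc/init.d/myApp stop
/etc/init.d/myApp status; echo "exit code after stopping: $?"   # should be 3, not 0 or 1
If the status action returns 0 even when the application is down, the cluster will never notice the failure.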
Greg
(187 rep)
Sep 29, 2017, 07:52 AM
• Last activity: Sep 6, 2023, 09:56 PM
-1
votes
1
answers
717
views
IBM AIX - Method to identify Cluster or HA services
I am keen to learn whether existing IBM AIX servers at different locations have clustering/HA features configured. Kindly let me know the steps to check. Thanks.
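A hedged sketch of what one might run on each AIX server, assuming the clustering product in question is PowerHA SystemMirror (formerly HACMP); these commands only report, they change nothing:
lslpp -l "cluster.es*"                          # are the PowerHA filesets installed?
lssrc -ls clstrmgrES                            # state of the cluster manager subsystem, if present
/usr/es/sbin/cluster/utilities/cltopinfo        # cluster topology, if PowerHA is configured
lscluster -c                                    # CAA cluster configuration on AIX 7.x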
Nick eric adelee
(49 rep)
Dec 5, 2022, 09:01 AM
• Last activity: Dec 5, 2022, 06:36 PM
0
votes
0
answers
137
views
Options for high-availability, high-throughput bonding in Linux
When trying to configure high-availability (HA) bonding in Linux that should also use the available bandwidth, I wonder what the options are:
The solution should ensure HA and optimal throughput (when all links are up) in a (simplified) scenario like this:
[diagram: Example Scenario for HA-Bonding]
So for example host **H1** has two interfaces **1** and **2**, also denoted as **H1.1** and **H1.2**.
Starting with a standard configuration like active-backup with miimon link monitoring, there are these problems:
- Only one interface is used at a time
- If **S1.3** fails, both **H1.1** and **H1.2** will see a valid link, but **H1.1** could not reach **H2** then
So the first step was to use arp_ip_target for link monitoring to detect a possible inter-switch link (ISL) failure.
But the problem remains that only one of the two host interfaces can be used at a time.
So I tried to use balance-tlb instead of active-backup.
However, it seems balance-tlb does not allow using arp_ip_target for link monitoring.
So I wonder:
Is there a solution that provides both high availability in case of any link failure *and* high bandwidth?
Final note:
===
Conceptually **S1** and **S3** would be connected, too (just as **S2** and **S4**), but for illustration the example is simple enough, I hope.
Also I can configure the hosts, but not the switches.
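For reference, a minimal sketch of the intermediate step described above (active-backup with ARP link monitoring) using iproute2; the interface names and the ARP target address are placeholders, not taken from the question:
modprobe bonding
ip link add bond0 type bond mode active-backup arp_interval 1000 arp_ip_target 10.0.0.254
ip link set eth0 down && ip link set eth0 master bond0
ip link set eth1 down && ip link set eth1 master bond0
ip link set bond0 up
The ARP target should be an address that is only reachable while the full path through the switches works, which is what lets this catch an ISL failure that miimon cannot see.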
U. Windl
(1715 rep)
Nov 13, 2020, 08:04 AM
• Last activity: Sep 8, 2022, 09:57 AM
0
votes
0
answers
1408
views
High CPU utilization showing in system CPU time
We are running a two-node cluster using Red Hat Pacemaker on RHEL 7. Last Thursday (3/2/2022) I updated the kernel to the latest version. On Friday at 3:49 the first node rebooted (reason unknown) and then rejoined, but at that time the resources were running on node 2.
Today I noticed that there is high CPU utilization, and the top command shows
%Cpu(s): 2.9 us, 89.8 sy, 0.2 ni, 7.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
I don't know what process is using 89.8% of the CPU.
top command
===========
top - 07:10:55 up 4 days, 14:17, 2 users, load average: 8.08, 8.13, 7.98
Tasks: 483 total, 8 running, 475 sleeping, 0 stopped, 0 zombie
%Cpu(s): 2.7 us, 89.7 sy, 0.2 ni, 7.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 39464316+total, 1881036 free, 21074576+used, 18201638+buff/cache
KiB Swap: 93749248 total, 93749248 free, 0 used. 18109798+avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
183327 oracle 20 0 195.5g 56260 42096 R 99.3 0.0 1133:15 oracle_183327_s
183552 oracle 20 0 195.5g 58704 42032 R 99.3 0.0 5626:11 oracle_183552_s
183443 oracle 20 0 195.5g 54488 40728 R 98.8 0.0 1626:54 oracle_183443_s
183554 oracle 20 0 195.5g 57076 41912 R 98.6 0.0 5304:10 oracle_183554_s
183354 oracle 20 0 195.5g 47248 39176 R 97.8 0.0 4847:28 oracle_183354_s
183556 oracle 20 0 195.5g 60040 43456 R 97.8 0.0 2486:30 oracle_183556_s
104734 oracle 20 0 195.5g 48516 39564 R 97.1 0.0 1583:06 oracle_104734_s
142910 root 20 0 162524 2704 1588 R 27.5 0.0 1:43.11 top
4612 root 39 19 13172 9268 480 S 3.8 0.0 255:12.73 apps.plugin
3918 netdata 39 19 251412 137752 2760 S 1.0 0.0 92:52.38 netdata
175736 root 20 0 755216 74556 13932 S 1.0 0.0 64:27.23 guard_stap
183545 oracle 20 0 195.5g 61516 42944 S 1.0 0.0 50:46.33 oracle_183545_s
165271 oracle -2 0 195.4g 18936 15872 S 0.7 0.0 44:31.74 ora_vktm_ssys
183352 oracle 20 0 195.5g 45884 38572 S 0.7 0.0 35:28.20 oracle_183352_s
183550 oracle 20 0 195.5g 52640 42520 S 0.7 0.0 47:01.94 oracle_183550_s
189069 oracle 20 0 195.5g 58344 41844 S 0.7 0.0 38:45.02 oracle_189069_s
3695 root 20 0 916256 131244 18368 S 0.5 0.0 42:22.64 ds_agent
3721 root rt 0 196440 98180 70968 S 0.5 0.0 42:39.31 corosync
69846 oracle 20 0 195.5g 49116 39316 S 0.5 0.0 10:22.26 oracle_69846_ss
183350 oracle 20 0 195.5g 45672 38332 S 0.5 0.0 36:46.71 oracle_183350_s
183356 oracle 20 0 195.5g 45992 38452 S 0.5 0.0 36:24.67 oracle_183356_s
183787 oracle 20 0 195.5g 45428 37976 S 0.5 0.0 2:10.28 oracle_183787_s
198328 oracle 20 0 195.5g 52616 42012 S 0.5 0.0 38:30.80 oracle_198328_s
1471 root 20 0 0 0 0 S 0.2 0.0 0:14.07 MpxPeriodicCall
3822 root 20 0 138468 9392 5696 S 0.2 0.0 4:15.32 stonithd
3962 swiagent 20 0 2342756 14444 6624 S 0.2 0.0 5:07.94 swiagent
4607 netdata 39 19 161488 21948 4312 S 0.2 0.0 16:25.79 python
22089 oracle 20 0 195.4g 26088 21472 S 0.2 0.0 0:01.55 ora_m006_ssys
114147 oracle 20 0 195.4g 41528 34804 S 0.2 0.0 0:00.43 oracle_114147_s
117437 oracle 20 0 195.5g 45332 38108 S 0.2 0.0 5:33.05 oracle_117437_s
135186 root 20 0 3706316 163948 31820 S 0.2 0.0 18:33.37 ds_am
148697 netdata 39 19 1648 1008 616 S 0.2 0.0 0:00.20 bash
152754 root 20 0 477760 4984 3960 S 0.2 0.0 0:00.01 SolarWinds.ADM.
165327 oracle 20 0 195.5g 81356 51292 S 0.2 0.0 1:49.24 ora_mmon_ssys
183783 oracle 20 0 195.5g 44960 37616 S 0.2 0.0 2:12.17 oracle_183783_s
1 root 20 0 191832 4924 2660 S 0.0 0.0 44:03.90 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.13 kthreadd
4 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
6 root 20 0 0 0 0 S 0.0 0.0 0:26.87 ksoftirqd/0
7 root rt 0 0 0 0 S 0.0 0.0 0:03.91 migration/0
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
9 root 20 0 0 0 0 S 0.0 0.0 10:48.66 rcu_sched
10 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 lru-add-drain
11 root rt 0 0 0 0 S 0.0 0.0 0:00.62 watchdog/0
This CPU utilization is increasing since Friday 9AM and was gradually increasing
SAR command (Friday)
====================
sudo sar -u ALL -f /var/log/sa/sa04
Linux 3.10.0-1160.53.1.el7.x86_64 (prod-db2-node2) 02/04/2022 _x86_64_ (8 CPU)
12:00:01 AM CPU %usr %nice %sys %iowait %steal %irq %soft %guest %gnice %idle
12:10:01 AM all 3.54 0.31 2.99 0.04 0.00 0.00 0.02 0.00 0.00 93.10
12:20:01 AM all 3.56 0.31 3.00 0.03 0.00 0.00 0.02 0.00 0.00 93.08
12:30:01 AM all 3.55 0.31 3.04 0.03 0.00 0.00 0.02 0.00 0.00 93.04
12:40:01 AM all 3.62 0.31 3.06 0.03 0.00 0.00 0.02 0.00 0.00 92.96
12:50:01 AM all 3.53 0.31 3.34 0.04 0.00 0.00 0.02 0.00 0.00 92.76
01:00:01 AM all 3.74 0.31 3.08 0.04 0.00 0.00 0.02 0.00 0.00 92.81
01:10:01 AM all 3.88 0.31 3.07 0.08 0.00 0.00 0.03 0.00 0.00 92.64
01:20:01 AM all 3.54 0.31 3.02 0.03 0.00 0.00 0.02 0.00 0.00 93.08
01:30:01 AM all 3.56 0.31 3.04 0.03 0.00 0.00 0.03 0.00 0.00 93.03
01:40:01 AM all 3.55 0.30 3.03 0.03 0.00 0.00 0.02 0.00 0.00 93.07
01:50:01 AM all 3.61 0.31 3.04 0.03 0.00 0.00 0.02 0.00 0.00 92.99
02:00:01 AM all 3.55 0.31 3.05 0.03 0.00 0.00 0.02 0.00 0.00 93.03
02:10:01 AM all 3.60 0.31 3.04 0.04 0.00 0.00 0.02 0.00 0.00 92.99
02:20:01 AM all 3.52 0.31 3.01 0.03 0.00 0.00 0.02 0.00 0.00 93.10
02:30:01 AM all 3.75 0.31 3.05 0.03 0.00 0.00 0.02 0.00 0.00 92.83
02:40:01 AM all 3.52 0.31 3.01 0.03 0.00 0.00 0.02 0.00 0.00 93.11
02:50:01 AM all 3.57 0.31 3.02 0.03 0.00 0.00 0.02 0.00 0.00 93.04
03:00:01 AM all 3.55 0.30 3.02 0.03 0.00 0.00 0.02 0.00 0.00 93.08
03:10:01 AM all 3.59 0.31 3.03 0.04 0.00 0.00 0.02 0.00 0.00 93.00
03:20:01 AM all 3.58 0.31 3.04 0.04 0.00 0.00 0.02 0.00 0.00 93.02
03:30:01 AM all 3.51 0.31 2.99 0.03 0.00 0.00 0.02 0.00 0.00 93.13
03:40:01 AM all 3.57 0.31 3.02 0.03 0.00 0.00 0.02 0.00 0.00 93.05
03:50:01 AM all 3.55 0.34 3.10 0.20 0.00 0.00 0.02 0.00 0.00 92.79
04:00:01 AM all 3.71 0.31 3.04 0.03 0.00 0.00 0.02 0.00 0.00 92.89
04:10:01 AM all 3.54 0.31 3.01 0.03 0.00 0.00 0.02 0.00 0.00 93.08
04:20:01 AM all 3.53 0.31 3.02 0.03 0.00 0.00 0.02 0.00 0.00 93.08
04:30:01 AM all 3.51 0.31 3.01 0.03 0.00 0.00 0.02 0.00 0.00 93.12
04:40:01 AM all 3.57 0.31 3.03 0.03 0.00 0.00 0.03 0.00 0.00 93.03
04:50:01 AM all 3.45 0.31 3.19 0.03 0.00 0.00 0.03 0.00 0.00 93.00
05:00:01 AM all 3.57 0.31 3.05 0.03 0.00 0.00 0.02 0.00 0.00 93.01
05:10:02 AM all 3.56 0.31 3.07 0.03 0.00 0.00 0.02 0.00 0.00 93.00
05:20:01 AM all 3.54 0.31 3.09 0.03 0.00 0.00 0.02 0.00 0.00 93.01
05:30:01 AM all 3.72 0.31 3.08 0.03 0.00 0.00 0.02 0.00 0.00 92.83
05:40:01 AM all 3.54 0.31 3.05 0.03 0.00 0.00 0.02 0.00 0.00 93.05
05:50:01 AM all 3.53 0.31 3.03 0.03 0.00 0.00 0.02 0.00 0.00 93.08
06:00:01 AM all 3.53 0.31 3.03 0.03 0.00 0.00 0.03 0.00 0.00 93.08
06:10:01 AM all 3.61 0.31 3.06 0.03 0.00 0.00 0.02 0.00 0.00 92.97
06:20:01 AM all 3.50 0.31 3.01 0.03 0.00 0.00 0.02 0.00 0.00 93.13
06:20:01 AM CPU %usr %nice %sys %iowait %steal %irq %soft %guest %gnice %idle
06:30:01 AM all 3.58 0.31 3.09 0.03 0.00 0.00 0.03 0.00 0.00 92.97
06:40:01 AM all 3.56 0.31 3.06 0.03 0.00 0.00 0.03 0.00 0.00 93.01
06:50:01 AM all 3.56 0.31 3.07 0.03 0.00 0.00 0.03 0.00 0.00 93.00
07:00:02 AM all 3.70 0.31 3.09 0.03 0.00 0.00 0.03 0.00 0.00 92.85
07:10:01 AM all 3.61 0.31 3.09 0.03 0.00 0.00 0.03 0.00 0.00 92.93
07:20:02 AM all 3.50 0.31 3.07 0.03 0.00 0.00 0.02 0.00 0.00 93.07
07:30:01 AM all 3.59 0.30 3.08 0.03 0.00 0.00 0.03 0.00 0.00 92.97
07:40:01 AM all 3.58 0.31 3.09 0.03 0.00 0.00 0.03 0.00 0.00 92.96
07:50:01 AM all 3.54 0.31 3.06 0.03 0.00 0.00 0.02 0.00 0.00 93.04
08:00:01 AM all 3.55 0.31 3.26 0.03 0.00 0.00 0.02 0.00 0.00 92.82
08:10:01 AM all 3.57 0.31 3.07 0.03 0.00 0.00 0.02 0.00 0.00 93.00
08:20:01 AM all 3.55 0.31 3.08 0.03 0.00 0.00 0.02 0.00 0.00 93.01
08:30:01 AM all 3.69 0.31 3.11 0.03 0.00 0.00 0.02 0.00 0.00 92.84
08:40:01 AM all 3.62 0.31 3.11 0.03 0.00 0.00 0.02 0.00 0.00 92.91
08:50:01 AM all 3.52 0.31 3.06 0.03 0.00 0.00 0.02 0.00 0.00 93.06
09:00:01 AM all 3.28 0.29 15.20 0.03 0.00 0.00 0.04 0.00 0.00 81.16
09:10:01 AM all 3.28 0.29 15.30 0.03 0.00 0.00 0.04 0.00 0.00 81.07
09:20:01 AM all 3.26 0.29 15.33 0.03 0.00 0.00 0.04 0.00 0.00 81.06
09:30:01 AM all 3.23 0.29 15.30 0.03 0.00 0.00 0.04 0.00 0.00 81.12
09:40:01 AM all 3.30 0.28 15.32 0.03 0.00 0.00 0.04 0.00 0.00 81.03
09:50:01 AM all 3.26 0.28 15.29 0.03 0.00 0.00 0.04 0.00 0.00 81.10
10:00:01 AM all 3.38 0.28 15.37 0.03 0.00 0.00 0.04 0.00 0.00 80.90
10:10:01 AM all 3.31 0.28 15.33 0.04 0.00 0.00 0.04 0.00 0.00 81.01
10:20:01 AM all 3.23 0.29 15.33 0.03 0.00 0.00 0.04 0.00 0.00 81.09
10:30:01 AM all 3.28 0.28 15.33 0.03 0.00 0.00 0.04 0.00 0.00 81.04
10:40:01 AM all 3.25 0.29 15.31 0.03 0.00 0.00 0.04 0.00 0.00 81.09
10:50:01 AM all 3.27 0.28 15.33 0.03 0.00 0.00 0.04 0.00 0.00 81.05
11:00:01 AM all 3.21 0.28 15.32 0.03 0.00 0.00 0.04 0.00 0.00 81.12
11:10:01 AM all 3.33 0.29 15.35 0.03 0.00 0.00 0.04 0.00 0.00 80.96
11:20:01 AM all 3.26 0.28 15.32 0.03 0.00 0.00 0.04 0.00 0.00 81.06
11:30:01 AM all 3.44 0.28 15.36 0.03 0.00 0.00 0.04 0.00 0.00 80.85
11:40:01 AM all 3.26 0.29 15.32 0.03 0.00 0.00 0.03 0.00 0.00 81.07
11:50:01 AM all 3.29 0.29 15.33 0.03 0.00 0.00 0.04 0.00 0.00 81.02
12:00:01 PM all 3.29 0.28 15.33 0.03 0.00 0.00 0.04 0.00 0.00 81.02
12:10:01 PM all 3.29 0.29 15.35 0.03 0.00 0.00 0.04 0.00 0.00 81.01
12:20:01 PM all 3.27 0.28 15.35 0.03 0.00 0.00 0.04 0.00 0.00 81.02
12:30:01 PM all 3.25 0.29 15.34 0.03 0.00 0.00 0.04 0.00 0.00 81.06
12:40:01 PM all 3.30 0.28 15.35 0.03 0.00 0.00 0.03 0.00 0.00 80.99
12:40:01 PM CPU %usr %nice %sys %iowait %steal %irq %soft %guest %gnice %idle
12:50:01 PM all 3.25 0.28 15.34 0.03 0.00 0.00 0.04 0.00 0.00 81.06
01:00:01 PM all 3.46 0.29 15.40 0.03 0.00 0.00 0.04 0.00 0.00 80.79
01:10:01 PM all 3.25 0.29 15.34 0.03 0.00 0.00 0.04 0.00 0.00 81.05
01:20:01 PM all 3.30 0.28 15.38 0.03 0.00 0.00 0.04 0.00 0.00 80.98
01:30:01 PM all 3.26 0.28 15.36 0.04 0.00 0.00 0.04 0.00 0.00 81.03
01:40:01 PM all 3.61 0.29 15.41 0.18 0.00 0.00 0.04 0.00 0.00 80.47
01:50:01 PM all 3.24 0.28 15.38 0.03 0.00 0.00 0.04 0.00 0.00 81.03
02:00:01 PM all 3.29 0.28 15.39 0.03 0.00 0.00 0.04 0.00 0.00 80.97
02:10:01 PM all 3.30 0.28 15.38 0.04 0.00 0.00 0.04 0.00 0.00 80.96
02:20:01 PM all 3.14 0.28 20.19 0.03 0.00 0.00 0.04 0.00 0.00 76.32
02:30:02 PM all 3.22 0.28 27.71 0.03 0.00 0.00 0.04 0.00 0.00 68.73
02:40:01 PM all 3.00 0.28 27.66 0.03 0.00 0.00 0.04 0.00 0.00 68.99
02:50:01 PM all 3.06 0.28 27.65 0.03 0.00 0.00 0.04 0.00 0.00 68.94
03:00:01 PM all 3.00 0.28 27.68 0.03 0.00 0.00 0.03 0.00 0.00 68.97
03:10:02 PM all 3.28 0.27 27.70 0.05 0.00 0.00 0.04 0.00 0.00 68.66
03:20:01 PM all 2.99 0.28 27.66 0.03 0.00 0.00 0.04 0.00 0.00 69.00
03:30:01 PM all 3.07 0.28 27.68 0.03 0.00 0.00 0.04 0.00 0.00 68.90
03:40:01 PM all 3.04 0.28 27.67 0.03 0.00 0.00 0.04 0.00 0.00 68.94
03:50:01 PM all 3.04 0.27 27.69 0.03 0.00 0.00 0.04 0.00 0.00 68.93
04:00:01 PM all 3.19 0.28 27.71 0.03 0.00 0.00 0.04 0.00 0.00 68.76
04:10:01 PM all 3.09 0.28 28.14 0.03 0.00 0.00 0.04 0.00 0.00 68.42
04:20:01 PM all 3.04 0.28 27.69 0.03 0.00 0.00 0.03 0.00 0.00 68.92
04:30:01 PM all 3.04 0.28 27.68 0.03 0.00 0.00 0.04 0.00 0.00 68.94
04:40:01 PM all 3.08 0.28 27.72 0.03 0.00 0.00 0.03 0.00 0.00 68.85
04:50:01 PM all 3.01 0.28 27.70 0.03 0.00 0.00 0.04 0.00 0.00 68.95
05:00:01 PM all 3.05 0.28 27.68 0.03 0.00 0.00 0.04 0.00 0.00 68.92
05:10:01 PM all 5.55 0.26 32.05 6.84 0.00 0.00 0.12 0.00 0.00 55.17
05:20:01 PM all 3.05 0.28 27.71 0.03 0.00 0.00 0.03 0.00 0.00 68.89
05:30:01 PM all 3.19 0.28 27.73 0.03 0.00 0.00 0.03 0.00 0.00 68.73
05:40:01 PM all 3.05 0.28 27.70 0.03 0.00 0.00 0.04 0.00 0.00 68.91
05:50:01 PM all 3.03 0.28 27.69 0.03 0.00 0.00 0.04 0.00 0.00 68.93
06:00:01 PM all 3.03 0.28 27.72 0.03 0.00 0.00 0.04 0.00 0.00 68.91
06:10:02 PM all 3.06 0.28 27.72 0.03 0.00 0.00 0.04 0.00 0.00 68.88
06:20:01 PM all 3.07 0.28 27.72 0.03 0.00 0.00 0.03 0.00 0.00 68.87
06:30:01 PM all 3.09 0.28 27.77 0.56 0.00 0.00 0.04 0.00 0.00 68.26
06:40:01 PM all 3.05 0.28 27.74 0.03 0.00 0.00 0.04 0.00 0.00 68.86
06:50:01 PM all 3.07 0.28 27.71 0.03 0.00 0.00 0.04 0.00 0.00 68.87
07:00:01 PM all 3.19 0.28 27.75 0.03 0.00 0.00 0.04 0.00 0.00 68.71
07:00:01 PM CPU %usr %nice %sys %iowait %steal %irq %soft %guest %gnice %idle
07:10:01 PM all 3.14 0.27 27.76 0.03 0.00 0.00 0.03 0.00 0.00 68.76
07:20:01 PM all 3.03 0.28 27.72 0.03 0.00 0.00 0.04 0.00 0.00 68.90
07:30:01 PM all 3.08 0.28 27.73 0.03 0.00 0.00 0.04 0.00 0.00 68.84
07:40:01 PM all 3.06 0.28 27.73 0.03 0.00 0.00 0.04 0.00 0.00 68.87
07:50:01 PM all 3.05 0.28 27.73 0.03 0.00 0.00 0.04 0.00 0.00 68.87
08:00:01 PM all 3.03 0.27 27.74 0.03 0.00 0.00 0.03 0.00 0.00 68.89
08:10:01 PM all 3.10 0.27 27.76 0.03 0.00 0.00 0.04 0.00 0.00 68.79
08:20:01 PM all 3.03 0.28 27.73 0.03 0.00 0.00 0.04 0.00 0.00 68.90
08:30:01 PM all 3.23 0.28 27.77 0.03 0.00 0.00 0.03 0.00 0.00 68.66
08:40:01 PM all 3.08 0.28 27.75 0.03 0.00 0.00 0.03 0.00 0.00 68.82
08:50:01 PM all 3.04 0.28 27.74 0.03 0.00 0.00 0.04 0.00 0.00 68.88
09:00:01 PM all 3.08 0.28 27.76 0.03 0.00 0.00 0.04 0.00 0.00 68.81
09:10:01 PM all 3.07 0.28 27.77 0.03 0.00 0.00 0.04 0.00 0.00 68.81
09:20:01 PM all 3.07 0.28 27.76 0.03 0.00 0.00 0.04 0.00 0.00 68.81
09:30:01 PM all 3.04 0.28 27.74 0.03 0.00 0.00 0.04 0.00 0.00 68.87
09:40:01 PM all 3.09 0.28 27.77 0.03 0.00 0.00 0.04 0.00 0.00 68.79
09:50:01 PM all 3.04 0.28 27.77 0.03 0.00 0.00 0.03 0.00 0.00 68.85
10:00:01 PM all 3.21 0.26 36.38 0.04 0.00 0.00 0.03 0.00 0.00 60.08
10:10:01 PM all 7.59 0.25 40.00 0.15 0.00 0.00 0.04 0.00 0.00 51.98
10:20:01 PM all 2.98 0.26 40.02 0.03 0.00 0.00 0.03 0.00 0.00 56.68
10:30:01 PM all 2.98 0.25 40.02 0.04 0.00 0.00 0.03 0.00 0.00 56.67
10:40:01 PM all 3.00 0.25 40.03 0.03 0.00 0.00 0.03 0.00 0.00 56.65
10:50:01 PM all 2.97 0.26 40.05 0.03 0.00 0.00 0.03 0.00 0.00 56.65
11:00:01 PM all 2.92 0.26 40.03 0.03 0.00 0.00 0.04 0.00 0.00 56.72
11:10:01 PM all 3.03 0.25 40.08 0.03 0.00 0.00 0.03 0.00 0.00 56.57
11:20:01 PM all 2.95 0.26 40.03 0.03 0.00 0.00 0.03 0.00 0.00 56.70
11:30:01 PM all 3.14 0.26 40.06 0.03 0.00 0.00 0.03 0.00 0.00 56.47
11:40:01 PM all 2.97 0.26 40.05 0.03 0.00 0.00 0.03 0.00 0.00 56.67
11:50:01 PM all 2.99 0.26 40.06 0.03 0.00 0.00 0.03 0.00 0.00 56.63
Average: all 3.36 0.29 16.80 0.09 0.00 0.00 0.03 0.00 0.00 79.43
It started increasing at exactly 9 AM on Friday.
SAR command (Latest)
====================
sudo sar -u
Linux 3.10.0-1160.53.1.el7.x86_64 (prod-db2-node2) 02/08/2022 _x86_64_ (8 CPU)
12:00:01 AM CPU %user %nice %system %iowait %steal %idle
12:10:02 AM all 1.54 0.21 88.00 0.02 0.00 10.23
12:20:01 AM all 1.50 0.22 87.99 0.01 0.00 10.28
12:30:01 AM all 1.47 0.21 87.97 0.01 0.00 10.34
12:40:01 AM all 1.48 0.22 87.98 0.01 0.00 10.31
12:50:01 AM all 1.47 0.21 88.00 0.01 0.00 10.30
01:00:01 AM all 1.70 0.22 87.98 0.01 0.00 10.10
01:10:01 AM all 1.93 0.21 87.94 0.02 0.00 9.90
01:20:01 AM all 1.50 0.22 88.00 0.01 0.00 10.27
01:30:02 AM all 1.51 0.21 87.97 0.01 0.00 10.29
01:40:01 AM all 1.51 0.21 87.95 0.01 0.00 10.32
01:50:01 AM all 1.46 0.21 87.96 0.02 0.00 10.35
02:00:02 AM all 1.49 0.22 87.95 0.01 0.00 10.32
02:10:02 AM all 1.53 0.22 87.93 0.01 0.00 10.31
02:20:02 AM all 1.44 0.22 87.95 0.01 0.00 10.38
02:30:01 AM all 1.70 0.21 87.94 0.02 0.00 10.13
02:40:01 AM all 1.44 0.21 87.95 0.02 0.00 10.38
02:50:01 AM all 1.47 0.21 87.97 0.01 0.00 10.34
03:00:02 AM all 1.43 0.21 87.94 0.01 0.00 10.40
03:10:01 AM all 1.50 0.21 87.96 0.01 0.00 10.31
03:20:01 AM all 1.51 0.23 87.97 0.01 0.00 10.28
03:30:02 AM all 1.48 0.21 87.93 0.01 0.00 10.36
03:40:02 AM all 1.47 0.22 87.94 0.02 0.00 10.35
03:50:01 AM all 1.44 0.22 87.95 0.01 0.00 10.38
04:00:01 AM all 1.64 0.21 87.94 0.02 0.00 10.19
04:10:01 AM all 1.52 0.22 87.92 0.02 0.00 10.33
04:20:01 AM all 1.45 0.22 87.92 0.02 0.00 10.40
04:30:02 AM all 1.43 0.21 87.95 0.02 0.00 10.39
04:40:02 AM all 1.48 0.22 87.95 0.02 0.00 10.33
04:50:01 AM all 1.41 0.22 87.97 0.02 0.00 10.39
05:00:01 AM all 1.48 0.22 87.94 0.02 0.00 10.35
05:10:01 AM all 1.53 0.21 87.95 0.02 0.00 10.29
05:20:01 AM all 1.45 0.22 87.96 0.01 0.00 10.36
05:30:02 AM all 1.65 0.21 87.92 0.01 0.00 10.20
05:40:01 AM all 1.49 0.21 87.94 0.01 0.00 10.35
05:50:01 AM all 1.43 0.21 87.95 0.01 0.00 10.40
06:00:01 AM all 1.47 0.21 87.93 0.01 0.00 10.38
06:10:01 AM all 1.50 0.22 87.94 0.01 0.00 10.34
06:20:01 AM all 1.44 0.22 87.96 0.01 0.00 10.38
06:30:01 AM all 1.47 0.21 87.93 0.01 0.00 10.37
06:40:01 AM all 1.43 0.21 87.94 0.01 0.00 10.40
06:50:01 AM all 1.44 0.22 87.94 0.01 0.00 10.39
07:00:01 AM all 1.75 0.22 88.04 0.01 0.00 9.98
07:10:01 AM all 2.27 0.21 88.86 0.01 0.00 8.65
Average: all 1.53 0.21 87.98 0.01 0.00 10.27
VMSTAT Command
==============
vmstat 1 -w
procs -----------------------memory---------------------- ---swap-- -----io---- -system-- --------cpu--------
r b swpd free buff cache si so bi bo in cs us sy id wa st
7 0 0 4380564 827180 179070080 0 0 231 41 11 9 3 48 48 0 0
7 0 0 4364796 827180 179070080 0 0 0 160 9274 3727 1 88 10 0 0
7 0 0 4359876 827184 179070080 0 0 0 176 9180 3915 1 88 11 0 0
7 0 0 4359372 827184 179070096 0 0 1664 36 9201 3607 1 88 11 0 0
7 0 0 4351796 827184 179070112 0 0 6656 156 9392 4170 1 89 10 0 0
7 0 0 4361172 827184 179070096 0 0 1664 208 9352 4380 1 89 10 0 0
7 0 0 4360752 827184 179070096 0 0 0 48 9179 3496 0 88 12 0 0
7 0 0 4362452 827184 179070096 0 0 0 12 9281 4572 1 89 9 0 0
7 0 0 4363568 827184 179070096 0 0 0 124 9197 3497 0 88 12 0 0
8 0 0 4364952 827184 179070096 0 0 0 140 9189 3682 0 88 11 0 0
7 0 0 4364640 827184 179070096 0 0 0 88 9195 3556 0 88 12 0 0
I checked the logs and found nothing that could explain the high CPU utilization.
Now, I know the top command shows processes related to the Oracle DB, but the breakdown puts the 89.8% in system space, not in user space.
Any advice on how to find what caused this spike?
Thanks
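A hedged next step for attributing system-space CPU time (perf and sysstat are assumed to be installed; they are not mentioned in the original post):
perf top -g        # live view of the kernel and user symbols burning CPU, with call graphs
pidstat -u 5 1     # per-process %system over a 5-second sample, to see which PIDs drive the sy time
If the hot symbols in perf point at a particular kernel subsystem, that is usually a better lead than the process list in top, since top only shows which processes the kernel is doing that work for.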
OmiPenguin
(4398 rep)
Feb 8, 2022, 04:52 AM
• Last activity: Feb 9, 2022, 05:12 AM
0
votes
0
answers
459
views
HA-Cluster / corosync / pacemaker: Active-Active cluster with service ip / service ip is not switching
How do I configure crm to migrate the ServiceIP if one of the services has failed?
node 1: web01a \
attributes standby=off
node 2: web01b \
attributes standby=off
primitive Apache2 systemd:apache2 \
operations $id=Apache2-operations \
op start interval=0 timeout=100 \
op stop interval=0 timeout=100 \
op monitor interval=15 timeout=100 start-delay=15 \
meta
primitive PHP-FPM systemd:php7.4-fpm \
operations $id=PHP-FPM-operations \
op start interval=0 timeout=100 \
op stop interval=0 timeout=100 \
op monitor interval=15 timeout=100 start-delay=15 \
meta
primitive Redis systemd:redis-server \
operations $id=Redis-operations \
op start interval=0 timeout=100 \
op stop interval=0 timeout=100 \
op monitor interval=15 timeout=100 start-delay=15 \
meta
primitive ServiceIP IPaddr2 \
params ip=1.2.3.4 \
operations $id=ServiceIP-operations \
op monitor interval=10 timeout=20 start-delay=0 \
op_params migration-threshold=1 \
meta
primitive lsyncd systemd:lsyncd \
op start interval=0 timeout=100 \
op stop interval=0 timeout=100 \
op monitor interval=15 timeout=100 start-delay=15 \
meta target-role=Started
group ActiveNode ServiceIP lsyncd
group WebServer Apache2 PHP-FPM Redis
clone cl_WS WebServer \
meta clone-max=2 notify=true interleave=true
colocation col_cl_WS_ActiveNode 100: cl_WS ActiveNode
property cib-bootstrap-options: \
have-watchdog=false \
dc-version=2.0.3-4b1f869f0f \
cluster-infrastructure=corosync \
cluster-name=debian \
stonith-enabled=false \
no-quorum-policy=ignore \
startup-fencing=false \
maintenance-mode=false \
last-lrm-refresh=1622628525 \
start-failure-is-fatal=true
These services should always be started
- Apache2
- PHP-FPM
- Redis
If one of these services is not running, the node is unhealthy.
The **ServiceIP** and **lsyncd** should then switch to a healthy node.
But when I kill the apache2 process, the IP is not switched.
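A hedged sketch, in the same crm shell syntax as the configuration above, of one way to express "ActiveNode may only run where the WebServer clone is healthy"; the constraint names and the migration-threshold value are illustrative, not a verified fix:
colocation col_ActiveNode_with_WS inf: ActiveNode cl_WS
order ord_WS_before_ActiveNode Mandatory: cl_WS ActiveNode
rsc_defaults migration-threshold=1
The existing col_cl_WS_ActiveNode constraint only pulls the clone toward ActiveNode with a score of 100; it does not force ActiveNode away from a node whose Apache2 instance has failed, which is what the reversed INFINITY colocation above is meant to do.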
FaxMax
(726 rep)
Jun 2, 2021, 12:29 PM
1
votes
0
answers
143
views
Stop a pacemaker node when local shell script returns an error?
Is it possible to make Pacemaker stop a node when a local test script fails, and start the node again once the local test script returns true?
This seems like a very simple problem, but as I can't find ANY way to do this within Pacemaker, I'm about to run the following shell script on all my nodes:
while true; do
pcs status 2>/dev/null >/dev/null && node_running=true
/is_node_healthy.sh && node_healthy=true
[[ -v node_running ]] && ! [[ -v node_healthy ]] && pcs cluster stop
[[ -v node_healthy ]] && ! [[ -v node_running ]] && pcs cluster start
unset node_running node_healthy
sleep 10
done
This does exactly what I want, but it looks like a very dirty hack to me. Is there a more elegant way to get the same thing done by Pacemaker itself?
BTW: The overall task I want to solve seems quite simple: create an HA cluster that assigns a public IP address to a vital (healthy) host, where vitality can be checked with /is_node_healthy.sh
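A hedged alternative direction, using Pacemaker's built-in node-health mechanism instead of the stop/start loop above; the attribute name and strategy value are illustrative (health attributes must start with "#health"), and the check would be run from cron or a systemd timer on each node:
pcs property set node-health-strategy=only-green
if /is_node_healthy.sh; then
    attrd_updater -n "#health-check" -U green    # node stays eligible for resources
else
    attrd_updater -n "#health-check" -U red      # resources are moved away from this node
fi
With only-green, a single red or yellow health attribute makes the node ineligible to run resources, without stopping the cluster stack on it.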
psicolor
(11 rep)
Feb 22, 2021, 11:54 AM
1
votes
1
answers
393
views
fence_virtualbox failed to reboot
I'm learning how to fence pacemaker using fence_virtualbox from [\[ClusterLabs\] Fence agent for VirtualBox][1], but I can't get it working. When I try to run
stonith_admin --reboot
it fails.
Currently, my setup is:
Node ID: VM name:
orcllinux1 OL7
orcllinux2 OL7_2
I set it up using:
pcs stonith create fence_vbox fence_virtualbox pcmk_host_map="orcllinux1:OL7,orcllinux2:OL7_2" pcmk_host_list="orcllinux1,orcllinux2" pcmk_host_check=static_list ipaddr="192.168.57.1" login="root"
But stonith_admin --reboot resulted in this error:
[screenshot of the stonith_admin error omitted]
I tried to use fence_virtualbox manually with:
fence_virtualbox -s 192.168.57.1 -p OL7 -o=reboot
and it succeeded.
Is my stonith create syntax wrong? If so, what is the right syntax?
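A hedged way to compare the options given above with what the agent actually advertises (assuming a pcs version that provides these subcommands):
pcs stonith describe fence_virtualbox    # list the parameters the agent advertises
pcs stonith show fence_vbox              # show the attributes actually stored on the stonith resource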
Christophorus Reyhan
(33 rep)
Jan 8, 2021, 11:16 AM
• Last activity: Feb 16, 2021, 03:51 AM
2
votes
1
answers
5355
views
Pacemaker - Corosync - HA - Simple Custom Resource Testing - Status flapping - Started - Failed - Stopped - Started
I am testing with the ocf:heartbeat:Dummy script, and I want to make a very basic setup just to know that it works and then build on it.
The only information I could find was this blog post:
https://raymii.org/s/tutorials/Corosync_Pacemaker_-_Execute_a_script_on_failover.html
It has some typos but basically worked for me.
The script currently just contains the following :
sudo nano /usr/local/bin/failover.sh && sudo chmod +x /usr/local/bin/failover.sh
#!/bin/sh
touch /tmp/testfailover.sh
Here is my setup :
cp /usr/lib/ocf/resource.d/heartbeat/Dummy /usr/lib/ocf/resource.d/heartbeat/FailOverScript
sudo nano /usr/lib/ocf/resource.d/heartbeat/FailOverScript
dummy_start() {
dummy_monitor
/usr/local/bin/failover.sh
if [ $? = $OCF_SUCCESS ]; then
return $OCF_SUCCESS
fi
touch ${OCF_RESKEY_state}
}
sed -i 's/Dummy/FailOverScript/g' /usr/lib/ocf/resource.d/heartbeat/FailOverScript
sed -i 's/dummy/FailOverScript/g' /usr/lib/ocf/resource.d/heartbeat/FailOverScript
pcs resource create FailOverScript ocf:heartbeat:FailOverScript op monitor interval="30"
The only testing I can really do :
[root@node2 ~]# /usr/lib/ocf/resource.d/heartbeat/FailOverScript start ; echo $?
DEBUG: default start : 0
0
ocf-tester doesn't seem to exist in the latest HA software suite, and I'm not really sure how to install it manually, but the script "half works".
**The script doesn't need monitoring, it's supposed to be very basic, but it seems to be flapping and giving me the following error. Any ideas what to do?**
FailOverScript (ocf::heartbeat:FailOverScript): Started
node2
Failed Actions:
* FailOverScript_monitor_30000 on node2 'not running' (7): call=
24423, status=complete, exitreason='none',
last-rc-change='Tue Aug 16 15:53:50 2016', queued=0ms, exec=
9ms
**Example of what I want to do:**
Cluster start
Script runs "start.sh"
Cluster fails over to node2.
On node1 script runs "fail.sh"
On node2 script runs "start.sh"
and vice versa if it fails in the other direction.
Note: The script does work; I get /tmp/testfailover.sh. I even tried putting another script under dummy_stop to remove the file, and that worked, but it just keeps flapping along, removing/adding the file and starting/failing/stopping/starting, etc.
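One hedged reading of the flapping, based on the start function shown above: when failover.sh exits 0, the start function returns success before ever touching ${OCF_RESKEY_state}, so the next monitor finds no state file and reports "not running" (7), and the cluster restarts the resource. A minimal sketch of a start function that always records the state file (function names follow the sed renaming above):
FailOverScript_start() {
    FailOverScript_monitor
    if [ $? = $OCF_SUCCESS ]; then
        return $OCF_SUCCESS              # already running, nothing to do
    fi
    /usr/local/bin/failover.sh           # run the failover hook
    touch "${OCF_RESKEY_state}"          # mark the resource as started so monitor succeeds
    return $OCF_SUCCESS
}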
Thanks for reading!
FreeSoftwareServers
(2682 rep)
Aug 16, 2016, 07:56 PM
• Last activity: Dec 21, 2020, 06:56 AM
0
votes
1
answers
1741
views
Pacemaker apache resource fails with "Failed to access httpd status page" after change to HTTPS
I get this error from pacemaker after I changed Apache from HTTP to HTTPS; now my ocf::heartbeat:apache resource cannot reach the status page.
I generated an SSL certificate separately for each of the 3 servers.
Everything was working fine when running on HTTP, but as soon as I added the (self-signed) SSL certificate, pacemaker shows
Apache (ocf::heartbeat:apache): Stopped
and the errors are
Failed Actions:
* Apache_start_0 on server3 'unknown error' (1): call=315, status=complete, exitreason='Failed to access httpd status page.',
last-rc-change='Mon Sep 21 16:22:37 2020', queued=0ms, exec=3456ms
* Apache_start_0 on server1 'unknown error' (1): call=59, status=complete, exitreason='Failed to access httpd status page.',
last-rc-change='Mon Sep 21 16:22:41 2020', queued=0ms, exec=3421ms
* Apache_start_0 on server2 'unknown error' (1): call=197, status=complete, exitreason='Failed to access httpd status page.',
last-rc-change='Mon Sep 21 16:22:33 2020', queued=0ms, exec=3451ms
/etc/apache2/sites-available/000-default.conf
ServerAdmin webmaster@localhost
DocumentRoot /var/www/html
Redirect "/" "https://10.226.***.***/ "
SetHandler server-status
ServerAdmin webmaster@localhost
DocumentRoot /var/www/html
Redirect "/" "https://10.226.179.205/ "
Order deny,allow
Deny from all
Allow from 127.0.0.1
*pcs resource debug-monitor --full Apache*
Operation monitor for Apache (ocf:heartbeat:apache) returned 1
> stderr: + echo
> stderr: + printenv
> stderr: + sort
> stderr: + env=
> stderr: AONIX_LM_DIR=/home/TeleUSE/etc
> stderr: BXwidgets=/home/BXwidgets
> stderr: HA_logfacility=none
> stderr: HOME=/root
> stderr: LC_ALL=C
> stderr: LOGNAME=root
> stderr: MAIL=/var/mail/root
> stderr: OCF_EXIT_REASON_PREFIX=ocf-exit-reason:
> stderr: OCF_RA_VERSION_MAJOR=1
> stderr: OCF_RA_VERSION_MINOR=0
> stderr: OCF_RESKEY_CRM_meta_class=ocf
> stderr: OCF_RESKEY_CRM_meta_id=Apache
> stderr: OCF_RESKEY_CRM_meta_migration_threshold=5
> stderr: OCF_RESKEY_CRM_meta_provider=heartbeat
> stderr: OCF_RESKEY_CRM_meta_resource_stickiness=10
> stderr: OCF_RESKEY_CRM_meta_type=apache
> stderr: OCF_RESKEY_configfile=/etc/apache2/apache2.conf
> stderr: OCF_RESKEY_statusurl=http://localhost/server-status
> stderr: OCF_RESOURCE_INSTANCE=Apache
> stderr: OCF_RESOURCE_PROVIDER=heartbeat
> stderr: OCF_RESOURCE_TYPE=apache
> stderr: OCF_ROOT=/usr/lib/ocf
> stderr: OCF_TRACE_RA=1
> stderr: PATH=/root/.rbenv/shims:/root/.rbenv/bin:/root/.rbenv/shims:/root/.rbenv/bin:/usr/local/bin:/home/TeleUSE/bin:/home/xrt/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/ucb
> stderr: PCMK_logfacility=none
> stderr: PCMK_service=crm_resource
> stderr: PWD=/root
> stderr: RBENV_SHELL=bash
> stderr: SHELL=/bin/bash
> stderr: SHLVL=1
> stderr: SSH_CLIENT=10.12.116.46 63097 22
> stderr: SSH_CONNECTION=10.12.116.46 63097 10.226.179.205 22
> stderr: SSH_TTY=/dev/pts/0
> stderr: TERM=xterm
> stderr: TeleUSE=/home/TeleUSE
> stderr: USER=root
> stderr: _=/usr/sbin/pcs
> stderr: __OCF_TRC_DEST=
> stderr: __OCF_TRC_MANAGE=
> stderr: + ocf_is_true
> stderr: + false
> stderr: + . /usr/lib/ocf/lib/heartbeat/apache-conf.sh
> stderr: + . /usr/lib/ocf/lib/heartbeat/http-mon.sh
> stderr: + bind_address=127.0.0.1
> stderr: + curl_ipv6_opts=
> stderr: + ocf_is_true
> stderr: + false
> stderr: + echo
> stderr: + grep -qs ::
> stderr: + WGETOPTS=-O- -q -L --no-proxy --bind-address=127.0.0.1
> stderr: + CURLOPTS=-o - -Ss -L --interface lo
> stderr: + HA_VARRUNDIR=/var/run
> stderr: + IBMHTTPD=/opt/IBMHTTPServer/bin/httpd
> stderr: + HTTPDLIST=/sbin/httpd2 /usr/sbin/httpd2 /usr/sbin/apache2 /sbin/httpd /usr/sbin/httpd /usr/sbin/apache /opt/IBMHTTPServer/bin/httpd
> stderr: + MPM=/usr/share/apache2/find_mpm
> stderr: + [ -x /usr/share/apache2/find_mpm ]
> stderr: + LOCALHOST=http://localhost
> stderr: + HTTPDOPTS=-DSTATUS
> stderr: + DEFAULT_IBMCONFIG=/opt/IBMHTTPServer/conf/httpd.conf
> stderr: + DEFAULT_SUSECONFIG=/etc/apache2/httpd.conf
> stderr: + DEFAULT_RHELCONFIG=/etc/httpd/conf/httpd.conf
> stderr: + DEFAULT_DEBIANCONFIG=/etc/apache2/apache2.conf
> stderr: + basename /usr/lib/ocf/resource.d/heartbeat/apache
> stderr: + CMD=apache
> stderr: + OCF_REQUIRED_PARAMS=
> stderr: + OCF_REQUIRED_BINARIES=
> stderr: + ocf_rarun monitor
> stderr: + mk_action_func
> stderr: + echo apache_monitor
> stderr: + tr - _
> stderr: + ACTION_FUNC=apache_monitor
> stderr: + validate_args
> stderr: + is_function apache_monitor
> stderr: + command -v apache_monitor
> stderr: + test zapache_monitor = zapache_monitor
> stderr: + simple_actions
> stderr: + check_required_params
> stderr: + local v
> stderr: + run_function apache_getconfig
> stderr: + is_function apache_getconfig
> stderr: + command -v apache_getconfig
> stderr: + test zapache_getconfig = zapache_getconfig
> stderr: + apache_getconfig
> stderr: + HTTPD=
> stderr: + PORT=
> stderr: + STATUSURL=http://localhost/server-status
> stderr: + CONFIGFILE=/etc/apache2/apache2.conf
> stderr: + OPTIONS=
> stderr: + CLIENT=
> stderr: + TESTREGEX=
> stderr: + TESTURL=
> stderr: + TESTREGEX10=
> stderr: + TESTCONFFILE=
> stderr: + TESTNAME=
> stderr: + : /etc/apache2/envvars
> stderr: + source_envfiles /etc/apache2/envvars
> stderr: + [ -f /etc/apache2/envvars -a -r /etc/apache2/envvars ]
> stderr: + . /etc/apache2/envvars
> stderr: + unset HOME
> stderr: + [ != ]
> stderr: + SUFFIX=
> stderr: + export APACHE_RUN_USER=www-data
> stderr: + export APACHE_RUN_GROUP=www-data
> stderr: + export APACHE_PID_FILE=/var/run/apache2/apache2.pid
> stderr: + export APACHE_RUN_DIR=/var/run/apache2
> stderr: + export APACHE_LOCK_DIR=/var/lock/apache2
> stderr: + export APACHE_LOG_DIR=/var/log/apache2
> stderr: + export LANG=C
> stderr: + export LANG
> stderr: + [ X = X -o ! -f -o ! -x ]
> stderr: + find_httpd_prog
> stderr: + HTTPD=
> stderr: + [ -f /sbin/httpd2 -a -x /sbin/httpd2 ]
> stderr: + [ -f /usr/sbin/httpd2 -a -x /usr/sbin/httpd2 ]
> stderr: + [ -f /usr/sbin/apache2 -a -x /usr/sbin/apache2 ]
> stderr: + HTTPD=/usr/sbin/apache2
> stderr: + break
> stderr: + [ X != X -a X/usr/sbin/apache2 != X ]
> stderr: + detect_default_config
> stderr: + [ -f /etc/apache2/httpd.conf ]
> stderr: + [ -f /etc/apache2/apache2.conf ]
> stderr: + echo /etc/apache2/apache2.conf
> stderr: + DefaultConfig=/etc/apache2/apache2.conf
> stderr: + CONFIGFILE=/etc/apache2/apache2.conf
> stderr: + [ -n /usr/sbin/apache2 ]
> stderr: + basename /usr/sbin/apache2
> stderr: + httpd_basename=apache2
> stderr: + GetParams /etc/apache2/apache2.conf
> stderr: + ConfigFile=/etc/apache2/apache2.conf
> stderr: + [ ! -f /etc/apache2/apache2.conf ]
> stderr: + get_apache_params /etc/apache2/apache2.conf ServerRoot PidFile Port Listen
> stderr: + configfile=/etc/apache2/apache2.conf
> stderr: + shift 1
> stderr: + echo ServerRoot PidFile Port Listen
> stderr: + sed s/ /,/g
> stderr: + vars=ServerRoot,PidFile,Port,Listen
> stderr: + apachecat /etc/apache2/apache2.conf
> stderr: + awk -v vars=ServerRoot,PidFile,Port,Listen
> stderr: BEGIN{
> stderr: split(vars,v,",");
> stderr: for( i in v )
> stderr: vl[i]=tolower(v[i]);
> stderr: }
> stderr: {
> stderr: for( i in v )
> stderr: if( tolower($1)==vl[i] ) {
> stderr: print v[i]"="$2
> stderr: delete vl[i]
> stderr: break
> stderr: }
> stderr: }
> stderr:
> stderr: + awk
> stderr: function procline() {
> stderr: split($0,a);
> stderr: if( a~/^[Ii]nclude$/ ) {
> stderr: includedir=a;
> stderr: gsub("\"","",includedir);
> stderr: procinclude(includedir);
> stderr: } else {
> stderr: if( a=="ServerRoot" ) {
> stderr: rootdir=a;
> stderr: gsub("\"","",rootdir);
> stderr: }
> stderr: print;
> stderr: }
> stderr: }
> stderr: function printfile(infile, a) {
> stderr: while( (getline 0 ) {
> stderr: procline();
> stderr: }
> stderr: close(infile);
> stderr: }
> stderr: function allfiles(dir, cmd,f) {
> stderr: cmd="find -L "dir" -type f";
> stderr: while( ( cmd | getline f ) > 0 ) {
> stderr: printfile(f);
> stderr: }
> stderr: close(cmd);
> stderr: }
> stderr: function listfiles(pattern, cmd,f) {
> stderr: cmd="ls "pattern" 2>/dev/null";
> stderr: while( ( cmd | getline f ) > 0 ) {
> stderr: printfile(f);
> stderr: }
> stderr: close(cmd);
> stderr: }
> stderr: function procinclude(spec) {
> stderr: if( rootdir!="" && spec!~/^\// ) {
> stderr: spec=rootdir"/"spec;
> stderr: }
> stderr: if( isdir(spec) ) {
> stderr: allfiles(spec); # read all files in a directory (and subdirs)
> stderr: } else {
> stderr: listfiles(spec); # there could be jokers
> stderr: }
> stderr: }
> stderr: function isdir(s) {
> stderr: return !system("test -d \""s"\"");
> stderr: }
> stderr: { procline(); }
> stderr: /etc/apache2/apache2.conf
> stderr: + sed s/#.*//;s/[[:blank:]]*$//;s/^[[:blank:]]*//
> stderr: + grep -v ^$
> stderr: + eval PidFile=${APACHE_PID_FILE}
> stderr: + PidFile=/var/run/apache2/apache2.pid
> stderr: + CheckPort
> stderr: + ocf_is_decimal
> stderr: + false
> stderr: + CheckPort
> stderr: + ocfError performing operation: Operation not permitted
_is_decimal
> stderr: + false
> stderr: + CheckPort 80
> stderr: + ocf_is_decimal 80
> stderr: + true
> stderr: + [ 80 -gt 0 ]
> stderr: + PORT=80
> stderr: + break
> stderr: + echo
> stderr: + grep :
> stderr: + Listen=localhost:
> stderr: + [ Xhttp://localhost/server-status = X ]
> stderr: + test /var/run/apache2/apache2.pid
> stderr: + return 0
> stderr: + validate_env
> stderr: + check_required_binaries
> stderr: + local v
> stderr: + is_function apache_validate_all
> stderr: + command -v apache_validate_all
> stderr: + test zapache_validate_all = zapache_validate_all
> stderr: + local rc
> stderr: + LSB_STATUS_STOPPED=3
> stderr: + apache_validate_all
> stderr: + [ -z /usr/sbin/apache2 ]
> stderr: + [ ! -x /usr/sbin/apache2 ]
> stderr: + [ ! -f /etc/apache2/apache2.conf ]
> stderr: + [ -n ]
> stderr: + [ -n ]
> stderr: + dirname /var/run/apache2/apache2.pid
> stderr: + local a
> stderr: + local b
> stderr: + [ 1 = 1 ]
> stderr: + a=/var/run/apache2/apache2.pid
> stderr: + [ 1 ]
> stderr: + b=/var/run/apache2/apache2.pid
> stderr: + [ /var/run/apache2/apache2.pid = /var/run/apache2/apache2.pid ]
> stderr: + break
> stderr: + b=/var/run/apache2
> stderr: + [ -z /var/run/apache2 -o /var/run/apache2/apache2.pid = /var/run/apache2 ]
> stderr: + echo /var/run/apache2
> stderr: + return 0
> stderr: + ocf_mkstatedir root 755 /var/run/apache2
> stderr: + local owner
> stderr: + local perms
> stderr: + local path
> stderr: + owner=root
> stderr: + perms=755
> stderr: + path=/var/run/apache2
> stderr: + test -d /var/run/apache2
> stderr: + return 0
> stderr: + return 0
> stderr: + rc=0
> stderr: + [ 0 -ne 0 ]
> stderr: + ocf_is_probe
> stderr: + [ monitor = monitor -a 0 = 0 ]
> stderr: + run_probe
> stderr: + is_function apache_probe
> stderr: + command -v apache_probe
> stderr: + test z = zapache_probe
> stderr: + shift 1
> stderr: + apache_monitor
> stderr: + silent_status
> stderr: + local pid
> stderr: + get_pid
> stderr: + [ -f /var/run/apache2/apache2.pid ]
> stderr: + cat /var/run/apache2/apache2.pid
> stderr: + pid=17552
> stderr: + [ -n 17552 ]
> stderr: + ProcessRunning 17552
> stderr: + local pid=17552
> stderr: + [ -d /proc -a -d /proc/1 ]
> stderr: + [ -d /proc/17552 ]
> stderr: + [ 0 -ne 0 ]
> stderr: + findhttpclient
> stderr: + [ x != x ]
> stderr: + which wget
> stderr: + echo wget
> stderr: + ourhttpclient=wget
> stderr: + [ -z wget ]
> stderr: + ocf_check_level 10
> stderr: + local lvl prev
> stderr: + lvl=0
> stderr: + prev=0
> stderr: + ocf_is_decimal 0
> stderr: + true
> stderr: + [ 10 -eq 0 ]
> stderr: + [ 10 -gt 0 ]
> stderr: + lvl=0
> stderr: + break
> stderr: + echo 0
> stderr: + apache_monitor_basic
> stderr: + wget_func http://localhost/server-status
> stderr: + auth=
> stderr: + cl_opts=-O- -q -L --no-proxy --bind-address=127.0.0.1
> stderr: + [ x !=+ x ]
> stderr: grep+ wget -Ei -O- -q
> stderr: -L --no-proxy --bind-address=127.0.0.1 http://localhost/server-status
> stderr: + attempt_index_monitor_request
> stderr: + local indexpage=
> stderr: + [ -n ]
> stderr: + [ -n ]
> stderr: + [ -n ]
> stderr: + [ -n http://localhost/server-status ]
> stderr: + return 1
> stderr: + [ 1 -eq 0 ]
> stderr: + ocf_is_probe
> stderr: + [ monitor = monitor -a 0 = 0 ]
> stderr: + return 1
**pcs config**
Resource: MasterVip (class=ocf provider=heartbeat type=IPaddr2)
Attributes: ip=10.226.***.*** nic=lo cidr_netmask=32 iflabel=pgrepvip
Meta Attrs: target-role=Started
Operations: start interval=0s timeout=20s (MasterVip-start-interval-0s)
stop interval=0s timeout=20s (MasterVip-stop-interval-0s)
monitor interval=90s (MasterVip-monitor-interval-90s)
Resource: Apache (class=ocf provider=heartbeat type=apache)
Attributes: configfile=/etc/apache2/apache2.conf statusurl=http://localhost/server-status
Operations: start interval=0s timeout=40s (Apache-start-interval-0s)
stop interval=0s timeout=60s (Apache-stop-interval-0s)
monitor interval=1min (Apache-monitor-interval-1min)
I don't know how to fix this. If anyone knows, please help me.
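For reference, the debug trace above shows the agent probing the status URL with wget bound to 127.0.0.1; a hedged way to reproduce that check by hand and see what the redirect to HTTPS does to it (the second command with --no-check-certificate is illustrative, for the self-signed-certificate case):
wget -O- -q -L --no-proxy --bind-address=127.0.0.1 http://localhost/server-status; echo "exit $?"
wget -O- -q --no-check-certificate https://localhost/server-status; echo "exit $?"
If the first command fails because the redirect lands on an HTTPS URL that the plain wget options cannot fetch, the monitor will keep failing no matter how healthy Apache itself is.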
Karippery
(1 rep)
Sep 21, 2020, 03:04 PM
• Last activity: Sep 22, 2020, 11:36 AM
1
votes
2
answers
4955
views
Keepalived not working?
I'm trying to create HA for HAProxy using keepalived on CentOS 8, here's what I have:
Virtual IP: 10.10.10.14
HAProxy Server 1: 10.10.10.15
HAProxy Server 2: 10.10.10.18
and my keepalived configuration on **MASTER**:
vrrp_script chk_haproxy {
script "killall -0 haproxy" # check the haproxy process
interval 2 # every 2 seconds
weight 2 # add 2 points if OK
}
vrrp_instance VI_1 {
interface ens190
state MASTER
virtual_router_id 51
priority 101
virtual_ipaddress {
10.10.10.14
}
track_script {
chk_haproxy
}
}
Keepalived config on **BACKUP**:
vrrp_script chk_haproxy {
script "killall -0 haproxy" # check the haproxy process
interval 2 # every 2 seconds
weight 2 # add 2 points if OK
}
vrrp_instance VI_1 {
interface ens165
state BACKUP
virtual_router_id 51
priority 100
virtual_ipaddress {
10.10.10.14
}
track_script {
chk_haproxy
}
}
But every time I stop my HAProxy process, it does not fail over to the backup server; it only works on the server where keepalived was most recently started.
My
ip a
command returns this for the **Master**:
inet 10.10.10.15/24 brd 10.10.10.255 scope global noprefixroute ens190
inet 10.10.10.14/32 scope global ens190
For **Backup**:
inet 10.10.10.18/24 brd 10.10.10.255 scope global noprefixroute ens165
inet 10.10.10.14/32 scope global ens165
Is anything wrong? I have also set net.ipv4.ip_nonlocal_bind = 1
in my sysctl configuration. My logs only show the start and stop of the service.
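A hedged pair of checks, since blocked VRRP (IP protocol 112) traffic between the nodes is a common reason keepalived failover misbehaves (commands are illustrative; the interface names follow the configs above, and the firewalld rule assumes firewalld is in use):
tcpdump -ni ens190 'ip proto 112'
firewall-cmd --add-rich-rule='rule protocol value="vrrp" accept' --permanent && firewall-cmd --reload
It can also be worth confirming the track script itself: killall -0 haproxy; echo $? should print 0 while HAProxy runs and 1 after it has been stopped.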
Gwynn
(41 rep)
Jul 15, 2020, 05:35 AM
• Last activity: Jul 15, 2020, 11:51 PM
2
votes
3
answers
1346
views
What is the best way to store a single counter persistently?
I have a simple bash script that increments a counter a few times per second, guaranteed to be fewer than 100 times per second. The script works fine, but I would like the counter to persist across machine crashes.
What would be the best way to persist the counter on my SSD-only system? Should I just echo it out to
/var//
somewhere (i.e. store in a file) each time it updates? If so, is /var//
the right place? Do I need to install a full database to keep track of this single value? Is there some cute little Linux feature built to do this effectively?
To clarify, my problem isn't making sure that the counter is persistent between separate runs of the script; I have that solved already. My concern is the case where the system unexpectedly and suddenly fails due to a machine crash (so I cannot rely on a trap
in a shell script).
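A minimal sketch of a crash-safer update using write-to-temp, flush, then atomic rename; the directory /var/lib/counter is a placeholder, and the per-file sync needs a reasonably recent coreutils (older versions fall back to a full sync here):
state=/var/lib/counter/count
count=$(cat "$state" 2>/dev/null || echo 0)
count=$((count + 1))
printf '%s\n' "$count" > "$state.tmp"
sync "$state.tmp" 2>/dev/null || sync             # flush the new value to disk before renaming
mv "$state.tmp" "$state"                          # atomic replace: never a half-written file
sync "$(dirname "$state")" 2>/dev/null || sync    # flush the rename itself
After a crash, the worst case is losing the increments made since the last completed, flushed rename.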
00prometheus
(813 rep)
Jun 1, 2020, 05:49 PM
• Last activity: Jun 3, 2020, 11:50 AM
0
votes
2
answers
1078
views
Linux Pacemaker: Resource showing as "unrunnable start (blocked)" has been created
We are using SLES 12 SP4
We have observed few things from the today testing. Following are the steps:
**Step 1**: When we trigger a kernel panic (on Node01) with the command "**echo 'b' > /proc/sysrq-trigger**" or "**echo 'c' > /proc/sysrq-trigger**" on the node where the resources are running, the cluster detects the change but is unable to start any resources (except SBD) on the other active node.
**Step 2**: As per the logs we can find the following errors:
pengine: info: LogActions: Leave stonith-sbd (Started node02)
pengine: notice: LogAction: * Start pri-javaiq (node02 ) due to unrunnable nfs_filesystem start (blocked)
pengine: notice: LogAction: * Start lb_health_probe (node02 ) due to unrunnable nfs_filesystem start (blocked)
pengine: notice: LogAction: * Start pri-ip_vip (node02 ) due to unrunnable nfs_filesystem start (blocked)
pengine: notice: LogAction: * Start nfs_filesystem (node02 ) blocked
**Step 3**: But when we execute "init 6" on the node on which we triggered the kernel panic, surprisingly the resources on the other node start and run successfully.
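A hedged interpretation, plus some commands to check it with: resources that depend on shared storage stay "blocked" until the cluster is sure the crashed node has actually been fenced, and the reboot (init 6) of that node appears to be what finally satisfies it here. The device path below is a placeholder for your SBD disk:
stonith_admin --history node01                  # did a fencing action for the crashed node run, and did it succeed?
crm_mon -1rf                                    # one-shot cluster status including failed actions and fail counts
sbd -d /dev/disk/by-id/YOUR-SBD-DEVICE list     # state of the SBD message slots for each node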
Ram Too
(1 rep)
May 14, 2020, 02:25 PM
• Last activity: May 15, 2020, 04:47 PM
0
votes
0
answers
35
views
high availability of file in Linux
I have a very specific scenario. I have a set of config files and I want to maintain them at two different paths, which would be nothing but copies of the same files. If, for some reason, one of the locations becomes unavailable, my process should still access the files from the second location.
How can I achieve this with something like a symbolic link that switches the file pointer based on availability?
Any thoughts or ideas are highly appreciated.
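A minimal sketch of the symlink idea, assuming the process always opens one fixed path and a small script (run from cron or a systemd timer) repoints it; all paths here are placeholders:
#!/bin/sh
# keep /etc/myapp/current pointing at whichever copy is readable
if [ -r /mnt/primary/config/app.conf ]; then
    ln -sfn /mnt/primary/config /etc/myapp/current
else
    ln -sfn /mnt/secondary/config /etc/myapp/current
fi
The obvious caveat is the window between the primary disappearing and the next run of the script; anything stricter than that points toward a replicated or network filesystem rather than symlinks.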
Thanks much -
Parthi
(1 rep)
Apr 30, 2020, 06:06 AM
1
votes
0
answers
310
views
OpenLDAP Cluster
Trying to implement an OpenLDAP cluster, I already managed to set up the two backend LDAP servers in mirroring mode.
The application (iRedMail) using the LDAP service is running on the same systems as the LDAP servers. This application needs the LDAP configuration in the former slapd.conf manner and not in the CONFIG-DB way, so I added the mirroring parameters to the slapd.conf file. The file looks like this on the first backend node:
include /etc/openldap/schema/core.schema
include /etc/openldap/schema/corba.schema
include /etc/openldap/schema/cosine.schema
include /etc/openldap/schema/inetorgperson.schema
include /etc/openldap/schema/nis.schema
include /etc/openldap/schema/calentry.schema
include /etc/openldap/schema/calresource.schema
include /etc/openldap/schema/amavisd-new.schema
include /etc/openldap/schema/iredmail.schema
pidfile /var/run/openldap/slapd.pid
argsfile /var/run/openldap/slapd.args
# The syncprov overlay
moduleload syncprov.la
disallow bind_anon
require LDAPv3
loglevel 0
access to attrs="userPassword,mailForwardingAddress,employeeNumber"
by anonymous auth
by self write
by dn.exact="cn=vmail,dc=myCompany,dc=de" read
by dn.exact="cn=vmailadmin,dc=myCompany,dc=de" write
by users none
access to attrs="cn,sn,gn,givenName,telephoneNumber"
by anonymous auth
by self write
by dn.exact="cn=vmail,dc=myCompany,dc=de" read
by dn.exact="cn=vmailadmin,dc=myCompany,dc=de" write
by users read
access to attrs="objectclass,domainName,mtaTransport,enabledService,domainSenderBccAddress,domainRecipientBccAddress,domainBackupMX,domainMaxQuotaSize,domainMaxUserNumber,domainPendingAliasName"
by anonymous auth
by self read
by dn.exact="cn=vmail,dc=myCompany,dc=de" read
by dn.exact="cn=vmailadmin,dc=myCompany,dc=de" write
by users read
access to attrs="domainAdmin,domainGlobalAdmin,domainSenderBccAddress,domainRecipientBccAddress"
by anonymous auth
by self read
by dn.exact="cn=vmail,dc=myCompany,dc=de" read
by dn.exact="cn=vmailadmin,dc=myCompany,dc=de" write
by users none
access to attrs="mail,accountStatus,domainStatus,userSenderBccAddress,userRecipientBccAddress,mailQuota,backupMailAddress,shadowAddress,memberOfGroup,member,uniqueMember,storageBaseDirectory,homeDirectory,mailMessageStore,mailingListID"
by anonymous auth
by self read
by dn.exact="cn=vmail,dc=myCompany,dc=de" read
by dn.exact="cn=vmailadmin,dc=myCompany,dc=de" write
by users read
access to dn="cn=vmail,dc=myCompany,dc=de"
by anonymous auth
by self write
by users none
access to dn="cn=vmailadmin,dc=myCompany,dc=de"
by anonymous auth
by self write
by users none
access to dn.regex="domainName=([^,]+),o=domains,dc=myCompany,dc=de$"
by anonymous auth
by self write
by dn.exact="cn=vmail,dc=myCompany,dc=de" read
by dn.exact="cn=vmailadmin,dc=myCompany,dc=de" write
by dn.regex="mail=[^,]+@$1,o=domainAdmins,dc=myCompany,dc=de$" write
by dn.regex="mail=[^,]+@$1,ou=Users,domainName=$1,o=domains,dc=myCompany,dc=de$" read
by users none
access to dn.subtree="o=domains,dc=myCompany,dc=de"
by anonymous auth
by self write
by dn.exact="cn=vmail,dc=myCompany,dc=de" read
by dn.exact="cn=vmailadmin,dc=myCompany,dc=de" write
by users read
access to dn.subtree="o=domainAdmins,dc=myCompany,dc=de"
by anonymous auth
by self write
by dn.exact="cn=vmail,dc=myCompany,dc=de" read
by dn.exact="cn=vmailadmin,dc=myCompany,dc=de" write
by users none
access to dn.regex="cn=[^,]+,dc=myCompany,dc=de"
by anonymous auth
by self write
by users none
access to *
by anonymous auth
by self write
by users read
database monitor
access to dn="cn=monitor"
by dn.exact="cn=Manager,dc=myCompany,dc=de" read
by dn.exact="cn=vmail,dc=myCompany,dc=de" read
by * none
database mdb
suffix dc=myCompany,dc=de
directory /var/lib/ldap/myCompany.de
rootdn cn=Manager,dc=myCompany,dc=de
rootpw {SSHA}V5/UQXm9SmzRGjKK2zAKB79eFSaysc2wG9tPIg==
sizelimit unlimited
maxsize 2147483648
checkpoint 128 3
mode 0700
index objectclass,entryCSN,entryUUID eq
index uidNumber,gidNumber,uid,memberUid,loginShell eq,pres
index homeDirectory,mailMessageStore eq,pres
index ou,cn,mail,surname,givenname,telephoneNumber,displayName eq,pres,sub
index nisMapName,nisMapEntry eq,pres,sub
index shadowLastChange eq,pres
index member,uniqueMember eq,pres
index domainName,mtaTransport,accountStatus,enabledService,disabledService eq,pres,sub
index domainAliasName eq,pres,sub
index domainMaxUserNumber eq,pres
index domainAdmin,domainGlobalAdmin,domainBackupMX eq,pres,sub
index domainSenderBccAddress,domainRecipientBccAddress eq,pres,sub
index accessPolicy,hasMember,listAllowedUser,mailingListID eq,pres,sub
index mailForwardingAddress,shadowAddress eq,pres,sub
index backupMailAddress,memberOfGroup eq,pres,sub
index userRecipientBccAddress,userSenderBccAddress eq,pres,sub
index mobile,departmentNumber eq,pres,sub
#Mirror Mode
serverID 001
# Consumer
syncrepl rid=001 \
provider=ldap://rm2.myCompany.de \
bindmethod=simple \
binddn="cn=vmail,dc=myCompany,dc=de" \
credentials="gtV9FwILIcp8Zw8YtGeB1AC9GbGfti" \
searchbase="dc=myCompany,dc=de" \
attrs="*,+" \
type=refreshAndPersist \
interval=00:00:01:00 \
retry="60 +"
# Provider
overlay syncprov
syncprov-checkpoint 50 1
syncprov-sessionlog 50
mirrormode on
There are only two differences in the second node's config file:
[...]
#Mirror Mode
serverID 002
[...]
# Consumer
[...]
provider=ldap://rm2.myCompany.de \
[...]
As mentioned before, the mirroring works perfectly.
Now I need a single connection address for the LDAP clients, i.e. web applications using LDAP as their authentication mechanism.
I read that you can use an OpenLDAP proxy for that purpose. The LDAP client (here: the web application) connects to the LDAP proxy, and the proxy retrieves the authentication data from multiple backend LDAP servers.
I set up an OpenLDAP proxy; it uses CONFIG-DB, not the ancient way. The slapd.conf file it is generated from looks like this:
include /etc/openldap/schema/corba.schema
include /etc/openldap/schema/core.schema
include /etc/openldap/schema/cosine.schema
include /etc/openldap/schema/duaconf.schema
include /etc/openldap/schema/dyngroup.schema
include /etc/openldap/schema/inetorgperson.schema
include /etc/openldap/schema/java.schema
include /etc/openldap/schema/misc.schema
include /etc/openldap/schema/nis.schema
include /etc/openldap/schema/openldap.schema
include /etc/openldap/schema/ppolicy.schema
pidfile /var/run/openldap/slapd.pid
argsfile /var/run/openldap/slapd.args
modulepath /usr/lib/openldap
modulepath /usr/lib64/openldap
moduleload back_ldap.la
loglevel 0
database ldap
readonly yes
protocol-version 3
rebind-as-user
uri "ldap://rm1.myCompany.de:389"
suffix "dc=myCompany,dc=de"
uri "ldap://rm2.myCompany.de:389"
suffix "dc=myCompany,dc=de"
First issue:
When creating the CONFIG-DB using slaptest, the command fails, claiming:
5dc44107 /etc/openldap/slapd.conf: line 48: suffix already served by this backend!.
slaptest: bad configuration directory!
The slaptest command looks like this:
slaptest -f /etc/openldap/slapd.conf -F /etc/openldap/slapd.d/
It is possible that I haven't completely understood the concept, because all the guides I found use subdomain prefixes for the different LDAP backend servers, i.e. instead of:
uri "ldap://rm1.myCompany.de:389"
suffix "dc=myCompany,dc=de"
uri "ldap://rm2.myCompany.de:389"
suffix "dc=myCompany,dc=de"
they use:
uri "ldap://rm1.myCompany.de:389"
suffix "dc=ou1,dc=myCompany,dc=de"
uri "ldap://rm2.myCompany.de:389"
suffix "dc=ou2,dc=myCompany,dc=de"
What I don't understand: on the backend servers there is no ou1 or ou2. How can the proxy be expected to find anything in the backend LDAPs if the DNs do not match?
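One thing that might be relevant to the first issue: slapd-ldap(5) allows several space-separated URIs in a single uri directive, which back_ldap tries in order. That would let one database ldap stanza (and therefore a single suffix) cover both backends. A sketch only, not a verified configuration:
database        ldap
readonly        yes
protocol-version 3
rebind-as-user
suffix          "dc=myCompany,dc=de"
# several URIs in one directive; they are tried in order, giving failover
uri             "ldap://rm1.myCompany.de:389 ldap://rm2.myCompany.de:389"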
I temporarily commented out the second uri in order to check whether, apart from this issue, LDAP queries to the LDAP proxy succeed, but I ran into the second issue.
Second issue:
If I run an ldapsearch directly against the two backend LDAP servers (one after the other), all of the LDAP users are enumerated.
If I run the same ldapsearch against the LDAP proxy, only the user "vmail" is enumerated. I would expect the same users to be listed as in the direct query.
This is the ldapsearch command:
ldapsearch -D "cn=vmail,dc=myCompany,dc=de" -w gtV9FwILIcp8Zw8YtGeB1AC9GbGfti -p 389 -h 192.168.0.92 -b "dc=myCompany,dc=de" -s sub "(objectclass=person)"
Did I miss something?
Thank you for your considerations!
Best regards,
Florian
arminV
(11 rep)
Nov 8, 2019, 10:45 AM
1
votes
1
answers
2580
views
Pacemaker: Primary node is rebooted and comes back as primary instead of standby
We are using pacemaker, corosync to automate failovers. We noticed one behaviour- when primary node is rebooted, the standby node takes over as primary - which is fine. When the node comes back online and services are started on it, it takes back the role of Primary. It should ideally start as stand...
We are using Pacemaker and Corosync to automate failovers. We noticed one behaviour: when the primary node is rebooted, the standby node takes over as primary, which is fine.
When the node comes back online and services are started on it, it takes back the role of primary. Ideally it should come back as standby.
Are we missing any configuration?
> pcs resource defaults
Output:
resource-stickiness: INFINITY
migration-threshold: 0
Stickiness is set to INFINITY. Please suggest.
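Not a fix in itself, but the standard Pacemaker tools can show which node the scheduler prefers for the Master role and why; a generic diagnostic sketch:
# One-shot status including node attributes (pgsql-data-status etc.)
crm_mon -1A
# Re-run the scheduler against the live CIB and print allocation/promotion scores
crm_simulate -sL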
Adding Config details:
======================
[root@Node1 heartbeat]# pcs config show -l
Cluster Name: cluster1
Corosync Nodes:
 Node1 Node2
Pacemaker Nodes:
 Node1 Node2
Resources:
 Master: msPostgresql
  Meta Attrs: master-node-max=1 clone-max=2 notify=true master-max=1 clone-node-max=1
  Resource: pgsql (class=ocf provider=heartbeat type=pgsql)
   Attributes: master_ip=10.70.10.1 node_list="Node1 Node2" pgctl=/usr/pgsql-9.6/bin/pg_ctl pgdata=/var/lib/pgsql/9.6/data/ primary_conninfo_opt="keepalives_idle=60 keepalives_interval=5 keepalives_count=5" psql=/usr/pgsql-9.6/bin/psql rep_mode=async restart_on_promote=true restore_command="cp /var/lib/pgsql/9.6/data/archivedir/%f %p"
   Meta Attrs: failure-timeout=60
   Operations: demote interval=0s on-fail=stop timeout=60s (pgsql-demote-interval-0s)
               methods interval=0s timeout=5s (pgsql-methods-interval-0s)
               monitor interval=4s on-fail=restart timeout=60s (pgsql-monitor-interval-4s)
               monitor interval=3s on-fail=restart role=Master timeout=60s (pgsql-monitor-interval-3s)
               notify interval=0s timeout=60s (pgsql-notify-interval-0s)
               promote interval=0s on-fail=restart timeout=60s (pgsql-promote-interval-0s)
               start interval=0s on-fail=restart timeout=60s (pgsql-start-interval-0s)
               stop interval=0s on-fail=block timeout=60s (pgsql-stop-interval-0s)
 Group: master-group
  Resource: vip-master (class=ocf provider=heartbeat type=IPaddr2)
   Attributes: cidr_netmask=24 ip=10.70.10.2
   Operations: monitor interval=10s on-fail=restart timeout=60s (vip-master-monitor-interval-10s)
               start interval=0s on-fail=restart timeout=60s (vip-master-start-interval-0s)
               stop interval=0s on-fail=block timeout=60s (vip-master-stop-interval-0s)
  Resource: vip-rep (class=ocf provider=heartbeat type=IPaddr2)
   Attributes: cidr_netmask=24 ip=10.70.10.1
   Meta Attrs: migration-threshold=0
   Operations: monitor interval=10s on-fail=restart timeout=60s (vip-rep-monitor-interval-10s)
               start interval=0s on-fail=stop timeout=60s (vip-rep-start-interval-0s)
               stop interval=0s on-fail=ignore timeout=60s (vip-rep-stop-interval-0s)
Stonith Devices:
Fencing Levels:
Location Constraints:
Ordering Constraints:
  promote msPostgresql then start master-group (score:INFINITY) (non-symmetrical)
  demote msPostgresql then stop master-group (score:0) (non-symmetrical)
Colocation Constraints:
  master-group with msPostgresql (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master)
Ticket Constraints:
Alerts:
 No alerts defined
Resources Defaults:
 resource-stickiness: INFINITY
 migration-threshold: 0
Operations Defaults:
 No defaults set
Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: cluster1
 cluster-recheck-interval: 60
 dc-version: 1.1.19-8.el7-c3c624ea3d
 have-watchdog: false
 no-quorum-policy: ignore
 start-failure-is-fatal: false
 stonith-enabled: false
Node Attributes:
 Node1: pgsql-data-status=STREAMING|ASYNC
 Node2: pgsql-data-status=LATEST
Quorum:
  Options:
Thanks !
User2019
(11 rep)
Sep 12, 2019, 09:30 AM
• Last activity: Sep 16, 2019, 06:18 PM
1
votes
1
answers
1268
views
how to unexport NFS share on VCS HA cluster
**see imp update at bottom of orig. question. not sure how to unexport only the 'world' mountable share? I have a NFS server which had a share with world-mountable permissions. To make it mountable only by the clients on a subnet i added the share to /etc/exports, which was empty before. I am not su...
**See the important update at the bottom of the original question.**
I am not sure how to unexport only the 'world'-mountable share. I have an NFS server which had a share with world-mountable permissions. To make it mountable only by the clients on a subnet, I added the share to /etc/exports, which was empty before (I am not sure how the folder was shared in the first place). I put the entry in /etc/exports and re-exported, but the world-mountable share is still shown as available.
before:
[root@nfsServer ~]# exportfs -v
/export/home (rw,wdelay,no_root_squash,no_subtree_check)
# ls -l /var/lib/nfs/xtab
-rw-r--r-- 1 root root 0 Dec 15 2009 /var/lib/nfs/xtab
# ls -l /proc/fs/nfs
-r--r--r-- 1 root root 0 May 2 00:41 exports
change:
added following line to /etc/exports (which was empty before)
/export/home 192.168.253.0/24(rw,wdelay,no_root_squash,no_subtree_check)
then re-export folders:
# exportfs -ra
after:
[root@nfsServer ~]# exportfs -v
/export/home 192.168.253.0/24(rw,wdelay,no_root_squash,no_subtree_check)
/export/home (rw,wdelay,no_root_squash,no_subtree_check)
# cat /etc/exports
/export/home 192.168.253.0/24(rw,wdelay,no_root_squash,no_subtree_check)
# ls -l /var/lib/nfs/xtab
-rw-r--r-- 1 root root 0 Dec 15 2009 /var/lib/nfs/xtab
# ls -l /proc/fs/nfs
-r--r--r-- 1 root root 0 May 2 00:41 exports
[root@nfsServer ~]# ls -ltr /proc/fs/nfsd
total 0
-rw------- 1 root root 0 Mar 1 2017 versions
-rw------- 1 root root 0 Mar 1 2017 threads
-rw------- 1 root root 0 Mar 1 2017 portlist
-rw------- 1 root root 0 Mar 1 2017 nfsv4recoverydir
-rw------- 1 root root 0 Mar 1 2017 nfsv4leasetime
-rw------- 1 root root 0 Mar 1 2017 filehandle
-r--r--r-- 1 root root 0 Mar 1 2017 exports
[root@nfsServer ~]# cd /proc/fs/nfsd
[root@nfsServer nfsd]# cat exports
# Version 1.1
# Path Client(Flags) # IPs
/export/home *,192.168.253.0/24(rw,no_root_squash,sync,wdelay,no_subtree_check)
# cat versions
+2 +3 -4
Note that it has "*" added in front of the /etc/exports entry. I want to know where the "*" entry is coming from and how to get rid of it. All help is appreciated.
system:
Red Hat Enterprise Linux Server release 5.5 (Tikanga) 2.6.18-194.el5 #1 SMP Tue Mar 16 21:52:39 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
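For reference, the kernel's export list can be flushed and rebuilt from /etc/exports by hand; this drops the stray world entry, but presumably whatever created it (see the VCS update below) will re-add it. A sketch only:
# Unexport everything currently exported, then re-export only what /etc/exports lists
exportfs -ua
exportfs -r
# Verify the result
exportfs -v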
*IMP:* Sorry, I forgot to clarify that this is NFS running under VCS HA on Red Hat 5.5. So when I restart NFS, I get an error:
# service nfs stop
Shutting down NFS mountd: [ OK ]
Shutting down NFS daemon: [ OK ]
Shutting down NFS quotas: [ OK ]
Shutting down NFS services: [ OK ]
# service nfs start
Starting NFS services: [ OK ]
Starting NFS quotas: [ OK ]
Starting NFS daemon: [FAILED]
# service nfs start
Starting NFS services: [ OK ]
Starting NFS quotas: [ OK ]
Starting NFS daemon: [FAILED]
but when you check...
# service nfs status
rpc.mountd (pid 24103) is running...
nfsd (pid 24052 24051 24050 24049 24048 24047 24046 24045) is running...
rpc.rquotad (pid 22872 20490 19133) is running...
I figured out that this block in the VCS main.cf sets up the NFS share, but I am not sure how to add the subnet restriction to it:
Share share_home (
Options = "rw, no_root_squash"
PathName = "/export/home"
)
Thanks.
Raj
Rajeev
(256 rep)
May 2, 2018, 12:55 AM
• Last activity: May 2, 2018, 05:04 PM
2
votes
0
answers
434
views
Setting up a kerberized HA NFS share
I'm trying to set up a kerberized NFS share from an HA cluster. I've successfully set up a krb-aware NFS share from a single server, I'm using a mostly identical configuration on the cluster. Exports file from working single server: /nfs *(rw,sec=krb5:krb5i:krb5p) Cluster resource configuration: # p...
I'm trying to set up a kerberized NFS share from an HA cluster. I've successfully set up a krb-aware NFS share from a single server, I'm using a mostly identical configuration on the cluster.
Exports file from working single server:
/nfs *(rw,sec=krb5:krb5i:krb5p)
Cluster resource configuration:
# pcs resource show nfs-export1
Resource: nfs-export1 (class=ocf provider=heartbeat type=exportfs)
Attributes: clientspec=10.1.0.0/255.255.255.0 directory=/nfsshare/exports/export1 fsid=1 options=rw,sec=krb5:krb5i:krb5p,sync,no_root_squash
Operations: monitor interval=10 timeout=20 (nfs-export1-monitor-interval-10)
start interval=0s timeout=40 (nfs-export1-start-interval-0s)
stop interval=0s timeout=120 (nfs-export1-stop-interval-0s)
Client showmount to the working single server:
# showmount -e ceserv
Export list for ceserv:
/nfs *
Client showmount to the floating cluster name:
# showmount -e hafloat
Export list for hafloat:
/nfsshare/exports/export1 10.1.0.0/255.255.255.0
/nfsshare/exports 10.1.0.0/255.255.255.0
Contents of the client's /etc/fstab:
ceserv:/nfs /mnt/nfs nfs4 sec=krb5i,rw,proto=tcp,port=2049
hafloat.ncphotography.lan:export1 /nfsmount nfs4 sec=krb5i,rw,proto=tcp,port=2049
Results of the mount -av command:
# mount -av
mount.nfs4: timeout set for Mon Dec 4 20:57:14 2017
mount.nfs4: trying text-based options 'sec=krb5i,proto=tcp,port=2049,vers=4.1,addr=10.1.0.24,clientaddr=10.1.0.23'
/mnt/nfs : successfully mounted
mount.nfs4: timeout set for Mon Dec 4 20:57:14 2017
mount.nfs4: trying text-based options 'sec=krb5i,proto=tcp,port=2049,vers=4.1,addr=10.1.0.29,clientaddr=10.1.0.23'
mount.nfs4: mount(2): Operation not permitted
mount.nfs4: Operation not permitted
All firewalls have been disabled. All names resolve correctly to IP addresses within the 10.1.0.0/24 network, and all IP addresses reverse-resolve to the correct hostname.
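One thing that is commonly required for Kerberized NFS behind a floating name (offered as a hedged check, not a confirmed diagnosis): every node that can host the service needs an nfs/ service principal for the name the clients actually mount, in addition to its own. A quick way to check on each cluster node:
# List the service principals present in this node's keytab
klist -k /etc/krb5.keytab
# For mounts against hafloat.ncphotography.lan one would expect an entry like
#   nfs/hafloat.ncphotography.lan@<REALM>
# alongside nfs/<node-fqdn>@<REALM>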
John
(17381 rep)
Dec 4, 2017, 09:05 PM
• Last activity: Mar 6, 2018, 03:25 PM