
Unix & Linux Stack Exchange

Q&A for users of Linux, FreeBSD and other Unix-like operating systems

Latest Questions

0 votes
1 answer
2844 views
Apache resource failed to start in Pacemaker
I am using Pacemaker with Corosync to set up a basic Apache HA cluster with 3 nodes running CentOS 7. For some reason, I cannot get the Apache resource started in pcs. Cluster IP: 192.168.200.40

# pcs resource show ClusterIP
 Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: cidr_netmask=24 ip=192.168.200.40
  Operations: monitor interval=20s (ClusterIP-monitor-interval-20s)
              start interval=0s timeout=20s (ClusterIP-start-interval-0s)
              stop interval=0s timeout=20s (ClusterIP-stop-interval-0s)

# pcs resource show WebServer
 Resource: WebServer (class=ocf provider=heartbeat type=apache)
  Attributes: configfile=/etc/httpd/conf/httpd.conf statusurl=http://localhost/server-status
  Operations: monitor interval=1min (WebServer-monitor-interval-1min)
              start interval=0s timeout=40s (WebServer-start-interval-0s)
              stop interval=0s timeout=60s (WebServer-stop-interval-0s)

# pcs status
Cluster name:
WARNING: corosync and pacemaker node names do not match (IPs used in setup?)
Stack: corosync
Current DC: server3.example.com (version 1.1.18-11.el7_5.2-2b07d5c5a9) - partition with quorum
Last updated: Thu Jun 7 21:59:09 2018
Last change: Thu Jun 7 21:45:23 2018 by root via cibadmin on server1.example.com

3 nodes configured
2 resources configured

Online: [ server1.example.com server2.example.com server3.example.com ]

Full list of resources:
 ClusterIP (ocf::heartbeat:IPaddr2): Started server2.example.com
 WebServer (ocf::heartbeat:apache): Stopped

Failed Actions:
* WebServer_start_0 on server3.example.com 'unknown error' (1): call=49, status=Timed Out, exitreason='', last-rc-change='Thu Jun 7 21:46:03 2018', queued=0ms, exec=40002ms
* WebServer_start_0 on server1.example.com 'unknown error' (1): call=53, status=Timed Out, exitreason='', last-rc-change='Thu Jun 7 21:45:23 2018', queued=0ms, exec=40003ms
* WebServer_start_0 on server2.example.com 'unknown error' (1): call=47, status=Timed Out, exitreason='', last-rc-change='Thu Jun 7 21:46:43 2018', queued=1ms, exec=40002ms

Daemon Status:
 corosync: active/enabled
 pacemaker: active/enabled
 pcsd: active/enabled

The httpd instance is **enabled** and **running** on all three nodes. The cluster IP and the individual node IPs can all serve the web page, and the ClusterIP resource also fails over correctly. What could be going wrong with the Apache resource in this case? Thank you very much!

Update: Here is more information from the debug output. It seems Apache is unable to bind to the port, yet there is no error in the Apache log and systemctl status httpd is all green on every node. I can open the web page via the cluster IP and every node IP, and ClusterIP resource failover works fine, too. Any idea why the Apache resource doesn't work with Pacemaker?

# pcs resource debug-start WebServer --full
Operation start for WebServer (ocf:heartbeat:apache) failed: 'Timed Out' (2)
 > stderr: ERROR: (98)Address already in use: AH00072: make_sock: could not bind to address [::]:80
 (98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:80
 no listening sockets available, shutting down
 AH00015: Unable to open logs
 > stderr: INFO: apache not running
 > stderr: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up
 > stderr: INFO: apache not running
 > stderr: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up
 > stderr: INFO: apache not running
 > stderr: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up
 > stderr: INFO: apache not running
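For context, the debug output above shows the start timing out because port 80 is already taken: the distribution-managed httpd is enabled and running outside the cluster, so the ocf:heartbeat:apache agent cannot start its own instance. A minimal, hedged sketch of handing the daemon over to Pacemaker (run on every node; resource name taken from the question):

systemctl stop httpd             # free port 80 so the resource agent can bind
systemctl disable httpd          # let only the cluster start Apache from now on
pcs resource cleanup WebServer   # clear the recorded start failures so the cluster retries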
cody (67 rep)
Jun 8, 2018, 04:16 PM • Last activity: Jul 15, 2025, 02:03 AM
0 votes
1 answer
2303 views
Secondary DRBD node does not auto-start in Pacemaker+Corosync setup
I am trying to set up a 2-PC cluster with shared resources: ClusterIP, ClusterSamba, ClusterNFS, DRBD (cloned resource), and a DRBDFS. The beginning of the project followed the [Clusters from Scratch](https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html/Clusters_from_Scratch/index.html) guide. When everything in this guide is done, it works without problems.

So, I wanted to use parts of that guide and build my own setup: I created one shared IP (ClusterIP) that is automatically assigned to one node, and (here is where it gets tricky) on that node, I mount my /dev/drbd1 device to /exports and then share this mount through **SAMBA** and **NFS**.

When I start the cluster, all resources come up as they should, _but DRBD does not go up on the secondary node_ (Primary/Unknown). If I bring it up manually, it syncs and works. Also, when I stop the cluster (or forcibly reboot the first node), all resources transfer to the other node and everything works, _except DRBD on the other node goes into an Unknown state_.

### So now, here is the problem:

**Why does DRBD go down on the secondary node when I stop the cluster? Or why doesn't it start in the Secondary role on the secondary node?**

Sorry if my description is bad.

---

## Here are the commands I used
# apt install -y pacemaker pcs psmisc policycoreutils-python-utils drbd-utils samba nfs-kernel-server 
# systemctl start pcsd.service
# systemctl enable pcsd.service
# passwd hacluster
# pcs host auth alice bob
# pcs cluster setup myCluster alice bob --force
# pcs cluster start --all
# pcs property set stonith-enabled=false
# pcs property set no-quorum-policy=ignore
# modprobe drbd
# echo drbd >/etc/modules-load.d/drbd.conf
# drbdadm create-md r0
# drbdadm up r0
# drbdadm primary r0 --force
# mkfs.ext4 /dev/drbd1
# systemctl disable smbd
# systemctl disable nfs-kernel-server.service 
# mkdir /exports
# vi /etc/samba/smb.conf 
# vi /etc/exports 
# pcs resource create ClusterIP ocf:heartbeat:IPaddr2 ip=10.1.1.30 cidr_netmask=24 op monitor interval=30s
# pcs resource defaults resource-stickiness=100
# pcs resource op defaults timeout=240s
# pcs resource create ClusterSamba lsb:smbd op monitor interval=60s
# pcs resource create ClusterNFS ocf:heartbeat:nfsserver op monitor interval=60s
# pcs resource create DRBD ocf:linbit:drbd drbd_resource=r0 op monitor interval=60s
# pcs resource promotable DRBD promoted-max=1 promoted-node-max=1 clone-max=2 clone-node-max=1 notify=true
# pcs resource create DRBDFS Filesystem device="/dev/drbd1" directory="/exports" fstype="ext4"
# pcs constraint order ClusterIP then ClusterNFS
# pcs constraint order ClusterNFS then ClusterSamba
# pcs constraint order promote DRBD-clone then start DRBDFS
# pcs constraint order DRBDFS then ClusterNFS
# pcs constraint order ClusterIP then DRBD-clone
# pcs constraint colocation ClusterSamba with ClusterIP
# pcs constraint colocation add ClusterSamba with ClusterIP
# pcs constraint colocation add ClusterNFS with ClusterIP
# pcs constraint colocation add DRBDFS with DRBD-clone INFINITY with-rsc-role=Master
# pcs constraint colocation add DRBD-clone with ClusterIP
# pcs cluster stop --all && sleep 2 && pcs cluster start --all

---

## Configs and stats

### /etc/drbd.d/r0.res
resource r0 {
 device /dev/drbd1;
 disk /dev/sdb;
 meta-disk internal;
 net {
  allow-two-primaries;
 }
 on alice {
  address 10.1.1.31:7788;
 }
 on bob {
  address 10.1.1.32:7788;
 } 
}

---

### /etc/corosync/corosync.conf
totem {
    version: 2
    cluster_name: myCluster
    transport: knet
    crypto_cipher: aes256
    crypto_hash: sha256
}

nodelist {
    node {
        ring0_addr: alice
        name: alice
        nodeid: 1
    }

    node {
        ring0_addr: bob
        name: bob
        nodeid: 2
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
}

logging {
    to_logfile: yes
    logfile: /var/log/corosync/corosync.log
    to_syslog: yes
    timestamp: on
}

---

### pcs status
Cluster name: myCluster
Stack: corosync
Current DC: alice (version 2.0.1-9e909a5bdd) - partition with quorum
Last updated: Fri May 15 12:28:30 2020
Last change: Fri May 15 11:04:50 2020 by root via cibadmin on bob

2 nodes configured
6 resources configured

Online: [ alice bob ]

Full list of resources:

 ClusterIP      (ocf::heartbeat:IPaddr2):       Started alice
 ClusterSamba   (lsb:smbd):     Started alice
 ClusterNFS     (ocf::heartbeat:nfsserver):     Started alice
 Clone Set: DRBD-clone [DRBD] (promotable)
 Masters: [ alice ]
 Stopped: [ bob ]
 DRBDFS (ocf::heartbeat:Filesystem):    Started alice

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

---

### pcs constraint --full
Location Constraints:

Ordering Constraints:
  start ClusterIP then start ClusterNFS (kind:Mandatory) (id:order-ClusterIP-ClusterNFS-mandatory)
  start ClusterNFS then start ClusterSamba (kind:Mandatory) (id:order-ClusterNFS-ClusterSamba-mandatory)
  promote DRBD-clone then start DRBDFS (kind:Mandatory) (id:order-DRBD-clone-DRBDFS-mandatory)
  start DRBDFS then start ClusterNFS (kind:Mandatory) (id:order-DRBDFS-ClusterNFS-mandatory)
  start ClusterIP then start DRBD-clone (kind:Mandatory) (id:order-ClusterIP-DRBD-clone-mandatory)
  start ClusterIP then promote DRBD-clone (kind:Mandatory) (id:order-ClusterIP-DRBD-clone-mandatory-1)

Colocation Constraints:
  ClusterSamba with ClusterIP (score:INFINITY) (id:colocation-ClusterSamba-ClusterIP-INFINITY)
  ClusterNFS with ClusterIP (score:INFINITY) (id:colocation-ClusterNFS-ClusterIP-INFINITY)
  DRBDFS with DRBD-clone (score:INFINITY) (with-rsc-role:Master) (id:colocation-DRBDFS-DRBD-clone-INFINITY)
  DRBD-clone with ClusterIP (score:INFINITY) (id:colocation-DRBD-clone-ClusterIP-INFINITY)

Ticket Constraints:

---

### /proc/drbd
version: 8.4.10 (api:1/proto:86-101)
srcversion: 983FCB77F30137D4E127B83 

 1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
    ns:0 nr:4 dw:8 dr:17 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:4
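One thing worth noting in the constraints above: colocating the whole DRBD-clone with ClusterIP (score INFINITY) pins every clone instance to the node holding the IP, which would keep the second DRBD instance from ever starting. A hedged sketch of constraining only the promoted (Master) instance instead, using the constraint id from the pcs constraint --full output (pcs syntax varies slightly between releases):

pcs constraint remove colocation-DRBD-clone-ClusterIP-INFINITY
pcs constraint colocation add master DRBD-clone with ClusterIP INFINITY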
Miki (31 rep)
May 15, 2020, 11:12 AM • Last activity: Jun 19, 2025, 10:03 PM
2 votes
1 answer
2579 views
After failover Pacemaker moves resource back when node comes back
I'm using Pacemaker & Corosync for my cluster. When a node dies, Pacemaker moves my resources to another online node; everything is OK here. But when the dead node comes back, Pacemaker moves the resources back. I don't have any "location" line in my config, and I also tried the "unmove" command, but nothing changed. I failed somewhere and need to find the reason.

**crm configure sh**

node 1: DEV1
node 2: DEV2
primitive poolip IPaddr2 \
        params ip=10.1.60.33 nic=enp2s0f0 cidr_netmask=24 \
        meta migration-threshold=2 target-role=Started \
        op monitor interval=20 timeout=20 on-fail=restart
primitive gui systemd:gui \
        op monitor interval=20s \
        meta target-role=Started
primitive gui-ip IPaddr2 \
        params ip=10.1.60.35 nic=enp2s0f0 cidr_netmask=24 \
        meta migration-threshold=2 target-role=Started \
        op monitor interval=20 timeout=20 on-fail=restart
colocation cluster-gui inf: gui gui-ip
order gui-after-ip Mandatory: gui-ip gui
property cib-bootstrap-options: \
        have-watchdog=false \
        dc-version=2.0.0-1-8cf3fe749e \
        cluster-infrastructure=corosync \
        cluster-name=mycluster \
        stonith-enabled=false \
        no-quorum-policy=ignore \
        last-lrm-refresh=1545920437
rsc_defaults rsc-options: \
        migration-threshold=10 \
        resource-stickiness=100

**pcs resource defaults**

migration-threshold=10
resource-stickiness=100

**pcs resource show gui**

Resource: gui (class=systemd type=gui)
 Meta Attrs: target-role=Started
 Operations: monitor interval=20s (gui-monitor-20s)
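When resources drift back to a recovered node even though stickiness is set, the allocation scores usually reveal which constraint or default is winning; crm_simulate ships with Pacemaker and can print them from the live CIB. A small sketch (raising stickiness is shown only as an illustration):

crm_simulate -sL                                          # show allocation scores for the live cluster
crm configure rsc_defaults resource-stickiness=INFINITY   # keep resources where they are after failover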
Ozbit (439 rep)
Jan 2, 2019, 08:58 AM • Last activity: Jun 14, 2025, 09:07 PM
1 votes
1 answer
2618 views
Pacemaker Virtual IP cannot be routed outside of its network
I have a server cluster consisted of following setup: 2 Virtual Servers with 2 NIC's. eth0 (private network 10.0.0.0/16) and eth1 (public network 77.1.2.0/24 with gateway as 77.1.2.1) For HA-01 VPS i have Private IP on eth0 set as 10.0.0.1 For HA-02 VPS i have Private IP set on eth0 as 10.0.0.2 Pacemaker/Corosync Cluster has been established between private IP addresses and Virtual IP (77.1.2.4) defined as clone Resource (IPAddr2) so it can float between two nodes. pcs resource create VirtualIP1 ocf:heartbeat:IPaddr2 ip="77.1.2.4" cidr_netmask="24" nic="eth1" clusterip_hash="sourceip-sourceport" op start interval="0s" timeout="60s" op monitor interval="1s" timeout="20s" op stop interval="0s" timeout="60s" clone interleave=true ordered=true Problem is, i cannot reach that IP address from world. I noticed that there is a route missing, so i add the static route ip r add default via 77.1.2.1 dev eth1 But i still cannot ping google.com from those servers nor world can see them on that IP. I also tried adding IP addresses from same subnet on eth1 like this: HA-01 eth1: 77.1.2.2 HA-02 eth1: 77.1.2.3 Servers can be seen on those IPs by world but if i add VirtualIP resource i cannot reach them on Virtual IP address. I also tried adding a source ip in routing table ip r add default via 77.1.2.1 src 77.1.2.4 to no avail. I don't know what am i supposed to do to get this VirtualIP working. I can reach 77.1.2.4 (Virtual IP Address) from other servers on that network, but not outside that network. Firewall is established and high availability ports are passed via command firewall-cmd --add-service="high availability"; firewall-cmd --add-service="high availability" --permanent Is there anything here that i am missing? If i add that address (77.1.2.4 - Virtual IP) alone on the interface of only one of those servers, it will work.... So is there an issue with ARP table perhaps or maybe router blocking some traffic?
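One detail that often bites here: when IPaddr2 is cloned with clusterip_hash, the agent switches to the iptables CLUSTERIP mode, which answers ARP with a multicast MAC address; many upstream routers refuse to learn such an entry, so the address stays reachable only from hosts on the same L2 segment. A hedged sketch of the plain failover variant (same parameters as in the question, load-sharing given up):

pcs resource delete VirtualIP1
pcs resource create VirtualIP1 ocf:heartbeat:IPaddr2 ip=77.1.2.4 cidr_netmask=24 nic=eth1 op monitor interval=1s timeout=20s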
Marko Todoric (437 rep)
Jul 19, 2019, 02:54 PM • Last activity: Apr 15, 2025, 03:08 AM
1 votes
1 answer
25 views
Drbd promote only after stonith started
I want a DRBD-based two-node cluster to start resources in the following order:

1. on both nodes, start stonith:fence_ipmilan
2. on one node, promote drbd-clone
3. on the same node as the DRBD promotion, start all NFS server resources (IP, export, …)

But how do I tell Pacemaker to promote drbd-clone only after stonith:fence_ipmilan has started on each of the two nodes? I tried

pcs constraint order set ipmi-fence-memverge ipmi-fence-memverge2 action=start require-all=true sequential=false set ha-nfs-clone action=promote sequential=false require-all=false

but it seems that stonith:fence_ipmilan and the drbd-clone promotion still start simultaneously…

Anton
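A hedged alternative to the resource-set syntax, expressed as two plain ordering constraints (resource names taken from the question; whether ordering against stonith resources is honoured can depend on the Pacemaker version):

pcs constraint order start ipmi-fence-memverge then promote ha-nfs-clone
pcs constraint order start ipmi-fence-memverge2 then promote ha-nfs-clone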
Anton Gavriliuk (11 rep)
Feb 4, 2025, 03:03 PM • Last activity: Feb 4, 2025, 07:12 PM
1 votes
2 answers
8573 views
pcs stonith not working
I have 2 virtual CentOS 7 nodes, and root can log in passwordlessly between them. I have configured stonith like this, but the services are not coming up and fencing is not happening. I'm new to this; could someone help me rectify the issue?

[root@node1 cluster]# pcs stonith create nub1 fence_virt pcmk_host_list="node1"
[root@node1 cluster]# pcs stonith create nub2 fence_virt pcmk_host_list="node2"
[root@node1 cluster]# pcs stonith show
 nub1 (stonith:fence_virt): Stopped
 nub2 (stonith:fence_virt): Stopped
[root@node1 cluster]# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: node2 (version 1.1.15-11.el7_3.5-e174ec8) - partition with quorum
Last updated: Tue Jul 25 07:03:37 2017
Last change: Tue Jul 25 07:02:00 2017 by root via cibadmin on node1

2 nodes and 3 resources configured

Online: [ node1 node2 ]

Full list of resources:
 ClusterIP (ocf::heartbeat:IPaddr2): Started node1
 nub1 (stonith:fence_virt): Stopped
 nub2 (stonith:fence_virt): Stopped

Failed Actions:
* nub1_start_0 on node1 'unknown error' (1): call=56, status=Error, exitreason='none', last-rc-change='Tue Jul 25 07:01:34 2017', queued=0ms, exec=7006ms
* nub2_start_0 on node1 'unknown error' (1): call=58, status=Error, exitreason='none', last-rc-change='Tue Jul 25 07:01:42 2017', queued=0ms, exec=7009ms
* nub1_start_0 on node2 'unknown error' (1): call=54, status=Error, exitreason='none', last-rc-change='Tue Jul 25 07:01:26 2017', queued=0ms, exec=7010ms
* nub2_start_0 on node2 'unknown error' (1): call=60, status=Error, exitreason='none', last-rc-change='Tue Jul 25 07:01:34 2017', queued=0ms, exec=7013ms

Daemon Status:
 corosync: active/enabled
 pacemaker: active/enabled
 pcsd: active/enabled

[root@node1 cluster]# pcs stonith fence node2
Error: unable to fence 'node2'
Command failed: No route to host
[root@node1 cluster]# pcs stonith fence nub2
Error: unable to fence 'nub2'
Command failed: No such device
[root@node1 cluster]# ping node2
PING node2 (192.168.100.102) 56(84) bytes of data.
64 bytes from node2 (192.168.100.102): icmp_seq=1 ttl=64 time=0.247 ms
64 bytes from node2 (192.168.100.102): icmp_seq=2 ttl=64 time=0.304 ms
^C
--- node2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.247/0.275/0.304/0.032 ms
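fence_virt only works when fence_virtd is configured and listening on the virtualization host and the shared key (usually /etc/cluster/fence_xvm.key) is present on every guest; the "No route to host" error suggests the guests cannot reach that daemon at all. A quick sanity check from one of the nodes, assuming the default multicast setup:

fence_xvm -o list    # should print the VMs known to fence_virtd on the host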
Mohammed Ali (691 rep)
Jul 25, 2017, 11:10 AM • Last activity: Feb 10, 2024, 02:01 AM
-3 votes
1 answer
373 views
Resource Group for file share not starting
I have a client that is trying to configure a file share for a 2-node cluster. Running this command seems to fix it, but as soon as we switch over, it stops again. Any ideas?

[screenshot: pcs status output — https://i.sstatic.net/yv4Fs.png]
David Kranes (1 rep)
Dec 18, 2023, 02:35 PM • Last activity: Dec 28, 2023, 02:28 PM
0 votes
2 answers
2897 views
RHEL High-Availability Cluster using pcs, configuring service as a resource
I have a 2 node cluster on RHEL 6.9. Everything is configured except I'm having difficulty with an application launched via shell script that created into a service (in /etc/init.d/myApplication), which I'll just call "myApp". From that application, I did a pcs resource create myApp lsb:myApp op monitor interval=30s op start on-fail=standby. I am new to using this suite of software but it's for work. What I need is for this application to be launched on both nodes simultaneously as it has to be started manually so if the first node fails, it would need intervention if it were not already active on the passive node. I have two other services: -VirtIP (ocf:heartbeat:IPaddr2) for providing a service IP for the application server -Cron (lsb:crond) to synchronize the application files (we are not using shared storage) I have the VirtIP and Cron as dependents via colocation to myApp. I've tried master/slave as well as cloning but I must be missing something regarding their config. If I take the application offline, pacemaker does not detect the service has gone down and pcs status outputs that myApp is still running on the node (or nodes depending on my config). I'm also sometimes getting the issue that the service running the app is stopped by pacemaker on the passive node. Which is the way I need to configure this? I've gone through the RHEL documentation but I'm still stuck. How do I get pacemaker to initiate failover if myApp service goes down? I don't know why it's not detecting the service has stopped in some cases. EDIT: So for testing purposes, I removed the password requirement for starting/restarting and the service starts/restarts fine as expected and the colocation dependent resources stop/start as expected. But stopping the myApp service does not reflect as a stopped resource but simply stays at Started node1. Likewise, simulating a failover via putting node1 into standby simply stops all resources on node1.
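Two points that may help here, sketched with the names from the question: Pacemaker only notices a dead lsb: resource if the init script's status action returns LSB-compliant exit codes (0 when running, 3 when stopped), and a resource can be run on both nodes at once by cloning it rather than using master/slave.

/etc/init.d/myApplication status; echo $?   # should print 3 when the app is stopped, 0 when it is running
pcs resource clone myApp clone-max=2 clone-node-max=1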
Greg (187 rep)
Sep 29, 2017, 07:52 AM • Last activity: Sep 6, 2023, 09:56 PM
0 votes
1 answer
358 views
Prevent promotion on specific node in Pacemaker
I have a drbd + pacemaker cluster with three nodes, one being a quorum device only. I'm trying to configure the pacemaker resource so that the promotable drbd-resource should run on all three devices, but it should never be promoted on the quorum device. I've tried setting location constraints on the resource, but that results in pacemaker not starting the resource at all on the quorum device so drbd can't keep quorum on a failover. The desired state would be: - drbd resource is started on all three nodes - drbd resource is promotable - pacemaker will never promote the quorum device I can't find anything in the documentation, but what I'm envisioning would be a parameter like "don't promote on device X" that I have missed for the drbd resource.
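What this seems to call for is a rule-based location constraint that forbids only the promoted role on the quorum node while leaving the clone free to run there. A hedged sketch (the resource name DRBD-clone and node name qnode are placeholders; newer pcs/Pacemaker releases spell the role Promoted):

pcs constraint location DRBD-clone rule role=master score=-INFINITY '#uname' eq qnode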
comrain (3 rep)
Mar 13, 2023, 07:23 AM • Last activity: Mar 13, 2023, 06:55 PM
0 votes
1 answer
11525 views
DRBD - 'node1' not defined in your config (for this host) - Error when setting Primary
I am getting the following error when trying to set the Primary node for DRBD:

'node1' not defined in your config (for this host).

I know this is related to DNS/hostname/hosts and the clusterdb.res config. I know this because I originally got an error when trying to start clusterdb.res if node1 didn't resolve correctly. So what confuses me is that I can start clusterdb.res if I either use this command on the hosts (which I have used):

hostnamectl set-hostname $(uname -n | sed s/\\..*//)

to make the hostname resolve to node1 instead of node1.localdomain, or add node1.localdomain to the config; either works. But I have tried all combinations and can't seem to get this command to take:

drbdadm primary --force node1 && cat /proc/drbd

**My configs**

/etc/drbd.d/clusterdb.res

resource clusterdb{
protocol C;
meta-disk internal;
device /dev/drbd0;
startup {
 wfc-timeout 30;
 outdated-wfc-timeout 20;
 degr-wfc-timeout 30;
}
net {
 cram-hmac-alg sha1;
 shared-secret sync_disk;
}
syncer {
 rate 10M;
 al-extents 257;
 on-no-data-accessible io-error;
 verify-alg sha1;
}
on node1 {
 disk /dev/sda3;
 address 192.168.1.216:7788;
}
on node2 {
 disk /dev/sda3;
 address 192.168.1.217:7788;
}
}

/etc/hosts:

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.216 node1
192.168.1.217 node2

/etc/hostname

node#

My full write up ATM (wip)

**Edits:**

[root@node1 ~]# hostname
node1
[root@node1 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
127.0.1.1 node1
192.168.1.216 node1
192.168.1.217 node2
[root@node1 ~]#

Update: I have gotten this to work with LVM by following this guide exactly, so I think my issue actually lies with the following lines of code. For now I will stick with LVM since it works, unless somebody else really wants to work on this. (My working LVM writeup)

device /dev/drbd0;
or
device /dev/drbd0;

The reason I say this is that I used the same hosts/hostname/shortname/ip_addr but with LVM and it worked; maybe I missed something the first time, which I fixed in my new VM template (I started from scratch to build LVM).
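One thing that stands out from the error and the config above: drbdadm primary expects a DRBD resource name, not a host name, and the resource defined in clusterdb.res is called clusterdb. A hedged sketch of the promotion step with that name (assuming the metadata was already created and the resource is up):

drbdadm primary --force clusterdb && cat /proc/drbd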
FreeSoftwareServers (2682 rep)
May 1, 2016, 01:59 AM • Last activity: Mar 8, 2023, 02:50 AM
4 votes
1 answer
1660 views
Clustered NFS server reply ERR 24: Auth Bogus Credentials (seal broken)
I have 4 servers on the VirtualBox. Two of the servers are a CentOS 7 cluster with Pacemaker(corosync), and they have an NFSv4 server in Active/Passive mode. There are also 2 clients with CentOS 6, also using this NFS server. The problem does not always occur, but sometimes when I manually or automatically failover from the active NFS server cluster, both clients give the error: *Permission denied.* The tcpdump from the clients shows: [17:24:29.271467] IP client.example.net.34236755563 > server.example.net.nfs 112 getattr [|nfs] [17:24:29.271619] IP server.example.net.nfs > client.example.net.3423675563: reply ERR 24: Auth Bogus Credentials (seal broken) Until this problem is solved nothing is working: I have tried to transfer to NFSv3, tried different cluster configurations, tried a grace period for NFSv4 from 10 to 90 seconds, with no luck. Cluster configuration: node 1: storage1 node 2: storage2 primitive p_drbd_nfs ocf:linbit:drbd \ params drbd_resource=cgp \ op monitor interval=31s role=Master \ op monitor interval=29s role=Slave \ op start interval=0 timeout=240s \ op stop interval=0 timeout=120s primitive p_fs_home Filesystem \ params device="/dev/drbd0" directory="/mnt" fstype=xfs options="noatime,nobarrier" \ op monitor interval=10s \ meta is-managed=true primitive p_ip_nfs IPaddr2 \ params ip=192.168.56.100 cidr_netmask=24 \ op monitor interval=30s \ meta is-managed=true primitive p_nfs_exports exportfs \ params fsid=0 directory="/mnt" options="rw,async,no_wdelay,mountpoint,insecure,no_subtree_check,no_root_squash" clientspec="192.168.56.0/255.255.255.0" wait_for_leasetime_on_stop=true rmtab_backup=none \ op monitor interval=10s \ op stop interval=0 timeout=120s \ meta is-managed=true primitive p_nfsserver nfsserver \ params grace_time=90 proc_num=16 \ op monitor interval=30s \ meta is-managed=true primitive p_ping ocf:pacemaker:ping \ params host_list=192.168.56.1 multiplier=1000 attempts=1 timeout=3 name=p_ping \ op monitor interval=5 timeout=60 ms ms_drbd_nfs p_drbd_nfs \ meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true is-managed=true clone cl_p_ping p_ping \ meta is-managed=true target-role=Started location l_0 ms_drbd_nfs \ rule $role=Master -inf: not_defined p_ping or p_ping lte 0 colocation c_1 inf: p_fs_home ms_drbd_nfs:Master colocation c_2 inf: p_nfsserver p_fs_home colocation c_3 inf: p_nfs_exports p_nfsserver colocation c_4 inf: p_ip_nfs p_nfs_exports order o_1 inf: ms_drbd_nfs:promote p_fs_home:start order o_2 inf: p_fs_home p_nfsserver order o_3 inf: p_nfsserver p_nfs_exports order o_4 inf: p_nfs_exports p_ip_nfs property cib-bootstrap-options: \ dc-version=1.1.10-32.el7_0.1-368c726 \ cluster-infrastructure=corosync \ stonith-enabled=false \ no-quorum-policy=ignore \ last-lrm-refresh=1428329105 rsc_defaults rsc-options: \ resource-stickiness=200 Here is a string from the client fstab file: 192.168.56.100:/ /mnt nfs nfsvers=4,proto=tcp,rsize=32768,wsize=32768,hard,timeo=300,retrans=2,bg,actimeo=3,noatime,nodiratime 0 0
Max Karpenkov (41 rep)
Apr 9, 2015, 03:11 PM • Last activity: Jan 8, 2023, 02:02 AM
0 votes
1 answer
396 views
How to tell if a VG is clustered?
I have a CentOS 7 Pacemaker cluster with GFS2 filesystems mounted. I'm fairly certain that vgchange -cy vg_name was NOT run during setup. I tried running vgchange --test -cy vg_name and it tells me the volume group is already clustered. In Linux 6, service clvmd status will show whether the VG is clustered or not. However, on Linux 7 the pcs resource show clvmd output is quite different and I'm not sure what to look for.

pcs resource show clvmd
 Resource: clvmd (class=ocf provider=heartbeat type=clvm)
  Operations: monitor interval=30s on-fail=fence (clvmd-monitor-interval-30s)
              start interval=0s timeout=90s (clvmd-start-interval-0s)
              stop interval=0s timeout=90s (clvmd-stop-interval-0s)

Would creating the filesystem resources have done the vgchange if needed? Is there anything else I can check?
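The clustered flag can also be read straight from the LVM metadata, which sidesteps the clvmd/pcs differences between releases; a small sketch (vg_name is a placeholder):

vgs -o vg_name,vg_attr vg_name          # a 'c' as the 6th attribute character means the VG is clustered
vgdisplay vg_name | grep -i clustered   # prints "Clustered  yes" for a clustered VG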
ex_submariner (1 rep)
Sep 22, 2022, 05:56 PM • Last activity: Oct 2, 2022, 02:25 AM
1 votes
1 answer
488 views
Where are libvirt's VM definitions "originals" stored, and how to sync them across multiple nodes?
Migrating from Xen's xm to Xen's xl under control of libvirt, I wonder: Where does libvirt store the "originals" of VM configurations? I found that my PVM configurations are stored in /etc/libvirt/libxl/, but when viewing such files I see a comment saying that the file has been created automatically and should not be edited ("use virsh edit ..."). I also found XML and JSON files in /var/lib/xen, named after the Domain ID and UUID of the VM.

As I'm configuring an HA cluster, I'd like to synchronize VM configurations across all cluster nodes (allowing live-migration). In the past, syncing /etc/xen/vm was enough, but for libvirt it seems to be much more complicated: sometimes I'll have to virsh define a VM from the XML file, virsh destroy seems to destroy not only the running VM but also its configuration, and virsh undefine also seems to remove the XML file in /etc/libvirt/libxl/. I don't know how to synchronize the configuration across the cluster nodes.

The major problem I see is this: after csync2-ing the XML files defining the VM configurations to the other cluster nodes, I see the changes in the /etc/libvirt/libxl/ files, which say "do not edit; use virsh edit instead". However, when I use virsh edit for one of those files, the contents I see in the editor are not what I see in the XML files located in /etc/libvirt/libxl/. Maybe re-phrase the question to: *If I update the XML files in /etc/libvirt/libxl/ (like via csync2), how can I ensure that libvirt uses the updated configurations?*

## Update

This question became more important after I had added a block device for paging using xl block-attach corresponding to the edited configuration. When the VM was live-migrated to another node in the cluster, the added disk was not transferred to the VM, so the VM froze when trying to access that disk. So obviously the configuration of the current machine was not used for live-migration, and the saved configuration in the XML files wasn't used either.
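For what it's worth, libvirt only reads those files when a domain is (re)defined, so the usual pattern is to treat virsh dumpxml output as the transportable copy and re-define it on the peer instead of syncing /etc/libvirt/libxl/ directly. A hedged sketch (the domain name myvm is a placeholder):

virsh dumpxml myvm > /tmp/myvm.xml    # export the current definition
# copy /tmp/myvm.xml to the other node (csync2, scp, ...)
virsh define /tmp/myvm.xml            # make libvirt pick up the updated definition there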
U. Windl (1715 rep)
Feb 17, 2021, 12:08 PM • Last activity: Sep 8, 2022, 09:52 AM
0 votes
1 answer
200 views
Convert puppet manifest config to hiera
I installed a Corosync/Pacemaker cluster via Puppet. Now I would like to keep my data in a Hiera file. How should I convert the cs_primitive section into a YAML file?

cs_primitive { 'nfsshare_fs':
  primitive_class => 'ocf',
  primitive_type  => 'Filesystem',
  provided_by     => 'heartbeat',
  parameters      => {
    'device'    => '/dev/disk/lvname',
    'directory' => '/share',
    'fstype'    => 'ext4'
  },
}->

I tried the code below, but it didn't work.

corosync::cs_primitive:
  'nfsshare_fs':
    primitive_class: 'ocf'
    primitive_type: 'Filesystem'
    provided_by: 'heartbeat'
    parameters:
      device: '/dev/disk/by-id/lvname'
      directory: '/share'
      fstype: 'ext4'

Thanks.
fortunate1357 (1 rep)
Apr 4, 2022, 06:21 PM • Last activity: Jul 14, 2022, 07:27 PM
1 votes
1 answer
1468 views
Debian 10 Pacemaker-Cluster: GFS2 Mount fails because of "Global lock failed: check that global lockspace is started."
I'm trying to setup a new Debian 10 cluster with three instances. My stack is based on pacemaker, corosync, dlm, and lvmlockd with a GFS2 volume. All servers have access to the GFS2 volume but I can't mount it with pacemaker or manually when using the GFS2 filesystem. I configured corosync and all three instances are online. I continued with dlm and lvm configuration. Here my configuration steps for LVM and pacemaker: LVM: sudo nano /etc/lvm/lvm.conf --> Set locking_type = 1 and use_lvmlockd = 1 Pacemaker Resources: sudo pcs -f stonith_cfg stonith create meatware meatware hostlist="firmwaredroid-swarm-1 firmwaredroid-swarm-2 firmwaredroid-swarm-3" op monitor interval=60s sudo pcs resource create dlm ocf:pacemaker:controld \ op start timeout=90s interval=0 \ op stop timeout=100s interval=0 sudo pcs resource create lvmlockd ocf:heartbeat:lvmlockd \ op start timeout=90s interval=0 \ op stop timeout=100s interval=0 sudo pcs resource group add base-group dlm lvmlockd sudo pcs resource clone base-group \ meta interleave=true ordered=true target-role=Started The pcs status shows that all resources are up and online. After the pacemaker configuration I tried to setup a shared Volume Group to add the Filesystem resource to pacemaker but all the commands fail with Global lock failed: check that global lockspace is started. sudo pvcreate /dev/vdb --> Global lock failed: check that global lockspace is started sudo vgcreate vgGFS2 /dev/vdb —shared --> Global lock failed: check that global lockspace is started I then tried to directly format the /dev/vdb with mkfs.gfs2 which works but seems to me a step in the wrong direction, because mounting the volume then always fails: sudo mkfs.gfs2 -p lock_dlm -t firmwaredroidcluster:gfsvolfs -j 3 /dev/gfs2share/lvGfs2Share sudo mount -v -t "gfs2" /dev/vdb ./swarm_file_mount/ mount: /home/debian/swarm_file_mount: mount(2) system call failed: Transport endpoint is not connected. I tried several configurations like starting lvmlockd -g dlm or debugging dlm with dlm_controld -d but I don't find any infos on how to do it. On the web I found some RedHat forums that discuss similar errors but do not provide any solutions due to a paywall. How can I start or initialise the global lock with dlm so that I can mount the GFS2 correctly on the pacemaker Debian cluster? Or in other words what's wrong with my dlm configuration? Thx for any help!
Me7e0r (11 rep)
Jun 23, 2021, 10:53 AM • Last activity: Jul 12, 2021, 03:37 PM
0 votes
0 answers
459 views
HA-Cluster / corosync / pacemaker: Active-Active cluster with service ip / service ip is not switching
How to configure crm to migrate the ServiceIP if one Service is failed? node 1: web01a \ attributes standby=off node 2: web01b \ attributes standby=off primitive Apache2 systemd:apache2 \ operations $id=Apache2-operations \ op start interval=0 timeout=100 \ op stop interval=0 timeout=100 \ op monitor interval=15 timeout=100 start-delay=15 \ meta primitive PHP-FPM systemd:php7.4-fpm \ operations $id=PHP-FPM-operations \ op start interval=0 timeout=100 \ op stop interval=0 timeout=100 \ op monitor interval=15 timeout=100 start-delay=15 \ meta primitive Redis systemd:redis-server \ operations $id=Redis-operations \ op start interval=0 timeout=100 \ op stop interval=0 timeout=100 \ op monitor interval=15 timeout=100 start-delay=15 \ meta primitive ServiceIP IPaddr2 \ params ip=1.2.3.4 \ operations $id=ServiceIP-operations \ op monitor interval=10 timeout=20 start-delay=0 \ op_params migration-threshold=1 \ meta primitive lsyncd systemd:lsyncd \ op start interval=0 timeout=100 \ op stop interval=0 timeout=100 \ op monitor interval=15 timeout=100 start-delay=15 \ meta target-role=Started group ActiveNode ServiceIP lsyncd group WebServer Apache2 PHP-FPM Redis clone cl_WS WebServer \ meta clone-max=2 notify=true interleave=true colocation col_cl_WS_ActiveNode 100: cl_WS ActiveNode property cib-bootstrap-options: \ have-watchdog=false \ dc-version=2.0.3-4b1f869f0f \ cluster-infrastructure=corosync \ cluster-name=debian \ stonith-enabled=false \ no-quorum-policy=ignore \ startup-fencing=false \ maintenance-mode=false \ last-lrm-refresh=1622628525 \ start-failure-is-fatal=true These services should always be started - Apache2 - PHP-FPM - Redis If one of these services is not running, the node is unhelthy. The **ServiceIP** and **lsyncd** should switch to an healthy node. When I killed the apache2 process, the IP is not switched.
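Reading the configuration above: the only link between the service clones and the floating pieces is the score-100 colocation of cl_WS with ActiveNode, so a failed Apache2 never forces ServiceIP away. A hedged sketch of reversing that dependency, binding the ActiveNode group to a healthy clone instance with a mandatory score (crm shell syntax as used in the question; combine it with a migration-threshold on the cloned services so a repeatedly failing service is banned from the node):

crm configure delete col_cl_WS_ActiveNode
crm configure colocation col_ActiveNode_with_WS inf: ActiveNode cl_WS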
FaxMax (726 rep)
Jun 2, 2021, 12:29 PM
1 votes
0 answers
143 views
Stop a pacemaker node when local shell script returns an error?
Is it possible to make Pacemaker stop a node when a local test script fails, and start the node again when the local test script returns true? This seems like a very simple problem, but as I can't find ANY way to do this within Pacemaker, I'm about to run the following shell script on all my nodes:

while true; do
    pcs status 2>/dev/null >/dev/null && node_running=true
    /is_node_healthy.sh && node_healthy=true
    [[ -v node_running ]] && ! [[ -v node_healthy ]] && pcs cluster stop
    [[ -v node_healthy ]] && ! [[ -v node_running ]] && pcs cluster start
    unset node_running node_healthy
    sleep 10
done

This does exactly what I want, but it looks like a very dirty hack in my eyes. Is there a more elegant way to get the same thing done by Pacemaker itself?

BTW: The overall task I want to solve seems quite simple: create an HA cluster that has a public IP address assigned to a vital host, where vitality can be checked with /is_node_healthy.sh
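Pacemaker has a built-in way to express "drain this node when a local check fails": node health attributes. Any node attribute whose name starts with #health contributes to node health, and the node-health-strategy property decides what happens when it goes red. A hedged sketch wiring the question's /is_node_healthy.sh into that mechanism (the attribute name is arbitrary):

pcs property set node-health-strategy=only-green   # move everything off a node that is not green

# run periodically on each node (cron or a systemd timer)
if /is_node_healthy.sh; then
    attrd_updater -n '#health-local' -U green
else
    attrd_updater -n '#health-local' -U red
fi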
psicolor (11 rep)
Feb 22, 2021, 11:54 AM
1 votes
1 answer
393 views
fence_virtualbox failed to reboot
I'm learning how to fence Pacemaker using fence_virtualbox from the [ClusterLabs] "Fence agent for VirtualBox" mailing-list post, but I can't get it working. When I try to run stonith_admin --reboot it fails. Currently, my setup is:

Node ID:     VM name:
orcllinux1   OL7
orcllinux2   OL7_2

I set it up using:

pcs stonith create fence_vbox fence_virtualbox pcmk_host_map="orcllinux1:OL7,orcllinux2:OL7_2" pcmk_host_list="orcllinux1,orcllinux2" pcmk_host_check=static_list ipaddr="192.168.57.1" login="root"

But stonith_admin --reboot results in this error:

[screenshot: stonith_admin error output]

I tried to use fence_virtualbox manually using:

fence_virtualbox -s 192.168.57.1 -p OL7 -o=reboot

and it succeeded. Is my stonith create syntax wrong? What's the right syntax if it's wrong?
Christophorus Reyhan (33 rep)
Jan 8, 2021, 11:16 AM • Last activity: Feb 16, 2021, 03:51 AM
2 votes
1 answer
5355 views
Pacemaker - Corosync - HA - Simple Custom Resource Testing - Status flapping - Started - Failed - Stopped - Started
I am testing with the OCF:Heartbeat:Dummy script; I want a very basic setup just to know it works and then build on that. The only information I could find was this blog post: https://raymii.org/s/tutorials/Corosync_Pacemaker_-_Execute_a_script_on_failover.html It has some typos but basically worked for me. The script currently just contains the following:

sudo nano /usr/local/bin/failover.sh && sudo chmod +x /usr/local/bin/failover.sh

#!/bin/sh
touch /tmp/testfailover.sh

Here is my setup:

cp /usr/lib/ocf/resource.d/heartbeat/Dummy /usr/lib/ocf/resource.d/heartbeat/FailOverScript
sudo nano /usr/lib/ocf/resource.d/heartbeat/FailOverScript

dummy_start() {
    dummy_monitor
    /usr/local/bin/failover.sh
    if [ $? = $OCF_SUCCESS ]; then
        return $OCF_SUCCESS
    fi
    touch ${OCF_RESKEY_state}
}

sed -i 's/Dummy/FailOverScript/g' /usr/lib/ocf/resource.d/heartbeat/FailOverScript
sed -i 's/dummy/FailOverScript/g' /usr/lib/ocf/resource.d/heartbeat/FailOverScript
pcs resource create FailOverScript ocf:heartbeat:FailOverScript op monitor interval="30"

The only testing I can really do:

[root@node2 ~]# /usr/lib/ocf/resource.d/heartbeat/FailOverScript start ; echo $?
DEBUG: default start : 0
0

ocf-tester doesn't seem to exist in the latest HA software suite, and I'm not really sure how to install it manually, but the script "half works". **The script doesn't need monitoring; it's supposed to be very basic, but it seems to be flapping and giving me the following error. Any ideas what to do?**

FailOverScript (ocf::heartbeat:FailOverScript): Started node2

Failed Actions:
* FailOverScript_monitor_30000 on node2 'not running' (7): call=24423, status=complete, exitreason='none', last-rc-change='Tue Aug 16 15:53:50 2016', queued=0ms, exec=9ms

**Example of what I want to do:**

- Cluster starts: script runs "start.sh"
- Cluster fails over to node2: on node1 the script runs "fail.sh", on node2 it runs "start.sh"
- and vice versa if it fails in the other direction.

Note: The script does work; I get /tmp/testfailover.sh. I even tried putting another script under dummy_stop to remove the file and that worked, but it just keeps flapping along, removing/adding the file and starting/failing/stopping/starting etc. Thanks for reading!
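On many distributions ocf-tester is still packaged in resource-agents (typically /usr/sbin/ocf-tester), and driving the agent by hand with the OCF environment set is the other common way to see exactly what the cluster sees. A hedged sketch against the agent from the question:

ocf-tester -n FailOverScript /usr/lib/ocf/resource.d/heartbeat/FailOverScript

# or run a single action the way the cluster would
OCF_ROOT=/usr/lib/ocf OCF_RESOURCE_INSTANCE=FailOverScript \
    /usr/lib/ocf/resource.d/heartbeat/FailOverScript monitor; echo $?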
FreeSoftwareServers (2682 rep)
Aug 16, 2016, 07:56 PM • Last activity: Dec 21, 2020, 06:56 AM
0 votes
1 answer
1741 views
Pacemaker Apache resource fails with "Failed to access httpd status page" after change to HTTPS
I get this error from pacemaker after i change apache from http to https. now my ocf::heartbeat:apache resource is not find status page. I generate SSL certificate separately for 3 servers. Everything was working fine when running on http but as soon as I added the (self-signed) SSL certificate pacemaker Apache (ocf::heartbeat:apache): Stopped And error shows Failed Actions: * Apache_start_0 on server3 'unknown error' (1): call=315, status=complete, exitreason='Failed to access httpd status page.', last-rc-change='Mon Sep 21 16:22:37 2020', queued=0ms, exec=3456ms * Apache_start_0 on server1 'unknown error' (1): call=59, status=complete, exitreason='Failed to access httpd status page.', last-rc-change='Mon Sep 21 16:22:41 2020', queued=0ms, exec=3421ms * Apache_start_0 on server2 'unknown error' (1): call=197, status=complete, exitreason='Failed to access httpd status page.', last-rc-change='Mon Sep 21 16:22:33 2020', queued=0ms, exec=3451ms /etc/apache2/sites-available/000-default.conf ServerAdmin webmaster@localhost DocumentRoot /var/www/html Redirect "/" "https://10.226.***.***/ " SetHandler server-status ServerAdmin webmaster@localhost DocumentRoot /var/www/html Redirect "/" "https://10.226.179.205/ " Order deny,allow Deny from all Allow from 127.0.0.1 *pcs resource debug-monitor --full Apache* Operation monitor for Apache (ocf:heartbeat:apache) returned 1 > stderr: + echo > stderr: + printenv > stderr: + sort > stderr: + env= > stderr: AONIX_LM_DIR=/home/TeleUSE/etc > stderr: BXwidgets=/home/BXwidgets > stderr: HA_logfacility=none > stderr: HOME=/root > stderr: LC_ALL=C > stderr: LOGNAME=root > stderr: MAIL=/var/mail/root > stderr: OCF_EXIT_REASON_PREFIX=ocf-exit-reason: > stderr: OCF_RA_VERSION_MAJOR=1 > stderr: OCF_RA_VERSION_MINOR=0 > stderr: OCF_RESKEY_CRM_meta_class=ocf > stderr: OCF_RESKEY_CRM_meta_id=Apache > stderr: OCF_RESKEY_CRM_meta_migration_threshold=5 > stderr: OCF_RESKEY_CRM_meta_provider=heartbeat > stderr: OCF_RESKEY_CRM_meta_resource_stickiness=10 > stderr: OCF_RESKEY_CRM_meta_type=apache > stderr: OCF_RESKEY_configfile=/etc/apache2/apache2.conf > stderr: OCF_RESKEY_statusurl=http://localhost/server-status > stderr: OCF_RESOURCE_INSTANCE=Apache > stderr: OCF_RESOURCE_PROVIDER=heartbeat > stderr: OCF_RESOURCE_TYPE=apache > stderr: OCF_ROOT=/usr/lib/ocf > stderr: OCF_TRACE_RA=1 > stderr: PATH=/root/.rbenv/shims:/root/.rbenv/bin:/root/.rbenv/shims:/root/.rbenv/bin:/usr/local/bin:/home/TeleUSE/bin:/home/xrt/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/ucb > stderr: PCMK_logfacility=none > stderr: PCMK_service=crm_resource > stderr: PWD=/root > stderr: RBENV_SHELL=bash > stderr: SHELL=/bin/bash > stderr: SHLVL=1 > stderr: SSH_CLIENT=10.12.116.46 63097 22 > stderr: SSH_CONNECTION=10.12.116.46 63097 10.226.179.205 22 > stderr: SSH_TTY=/dev/pts/0 > stderr: TERM=xterm > stderr: TeleUSE=/home/TeleUSE > stderr: USER=root > stderr: _=/usr/sbin/pcs > stderr: __OCF_TRC_DEST= > stderr: __OCF_TRC_MANAGE= > stderr: + ocf_is_true > stderr: + false > stderr: + . /usr/lib/ocf/lib/heartbeat/apache-conf.sh > stderr: + . 
/usr/lib/ocf/lib/heartbeat/http-mon.sh > stderr: + bind_address=127.0.0.1 > stderr: + curl_ipv6_opts= > stderr: + ocf_is_true > stderr: + false > stderr: + echo > stderr: + grep -qs :: > stderr: + WGETOPTS=-O- -q -L --no-proxy --bind-address=127.0.0.1 > stderr: + CURLOPTS=-o - -Ss -L --interface lo > stderr: + HA_VARRUNDIR=/var/run > stderr: + IBMHTTPD=/opt/IBMHTTPServer/bin/httpd > stderr: + HTTPDLIST=/sbin/httpd2 /usr/sbin/httpd2 /usr/sbin/apache2 /sbin/httpd /usr/sbin/httpd /usr/sbin/apache /opt/IBMHTTPServer/bin/httpd > stderr: + MPM=/usr/share/apache2/find_mpm > stderr: + [ -x /usr/share/apache2/find_mpm ] > stderr: + LOCALHOST=http://localhost > stderr: + HTTPDOPTS=-DSTATUS > stderr: + DEFAULT_IBMCONFIG=/opt/IBMHTTPServer/conf/httpd.conf > stderr: + DEFAULT_SUSECONFIG=/etc/apache2/httpd.conf > stderr: + DEFAULT_RHELCONFIG=/etc/httpd/conf/httpd.conf > stderr: + DEFAULT_DEBIANCONFIG=/etc/apache2/apache2.conf > stderr: + basename /usr/lib/ocf/resource.d/heartbeat/apache > stderr: + CMD=apache > stderr: + OCF_REQUIRED_PARAMS= > stderr: + OCF_REQUIRED_BINARIES= > stderr: + ocf_rarun monitor > stderr: + mk_action_func > stderr: + echo apache_monitor > stderr: + tr - _ > stderr: + ACTION_FUNC=apache_monitor > stderr: + validate_args > stderr: + is_function apache_monitor > stderr: + command -v apache_monitor > stderr: + test zapache_monitor = zapache_monitor > stderr: + simple_actions > stderr: + check_required_params > stderr: + local v > stderr: + run_function apache_getconfig > stderr: + is_function apache_getconfig > stderr: + command -v apache_getconfig > stderr: + test zapache_getconfig = zapache_getconfig > stderr: + apache_getconfig > stderr: + HTTPD= > stderr: + PORT= > stderr: + STATUSURL=http://localhost/server-status > stderr: + CONFIGFILE=/etc/apache2/apache2.conf > stderr: + OPTIONS= > stderr: + CLIENT= > stderr: + TESTREGEX= > stderr: + TESTURL= > stderr: + TESTREGEX10= > stderr: + TESTCONFFILE= > stderr: + TESTNAME= > stderr: + : /etc/apache2/envvars > stderr: + source_envfiles /etc/apache2/envvars > stderr: + [ -f /etc/apache2/envvars -a -r /etc/apache2/envvars ] > stderr: + . /etc/apache2/envvars > stderr: + unset HOME > stderr: + [ != ] > stderr: + SUFFIX= > stderr: + export APACHE_RUN_USER=www-data > stderr: + export APACHE_RUN_GROUP=www-data > stderr: + export APACHE_PID_FILE=/var/run/apache2/apache2.pid > stderr: + export APACHE_RUN_DIR=/var/run/apache2 > stderr: + export APACHE_LOCK_DIR=/var/lock/apache2 > stderr: + export APACHE_LOG_DIR=/var/log/apache2 > stderr: + export LANG=C > stderr: + export LANG > stderr: + [ X = X -o ! -f -o ! -x ] > stderr: + find_httpd_prog > stderr: + HTTPD= > stderr: + [ -f /sbin/httpd2 -a -x /sbin/httpd2 ] > stderr: + [ -f /usr/sbin/httpd2 -a -x /usr/sbin/httpd2 ] > stderr: + [ -f /usr/sbin/apache2 -a -x /usr/sbin/apache2 ] > stderr: + HTTPD=/usr/sbin/apache2 > stderr: + break > stderr: + [ X != X -a X/usr/sbin/apache2 != X ] > stderr: + detect_default_config > stderr: + [ -f /etc/apache2/httpd.conf ] > stderr: + [ -f /etc/apache2/apache2.conf ] > stderr: + echo /etc/apache2/apache2.conf > stderr: + DefaultConfig=/etc/apache2/apache2.conf > stderr: + CONFIGFILE=/etc/apache2/apache2.conf > stderr: + [ -n /usr/sbin/apache2 ] > stderr: + basename /usr/sbin/apache2 > stderr: + httpd_basename=apache2 > stderr: + GetParams /etc/apache2/apache2.conf > stderr: + ConfigFile=/etc/apache2/apache2.conf > stderr: + [ ! 
-f /etc/apache2/apache2.conf ] > stderr: + get_apache_params /etc/apache2/apache2.conf ServerRoot PidFile Port Listen > stderr: + configfile=/etc/apache2/apache2.conf > stderr: + shift 1 > stderr: + echo ServerRoot PidFile Port Listen > stderr: + sed s/ /,/g > stderr: + vars=ServerRoot,PidFile,Port,Listen > stderr: + apachecat /etc/apache2/apache2.conf > stderr: + awk -v vars=ServerRoot,PidFile,Port,Listen > stderr: BEGIN{ > stderr: split(vars,v,","); > stderr: for( i in v ) > stderr: vl[i]=tolower(v[i]); > stderr: } > stderr: { > stderr: for( i in v ) > stderr: if( tolower($1)==vl[i] ) { > stderr: print v[i]"="$2 > stderr: delete vl[i] > stderr: break > stderr: } > stderr: } > stderr: > stderr: + awk > stderr: function procline() { > stderr: split($0,a); > stderr: if( a~/^[Ii]nclude$/ ) { > stderr: includedir=a; > stderr: gsub("\"","",includedir); > stderr: procinclude(includedir); > stderr: } else { > stderr: if( a=="ServerRoot" ) { > stderr: rootdir=a; > stderr: gsub("\"","",rootdir); > stderr: } > stderr: print; > stderr: } > stderr: } > stderr: function printfile(infile, a) { > stderr: while( (getline 0 ) { > stderr: procline(); > stderr: } > stderr: close(infile); > stderr: } > stderr: function allfiles(dir, cmd,f) { > stderr: cmd="find -L "dir" -type f"; > stderr: while( ( cmd | getline f ) > 0 ) { > stderr: printfile(f); > stderr: } > stderr: close(cmd); > stderr: } > stderr: function listfiles(pattern, cmd,f) { > stderr: cmd="ls "pattern" 2>/dev/null"; > stderr: while( ( cmd | getline f ) > 0 ) { > stderr: printfile(f); > stderr: } > stderr: close(cmd); > stderr: } > stderr: function procinclude(spec) { > stderr: if( rootdir!="" && spec!~/^\// ) { > stderr: spec=rootdir"/"spec; > stderr: } > stderr: if( isdir(spec) ) { > stderr: allfiles(spec); # read all files in a directory (and subdirs) > stderr: } else { > stderr: listfiles(spec); # there could be jokers > stderr: } > stderr: } > stderr: function isdir(s) { > stderr: return !system("test -d \""s"\""); > stderr: } > stderr: { procline(); } > stderr: /etc/apache2/apache2.conf > stderr: + sed s/#.*//;s/[[:blank:]]*$//;s/^[[:blank:]]*// > stderr: + grep -v ^$ > stderr: + eval PidFile=${APACHE_PID_FILE} > stderr: + PidFile=/var/run/apache2/apache2.pid > stderr: + CheckPort > stderr: + ocf_is_decimal > stderr: + false > stderr: + CheckPort > stderr: + ocfError performing operation: Operation not permitted _is_decimal > stderr: + false > stderr: + CheckPort 80 > stderr: + ocf_is_decimal 80 > stderr: + true > stderr: + [ 80 -gt 0 ] > stderr: + PORT=80 > stderr: + break > stderr: + echo > stderr: + grep : > stderr: + Listen=localhost: > stderr: + [ Xhttp://localhost/server-status = X ] > stderr: + test /var/run/apache2/apache2.pid > stderr: + return 0 > stderr: + validate_env > stderr: + check_required_binaries > stderr: + local v > stderr: + is_function apache_validate_all > stderr: + command -v apache_validate_all > stderr: + test zapache_validate_all = zapache_validate_all > stderr: + local rc > stderr: + LSB_STATUS_STOPPED=3 > stderr: + apache_validate_all > stderr: + [ -z /usr/sbin/apache2 ] > stderr: + [ ! -x /usr/sbin/apache2 ] > stderr: + [ ! 
-f /etc/apache2/apache2.conf ] > stderr: + [ -n ] > stderr: + [ -n ] > stderr: + dirname /var/run/apache2/apache2.pid > stderr: + local a > stderr: + local b > stderr: + [ 1 = 1 ] > stderr: + a=/var/run/apache2/apache2.pid > stderr: + [ 1 ] > stderr: + b=/var/run/apache2/apache2.pid > stderr: + [ /var/run/apache2/apache2.pid = /var/run/apache2/apache2.pid ] > stderr: + break > stderr: + b=/var/run/apache2 > stderr: + [ -z /var/run/apache2 -o /var/run/apache2/apache2.pid = /var/run/apache2 ] > stderr: + echo /var/run/apache2 > stderr: + return 0 > stderr: + ocf_mkstatedir root 755 /var/run/apache2 > stderr: + local owner > stderr: + local perms > stderr: + local path > stderr: + owner=root > stderr: + perms=755 > stderr: + path=/var/run/apache2 > stderr: + test -d /var/run/apache2 > stderr: + return 0 > stderr: + return 0 > stderr: + rc=0 > stderr: + [ 0 -ne 0 ] > stderr: + ocf_is_probe > stderr: + [ monitor = monitor -a 0 = 0 ] > stderr: + run_probe > stderr: + is_function apache_probe > stderr: + command -v apache_probe > stderr: + test z = zapache_probe > stderr: + shift 1 > stderr: + apache_monitor > stderr: + silent_status > stderr: + local pid > stderr: + get_pid > stderr: + [ -f /var/run/apache2/apache2.pid ] > stderr: + cat /var/run/apache2/apache2.pid > stderr: + pid=17552 > stderr: + [ -n 17552 ] > stderr: + ProcessRunning 17552 > stderr: + local pid=17552 > stderr: + [ -d /proc -a -d /proc/1 ] > stderr: + [ -d /proc/17552 ] > stderr: + [ 0 -ne 0 ] > stderr: + findhttpclient > stderr: + [ x != x ] > stderr: + which wget > stderr: + echo wget > stderr: + ourhttpclient=wget > stderr: + [ -z wget ] > stderr: + ocf_check_level 10 > stderr: + local lvl prev > stderr: + lvl=0 > stderr: + prev=0 > stderr: + ocf_is_decimal 0 > stderr: + true > stderr: + [ 10 -eq 0 ] > stderr: + [ 10 -gt 0 ] > stderr: + lvl=0 > stderr: + break > stderr: + echo 0 > stderr: + apache_monitor_basic > stderr: + wget_func http://localhost/server-status > stderr: + auth= > stderr: + cl_opts=-O- -q -L --no-proxy --bind-address=127.0.0.1 > stderr: + [ x !=+ x ] > stderr: grep+ wget -Ei -O- -q > stderr: -L --no-proxy --bind-address=127.0.0.1 http://localhost/server-status > stderr: + attempt_index_monitor_request > stderr: + local indexpage= > stderr: + [ -n ] > stderr: + [ -n ] > stderr: + [ -n ] > stderr: + [ -n http://localhost/server-status ] > stderr: + return 1 > stderr: + [ 1 -eq 0 ] > stderr: + ocf_is_probe > stderr: + [ monitor = monitor -a 0 = 0 ] > stderr: + return 1 **pcs config** Resource: MasterVip (class=ocf provider=heartbeat type=IPaddr2) Attributes: ip=10.226.***.*** nic=lo cidr_netmask=32 iflabel=pgrepvip Meta Attrs: target-role=Started Operations: start interval=0s timeout=20s (MasterVip-start-interval-0s) stop interval=0s timeout=20s (MasterVip-stop-interval-0s) monitor interval=90s (MasterVip-monitor-interval-90s) Resource: Apache (class=ocf provider=heartbeat type=apache) Attributes: configfile=/etc/apache2/apache2.conf statusurl=http://localhost/server-status Operations: start interval=0s timeout=40s (Apache-start-interval-0s) stop interval=0s timeout=60s (Apache-stop-interval-0s) monitor interval=1min (Apache-monitor-interval-1min) I don't know how to fix this. if anyone knows please help me.
Karippery (1 rep)
Sep 21, 2020, 03:04 PM • Last activity: Sep 22, 2020, 11:36 AM