Failover using pcs makes the DRBD disk unavailable on the source server ("No such resource")

1 vote
0 answers
480 views
I am using DRBD and pcs to run a two-node cluster. With the configuration below, the virtual IP and the DRBD disk work fine on the first node. When I test failover by running "pcs cluster stop" on the master, the disk and the virtual IP migrate properly to the second node. However, on the first node the disk becomes unavailable:

    drbdadm status
    Error: cluster is not currently running on this node
    opt_disk: No such resource
    Command 'drbdsetup-84 status opt_disk' terminated with exit code 10

Configuration:

    Cluster Name: cluster_zmbx1
    Corosync Nodes:
     host_1 host_2
    Pacemaker Nodes:
     host_1 host_2

    Resources:
     Master: Z_Root
      Meta Attrs: master-node-max=1 clone-max=2 notify=true master-max=1 clone-node-max=1
      Resource: zroot (class=ocf provider=linbit type=drbd)
       Attributes: drbd_resource=opt_disk
       Operations: demote interval=0s timeout=90 (zroot-demote-interval-0s)
                   monitor interval=30s (zroot-monitor-interval-30s)
                   notify interval=0s timeout=90 (zroot-notify-interval-0s)
                   promote interval=0s timeout=90 (zroot-promote-interval-0s)
                   reload interval=0s timeout=30 (zroot-reload-interval-0s)
                   start interval=0s timeout=240 (zroot-start-interval-0s)
                   stop interval=0s timeout=100 (zroot-stop-interval-0s)
     Resource: z_fs (class=ocf provider=heartbeat type=Filesystem)
      Attributes: device=/dev/drbd0 directory=/opt/ fstype=ext4 options=noatime
      Operations: monitor interval=20s timeout=40s (z_fs-monitor-interval-20s)
                  notify interval=0s timeout=60s (z_fs-notify-interval-0s)
                  start interval=0s timeout=60s (z_fs-start-interval-0s)
                  stop interval=0s timeout=60s (z_fs-stop-interval-0s)
     Resource: MailIP (class=ocf provider=heartbeat type=IPaddr2)
      Attributes: cidr_netmask=20 ip=10.64.200.21 nic=eth0
      Operations: monitor interval=10s (MailIP-monitor-interval-10s)
                  start interval=0s timeout=20s (MailIP-start-interval-0s)
                  stop interval=0s timeout=20s (MailIP-stop-interval-0s)

    Stonith Devices:
    Fencing Levels:

    Location Constraints:
    Ordering Constraints:
      promote Z_Root then start z_fs (kind:Mandatory)
      start z_fs then start MailIP (kind:Mandatory)
    Colocation Constraints:
      z_fs with Z_Root (score:INFINITY) (with-rsc-role:Master)
      MailIP with z_fs (score:INFINITY)
    Ticket Constraints:

    Alerts:
     No alerts defined

    Resources Defaults:
     resource-stickiness: 200
    Operations Defaults:
     No defaults set

    Cluster Properties:
     cluster-infrastructure: corosync
     cluster-name: cluster_zmbx1
     dc-version: 1.1.19-8.el7_6.4-c3c624ea3d
     have-watchdog: false
     no-quorum-policy: ignore
     stonith-enabled: false

    Quorum:
      Options:
        auto_tie_breaker: 0
        last_man_standing: 1
        wait_for_all: 1

Logs on the source host when the failover happens:

    Jul 25 20:31:56 host_1 systemd: Stopping Pacemaker High Availability Cluster Manager...
    Jul 25 20:31:56 host_1 pacemakerd: notice: Caught 'Terminated' signal
    Jul 25 20:31:56 host_1 pacemakerd: notice: Shutting down Pacemaker
    Jul 25 20:31:56 host_1 pacemakerd: notice: Stopping crmd
    Jul 25 20:31:56 host_1 crmd: notice: Caught 'Terminated' signal
    Jul 25 20:31:56 host_1 crmd: notice: Shutting down cluster resource manager
    Jul 25 20:31:56 host_1 crmd: notice: State transition S_IDLE -> S_POLICY_ENGINE
    Jul 25 20:31:56 host_1 pengine: notice: On loss of CCM Quorum: Ignore
    Jul 25 20:31:56 host_1 pengine: notice: Scheduling Node host_1 for shutdown
    Jul 25 20:31:56 host_1 pengine: notice: * Shutdown host_1
    Jul 25 20:31:56 host_1 pengine: notice: * Promote zroot:0 ( Slave -> Master host_2 )
    Jul 25 20:31:56 host_1 pengine: notice: * Stop zroot:1 ( Master host_1 ) due to node availability
    Jul 25 20:31:56 host_1 pengine: notice: * Move z_fs ( host_1 -> host_2 )
    Jul 25 20:31:56 host_1 pengine: notice: * Move MailIP ( host_1 -> host_2 )
    Jul 25 20:31:56 host_1 pengine: notice: Calculated transition 4, saving inputs in /var/lib/pacemaker/pengine/pe-input-3930.bz2
    Jul 25 20:31:56 host_1 crmd: notice: Initiating cancel operation zroot_monitor_30000 on host_2
    Jul 25 20:31:56 host_1 crmd: notice: Initiating stop operation MailIP_stop_0 locally on host_1
    Jul 25 20:31:56 host_1 crmd: notice: Initiating notify operation zroot_pre_notify_demote_0 on host_2
    Jul 25 20:31:56 host_1 crmd: notice: Initiating notify operation zroot_pre_notify_demote_0 locally on host_1
    Jul 25 20:31:56 host_1 crmd: notice: Result of notify operation for zroot on host_1: 0 (ok)
    Jul 25 20:31:56 host_1 IPaddr2(MailIP): INFO: IP status = ok, IP_CIP=
    Jul 25 20:31:56 host_1 crmd: notice: Result of stop operation for MailIP on host_1: 0 (ok)
    Jul 25 20:31:56 host_1 crmd: notice: Initiating stop operation z_fs_stop_0 locally on host_1
    Jul 25 20:31:56 host_1 Filesystem(z_fs): INFO: Running stop for /dev/drbd0 on /opt
    Jul 25 20:31:56 host_1 Filesystem(z_fs): INFO: Trying to unmount /opt
    Jul 25 20:31:56 host_1 Filesystem(z_fs): INFO: unmounted /opt successfully
    Jul 25 20:31:56 host_1 crmd: notice: Result of stop operation for z_fs on host_1: 0 (ok)
    Jul 25 20:31:56 host_1 crmd: notice: Initiating demote operation zroot_demote_0 locally on host_1
    Jul 25 20:31:56 host_1 kernel: block drbd0: role( Primary -> Secondary )
    Jul 25 20:31:56 host_1 kernel: block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
    Jul 25 20:31:56 host_1 crmd: notice: Result of demote operation for zroot on host_1: 0 (ok)
    Jul 25 20:31:56 host_1 crmd: notice: Initiating notify operation zroot_post_notify_demote_0 on host_2
    Jul 25 20:31:56 host_1 crmd: notice: Initiating notify operation zroot_post_notify_demote_0 locally on host_1
    Jul 25 20:31:56 host_1 crmd: notice: Result of notify operation for zroot on host_1: 0 (ok)
    Jul 25 20:31:56 host_1 crmd: notice: Initiating notify operation zroot_pre_notify_stop_0 on host_2
    Jul 25 20:31:56 host_1 crmd: notice: Initiating notify operation zroot_pre_notify_stop_0 locally on host_1
    Jul 25 20:31:56 host_1 crmd: notice: Result of notify operation for zroot on host_1: 0 (ok)
    Jul 25 20:31:56 host_1 crmd: notice: Initiating stop operation zroot_stop_0 locally on host_1
    Jul 25 20:31:56 host_1 kernel: drbd opt_disk: peer( Secondary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )
    Jul 25 20:31:56 host_1 kernel: drbd opt_disk: ack_receiver terminated
    Jul 25 20:31:56 host_1 kernel: drbd opt_disk: Terminating drbd_a_opt_disk
    Jul 25 20:31:56 host_1 kernel: drbd opt_disk: Connection closed
    Jul 25 20:31:56 host_1 kernel: drbd opt_disk: conn( Disconnecting -> StandAlone )
    Jul 25 20:31:56 host_1 kernel: drbd opt_disk: receiver terminated
    Jul 25 20:31:56 host_1 kernel: drbd opt_disk: Terminating drbd_r_opt_disk
    Jul 25 20:31:56 host_1 kernel: block drbd0: disk( UpToDate -> Failed )
    Jul 25 20:31:56 host_1 kernel: block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
    Jul 25 20:31:56 host_1 kernel: block drbd0: disk( Failed -> Diskless )
    Jul 25 20:31:56 host_1 kernel: drbd opt_disk: Terminating drbd_w_opt_disk
    Jul 25 20:31:56 host_1 crmd: notice: Transition aborted by deletion of nvpair[@id='status-1-master-zroot']: Transient attribute change
    Jul 25 20:31:56 host_1 crmd: notice: Result of stop operation for zroot on host_1: 0 (ok)
    Jul 25 20:31:56 host_1 crmd: notice: Initiating notify operation zroot_post_notify_stop_0 on host_2
    Jul 25 20:31:56 host_1 crmd: notice: Transition 4 (Complete=25, Pending=0, Fired=0, Skipped=2, Incomplete=13, Source=/var/lib/pacemaker/pengine/pe-input-3930.bz2): Stopped
    Jul 25 20:31:56 host_1 pengine: notice: On loss of CCM Quorum: Ignore
    Jul 25 20:31:56 host_1 pengine: notice: Scheduling Node host_1 for shutdown
    Jul 25 20:31:56 host_1 pengine: notice: * Shutdown host_1
    Jul 25 20:31:56 host_1 pengine: notice: * Promote zroot:0 ( Slave -> Master host_2 )
    Jul 25 20:31:56 host_1 pengine: notice: * Start z_fs ( host_2 )
    Jul 25 20:31:56 host_1 pengine: notice: * Start MailIP ( host_2 )
    Jul 25 20:31:56 host_1 pengine: notice: Calculated transition 5, saving inputs in /var/lib/pacemaker/pengine/pe-input-3931.bz2
    Jul 25 20:31:56 host_1 crmd: notice: Initiating notify operation zroot_pre_notify_promote_0 on host_2
    Jul 25 20:31:57 host_1 crmd: notice: Initiating promote operation zroot_promote_0 on host_2
    Jul 25 20:31:57 host_1 crmd: notice: Initiating notify operation zroot_post_notify_promote_0 on host_2
    Jul 25 20:31:57 host_1 crmd: notice: Transition aborted by status-2-master-zroot doing modify master-zroot=10000: Transient attribute change
    Jul 25 20:31:57 host_1 crmd: notice: Transition 5 (Complete=10, Pending=0, Fired=0, Skipped=1, Incomplete=4, Source=/var/lib/pacemaker/pengine/pe-input-3931.bz2): Stopped
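For readers skimming the log above, the order in which resource operations completed on host_1 can be pulled out of the crmd "Result of ..." entries. The following is only a minimal sketch in Python, not part of the original question: the function name `completed_operations`, the `skip_ops` parameter and the abbreviated `SAMPLE` text are illustrative assumptions, and the pattern is written against the log format shown in this post rather than any documented Pacemaker log specification.

```python
import re

# Matches crmd result entries as they appear in the excerpt, e.g.
# "Result of stop operation for z_fs on host_1: 0 (ok)".
# Assumption: the format is inferred from this post's log only.
RESULT_RE = re.compile(
    r"Result of (?P<op>\w+) operation for (?P<rsc>\S+) on (?P<node>\S+): "
    r"(?P<rc>\d+) \((?P<status>[^)]*)\)"
)


def completed_operations(log_text, skip_ops=("notify",)):
    """Return (operation, resource, node, rc) tuples in log order.

    Operations named in skip_ops (notify by default) are left out so
    that only stop/demote/promote-style results remain.
    """
    ops = []
    for match in RESULT_RE.finditer(log_text):
        if match.group("op") in skip_ops:
            continue
        ops.append(
            (
                match.group("op"),
                match.group("rsc"),
                match.group("node"),
                int(match.group("rc")),
            )
        )
    return ops


# Abbreviated sample: a handful of entries copied from the log in the question.
SAMPLE = """\
Jul 25 20:31:56 host_1 crmd: notice: Result of notify operation for zroot on host_1: 0 (ok)
Jul 25 20:31:56 host_1 crmd: notice: Result of stop operation for MailIP on host_1: 0 (ok)
Jul 25 20:31:56 host_1 crmd: notice: Result of stop operation for z_fs on host_1: 0 (ok)
Jul 25 20:31:56 host_1 crmd: notice: Result of demote operation for zroot on host_1: 0 (ok)
Jul 25 20:31:56 host_1 crmd: notice: Result of stop operation for zroot on host_1: 0 (ok)
"""

if __name__ == "__main__":
    for op, rsc, node, rc in completed_operations(SAMPLE):
        print(f"{op:<7} {rsc:<7} {node} rc={rc}")
```

Run against the full log, this lists the MailIP stop, the z_fs stop, the zroot demote and finally the zroot stop on host_1, all with return code 0. The zroot stop is the step during which the kernel lines report drbd0 going to Diskless.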
Asked by irfan (11 rep)
Jul 25, 2019, 07:10 PM