Unix & Linux Stack Exchange
Q&A for users of Linux, FreeBSD and other Unix-like operating systems
Latest Questions
2
votes
1
answers
2579
views
After failover Pacemaker moves resource back when node comes back
I'm using Pacemaker & Corosync for my cluster.
When a node dies, Pacemaker moves my resources to another online node. Everything is fine so far.
But when the dead node comes back, Pacemaker moves the resources back.
I don't have any "location" line in my config, and I also tried the "unmove" command, but nothing changed.
I must have gone wrong somewhere and need to find the reason.
**crm configure sh**
node 1: DEV1
node 2: DEV2
primitive poolip IPaddr2 \
params ip=10.1.60.33 nic=enp2s0f0 cidr_netmask=24 \
meta migration-threshold=2 target-role=Started \
op monitor interval=20 timeout=20 on-fail=restart
primitive gui systemd:gui \
op monitor interval=20s \
meta target-role=Started
primitive gui-ip IPaddr2 \
params ip=10.1.60.35 nic=enp2s0f0 cidr_netmask=24 \
meta migration-threshold=2 target-role=Started \
op monitor interval=20 timeout=20 on-fail=restart
colocation cluster-gui inf: gui gui-ip
order gui-after-ip Mandatory: gui-ip gui
property cib-bootstrap-options: \
have-watchdog=false \
dc-version=2.0.0-1-8cf3fe749e \
cluster-infrastructure=corosync \
cluster-name=mycluster \
stonith-enabled=false \
no-quorum-policy=ignore \
last-lrm-refresh=1545920437
rsc_defaults rsc-options: \
migration-threshold=10 \
resource-stickiness=100
**pcs resource defaults**
migration-threshold=10
resource-stickiness=100
**pcs resource show gui**
Resource: gui (class=systemd type=gui)
Meta Attrs: target-role=Started
Operations: monitor interval=20s (gui-monitor-20s)
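A minimal diagnostic sketch, assuming crmsh and a pcs version matching this Pacemaker 2.0 setup (exact pcs syntax differs between releases; newer pcs uses "pcs resource defaults update"): crm_simulate shows the per-node allocation scores that decide placement, and temporarily raising the stickiness default is a quick way to test whether stickiness is being honoured at all.
# show per-node allocation scores for every resource against the live CIB
crm_simulate -sL
# temporarily make the current node overwhelmingly preferred, to rule stickiness out
pcs resource defaults resource-stickiness=INFINITY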
Ozbit
(439 rep)
Jan 2, 2019, 08:58 AM
• Last activity: Jun 14, 2025, 09:07 PM
1
votes
0
answers
16
views
Ocfs2: link between cluster and device?
I have 2 servers (Debian 12) that use a shared storage disk (SD).
Both see this SD as a device via fdisk. I have no details about the storage device itself or the connection type - for me it is just a disk that is connected to 2 servers.
To use the disk simultaneously I need a cluster filesystem, which is not provided by the storage device itself.
I am running Ocfs2 on both servers to get that.
cluster:
name = myshare
heartbeat_mode = local
node_count = 2
node:
cluster = myshare
number = 1
ip_port = 7777
ip_address = 9.9.9.101
name = S1
node:
cluster = myshare
number = 2
ip_port = 7777
ip_address = 9.9.9.102
name = S2
Both servers have this fstab.
UUID=xxxx /myshare ocfs2 defaults 0 3
**It is working great - I can mount and use it on both servers.**
Now I try to get a 2nd SD up and running for both servers. It appears as a different device with fdisk.
So I am not sure how to do it, because I do not see a "logical connection" between the defined OCFS2 cluster and the storage device - beside the fact that it is mounted with type ocfs2 on a server that is running an OCFS2 node. Is that enough?
Is it enough to add another line in fstab with the other UUID and type ocfs2 to make it work (so that both SDs are handled by the OCFS2 cluster)?
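As far as I know, the logical connection is the cluster name stamped into each filesystem at format time; the O2CB stack does not otherwise track devices, so any volume formatted for cluster myshare can be mounted once o2cb is up on both nodes. A minimal sketch, assuming ocfs2-tools 1.8+ (which provides the --cluster-name/--cluster-stack options) and /dev/sdc as a placeholder for the second disk:
mkfs.ocfs2 -L share2 -N 2 --cluster-stack=o2cb --cluster-name=myshare /dev/sdc
# then, on both servers, a second fstab line pointing at the new filesystem's UUID:
# UUID=yyyy  /myshare2  ocfs2  _netdev,defaults  0  0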
chris01
(869 rep)
Jun 5, 2025, 03:28 AM
0
votes
2
answers
3786
views
How to Remove caavg_private Properly on AIX?
I am trying to clean up a server which had a PowerHA configuration. I have stopped the cluster (smitty clstop) and removed the resource groups. How do I remove caavg_private properly?
hdisk5 00cc90476e2a44dd caavg_private active
# lsvg -l caavg_private
caavg_private:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
caalv_private1 boot 1 1 1 closed/syncd N/A
caalv_private2 boot 1 1 1 closed/syncd N/A
caalv_private3 boot 4 4 1 open/syncd N/A
powerha_crlv boot 1 1 1 closed/syncd N/A
# clstat -o
clstat - HACMP Cluster Status Monitor
-------------------------------------
Cluster: (1591186363)
Wed Apr 1 03:57:10 2020
State: UP Nodes: 2
SubState: STABLE
Node: Node01 State: UP
Interface: Node01 (0) Address: 10.x.x.x
State: UP
Node: Node02 State: UP
Interface: Node02 (0) Address: 10.x.x.x
State: UP
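A hedged sketch of the usual CAA cleanup, assuming the PowerHA topology/resource definitions are already removed; these are the standard CAA commands, but verify them against your AIX/PowerHA level before running anything:
lscluster -m                 # confirm the CAA cluster is still defined and which nodes it knows about
rmcluster -n <cluster_name>  # remove the CAA cluster, which releases the caavg_private repository disk
After that, the repository hdisk should show up as a plain physical volume again rather than caavg_private.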
RJ Gellangarin
(11 rep)
Apr 1, 2020, 07:56 AM
• Last activity: May 20, 2025, 03:00 AM
0
votes
1
answers
159
views
Does the size of a file system cluster have to be even bytes?
Basically, could we have a file system with an odd cluster size in bytes? Why is everything even? Thanks
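For a concrete illustration of why odd sizes never show up in practice, consider ext4: its superblock stores only the base-2 logarithm of the block size, so a non-power-of-two value cannot even be represented on disk, and mkfs rejects it up front (/dev/sdX1 below is just a placeholder device):
mkfs.ext4 -b 1021 /dev/sdX1    # rejected: the block size must be a power of two (1024/2048/4096 on x86)
mkfs.ext4 -b 4096 /dev/sdX1    # accepted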
pushandpop
(1446 rep)
Feb 4, 2022, 08:37 AM
• Last activity: Feb 21, 2025, 01:44 PM
0
votes
0
answers
33
views
Adding a New Server to Existing Proxmox Cluster - Network Configuration and VM Communication
I’m looking for some guidance on expanding my Proxmox setup. Here’s my current setup and what I’m trying to achieve:
Current Setup
- I have a dedicated OVH server running Proxmox.
- On this server, I have a pfSense VM that handles VPN access for
employees to connect to the internal network.
- The server has reached its capacity, and I need to add a new server
to scale my infrastructure.
What I Want to Achieve
- Add a New Server: I want to install Proxmox on a new server and join
it to the existing server to form a cluster.
- VM Communication: After joining the cluster, I want VMs on the new
server to be able to communicate with VMs on the old server.
- Employee VPN Access: Employees should be able to access VMs on the
new server via the existing pfSense VPN.
Questions
- **Cluster Setup:** Are there any specific considerations when joining the new server to the existing cluster to ensure seamless VM
communication?
- **pfSense and VPN:** Do I need to make any changes to pfSense (e.g., firewall rules, routing) to allow VPN access to VMs on the new
server?
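On the cluster-setup question, a minimal sketch of the standard Proxmox commands, assuming the existing node is not yet part of a cluster and the two servers can reach each other over a low-latency link for corosync (the cluster name and address below are placeholders); the VPN side is then purely a matter of pfSense firewall rules and routing on top of this:
# on the existing node
pvecm create mycluster
# on the new node, pointing at the existing node's address
pvecm add <ip-of-existing-node>
# afterwards, verify membership and quorum
pvecm status
Keep in mind that a two-node cluster loses quorum if either node goes down, so a QDevice or adjusted quorum settings are usually recommended.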
Zakaria Ait Yakoub
(11 rep)
Feb 18, 2025, 08:06 PM
0
votes
0
answers
23
views
Disable read-ahead caching for GFS2 Logical Volume
I have a 10-node deployment which uses the Red Hat clustering software (Pacemaker/Corosync) to mount GFS2 and ensure high availability. The nodes are actually mail servers and use GFS2 to store users' data in Maildir format. On each mail server I have the following GFS2 setup:
root@mail ~# lsblk | grep gfs2
└─vg_1-lv_1 253:4 0 2T 0 lvm /mnt/gfs2_1
└─vg_2-lv_2 253:16 0 2T 0 lvm /mnt/gfs2_2
└─vg_3-lv_3 253:7 0 2T 0 lvm /mnt/gfs2_3
└─vg_4-lv_4 253:15 0 2T 0 lvm /mnt/gfs2_4
└─vg_5-lv_5 253:13 0 2T 0 lvm /mnt/gfs2_5
└─vg_6-lv_6 253:12 0 2T 0 lvm /mnt/gfs2_6
└─vg_7-lv_7 253:10 0 2T 0 lvm /mnt/gfs2_7
└─vg_8-lv_8 253:9 0 2T 0 lvm /mnt/gfs2_8
└─vg_9-lv_9 253:6 0 2T 0 lvm /mnt/gfs2_9
└─vg_10-lv_10 253:11 0 2T 0 lvm /mnt/gfs2_10
Now, I am not very pleased with my performance, and my thinking is to remove the local read-ahead cache on my GFS2 devices. Yes, GFS2 uses a local cache to serve clients more quickly, but since this information is not synced across all nodes, and we are not pinning a single user to a single server, I am not sure this makes sense in terms of helping performance. Also, we are using DLM to force flushing of outdated cached data across nodes. With all this being said, I am still not sure if this is the right move, and I am looking for advice.
1) Is my thinking right: will this improve my filesystem performance, or quite the contrary?
2) Do you have any other advice that would improve my performance?
Thank you in advance.
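For testing, a minimal sketch of turning read-ahead off per logical volume (the VG/LV names are the ones from the lsblk output above); lvchange records the setting in the LVM metadata so it persists, while blockdev only changes the running value:
# persistent: store readahead=none in the LV metadata
lvchange --readahead none vg_1/lv_1
# runtime only: zero the read-ahead on the active device-mapper node
blockdev --setra 0 /dev/mapper/vg_1-lv_1
# verify
blockdev --getra /dev/mapper/vg_1-lv_1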
brchelli26
(1 rep)
Feb 17, 2025, 08:29 AM
• Last activity: Feb 17, 2025, 08:45 AM
1
votes
1
answers
1934
views
How can I submit multiple R job at once?
I have an R script which runs over multiple files, say file=1 to 50.
I usually submit repeated jobs, say 5 times with 10 files each time, by changing the number in the R script.
So, how can I submit the 5 jobs at once without submitting the job 5 times? In addition, I want to update the **default.out** and **errorfile** for each job.
sample bash code:
#!/bin/bash
#PBS -l nodes=1:ppn=20,walltime=05:00:00
#PBS -m e
#PBS -o default.out
#PBS -e errorfile
module load R/4.0
Rscript ~/r_script1.R
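One way to get the five submissions from a single script is a job array. A sketch assuming Torque-style PBS, where -t defines the array and $PBS_ARRAYID carries the index (PBS Pro instead uses -J 1-5 and $PBS_ARRAY_INDEX); the index is passed to R as an argument instead of being edited into the script:
#!/bin/bash
#PBS -l nodes=1:ppn=20,walltime=05:00:00
#PBS -m e
#PBS -t 1-5
module load R/4.0
Rscript ~/r_script1.R "$PBS_ARRAYID"
Each array element should get its own default output/error file with the array index appended, and inside r_script1.R the index can be read with commandArgs(trailingOnly = TRUE)[1] to pick files 1-10, 11-20, and so on.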
b_takhel
(21 rep)
May 22, 2021, 07:54 PM
• Last activity: Feb 13, 2025, 02:08 PM
0
votes
0
answers
22
views
How to set new features to N during kernel compilation from an old .config file?
I am compiling a custom Linux kernel for a compute cluster. The cluster has been running kernel version 4.4.47 for the last 5 years. I need to upgrade the kernel to a more recent version. I've chosen version 6.6.76 since it has long-term support.
Now here is what I've tried: I have the old configuration file, so I copy it into the root of the kernel source tree as .config and then run make olddefconfig. This takes all the existing configuration and sets the default values for the newer options. However, this enables several unwanted features that I find hard to disable manually.
Is there a better way to do this, such that the configuration takes all the old settings and sets the newer settings to n?
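A sketch of the usual two-step approach (both make targets exist in 6.6; the old-config path and SOMETHING_UNWANTED below are placeholders). Note that multi-value choice prompts will still fall back to their defaults rather than to n, so a quick review of the result is worth doing:
cp /path/to/old.config .config
make listnewconfig > new_options.txt   # list every option the old .config doesn't know about
yes n | make oldconfig                 # answer "n" wherever a new option is prompted
scripts/config --disable SOMETHING_UNWANTED   # force off anything that slipped through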
Sâu
(101 rep)
Feb 10, 2025, 12:37 PM
1
votes
1
answers
323
views
Need a method for managing systemd services across multiple hosts
I have six Linux servers running RHEL 8.6 and need to ensure that one specific service is running on at least one and at most one of those six servers.
Does systemd support something like this?
If not, is there an extension to systemd or another mechanism that ensures exactly one instance is running across a pool of servers?
The question here - https://unix.stackexchange.com/questions/224022/systemd-support-for-toggling-mutually-exclusive-services - starts to get at the question, but it still seems limited to services running on a single host.
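systemd by itself has no cross-host view, so the usual answer on RHEL is the HA add-on (Pacemaker/Corosync), where a systemd unit becomes a cluster resource that the cluster keeps running on exactly one node and restarts elsewhere on failure. A rough sketch, assuming a cluster is already set up with pcs and the unit is called myservice.service (a placeholder name):
pcs resource create myservice systemd:myservice op monitor interval=30s
A plain (non-cloned) Pacemaker resource is never run on more than one node at a time, which gives the at-most-one guarantee, and the monitor/failover logic covers the at-least-one side as long as a node is available.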
The Programmer
(13 rep)
Sep 18, 2024, 05:34 PM
• Last activity: Sep 19, 2024, 01:16 AM
28
votes
5
answers
53147
views
SLURM: Custom standard output name
When running a SLURM job using sbatch, Slurm produces a standard output file which looks like slurm-102432.out (slurm-jobid.out). I would like to customise this to yyyymmddhhmmss-jobid-jobname.txt. How do I go about doing this?
Or more generally, how do I include computed variables in the sbatch argument -o?
I have tried the following in my script.sh
#SBATCH -p core
#SBATCH -n 6
#SBATCH -t 1:00:00
#SBATCH -J indexing
#SBATCH -o "/home/user/slurm/$(date +%Y%m%d%H%M%S)-$(SLURM_JOB_ID)-indexing.txt"
but that did not work. The location of the file was correct in the new directory, but the filename was just the literal string $(date +%Y%m%d%H%M%S)-$(SLURM_JOB_ID)-indexing.txt.
So, I am looking for a way to save the standard output file in the directory /home/user/slurm/ with a filename like 20160526093322-10453-indexing.txt
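A sketch of one approach: #SBATCH lines are not run through the shell, but -o accepts Slurm's own filename patterns such as %j (job id) and %x (job name, available in reasonably recent Slurm releases), and the timestamp can be expanded by the submitting shell if -o is passed on the command line instead of inside the script:
sbatch -o "/home/user/slurm/$(date +%Y%m%d%H%M%S)-%j-%x.txt" script.sh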
mindlessgreen
(1349 rep)
May 26, 2016, 02:40 PM
• Last activity: Aug 5, 2024, 09:14 AM
1
votes
0
answers
15
views
Script/Daemon to kill specific resource-consuming tools?
I'm working on an SGE Linux cluster, and beginners often run memory/resource-consuming tools on the login node instead of using qsub or qlogin (https://gridscheduler.sourceforge.net/htmlman/htmlman1/qsub.html).
Is there any tool or method to kill specific programs (like bwa) if they have been running on the login node for more than, say, 5 minutes?
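Short of PAM or cgroup limits on the login node, a crude sketch that could run from root's crontab every minute is a ps sweep that kills named offenders older than five minutes (bwa is just the example program from the question; extend the awk test with whatever command names need policing):
ps -eo pid=,etimes=,comm= | awk '$3 == "bwa" && $2 > 300 { print $1 }' | xargs -r kill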
Pierre
(1803 rep)
Mar 28, 2024, 10:40 AM
2
votes
1
answers
2747
views
PVS output not showing drive
I had created physical volumes earlier, i.e. /backup, /ndasdb, /processor, /latro.
I don't know what exactly happened, but /ndasdb is not showing in any of these outputs:
pvdisplay, vgdisplay, and lvdisplay. The other 3 are showing.
I can see the LUN in multipath -ll (ndasdb is there).
I tried to recreate it with pvcreate /dev/mapper/ndasdb, but it gives me this error:
WARNING: gfs2 signature detected on /dev/mapper/ndasdb at offset 65536. Wipe it? [y/n]: n
Aborted wiping of gfs2.
1 existing signature left on the device.
Is it true that if I answer yes it will delete all my data?
Is there any way to bring my drive back without losing any data?
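Before answering y to anything, it may be worth confirming what is actually on the device; both commands below are read-only (wipefs -n is its no-act mode):
blkid /dev/mapper/ndasdb      # shows the filesystem type/UUID the device currently carries
wipefs -n /dev/mapper/ndasdb  # lists every on-disk signature without erasing anything
If these report a gfs2 signature and no LVM2_member signature, the device most likely holds a GFS2 filesystem directly rather than an LVM physical volume, and answering y to pvcreate would overwrite that filesystem's superblock.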

Himanshu Dua
(21 rep)
Dec 31, 2018, 08:38 PM
• Last activity: Mar 22, 2024, 11:01 PM
1
votes
1
answers
119
views
qsub-like behavior for a slurm cluster
I recently switched to Slurm and am looking for a job submission tool that behaves similarly to qsub:
1. It takes input through a pipe
2. It prints the output to stdout
Example:
for n in `seq 1 10`; do
    echo "echo $n" | qsub
done
should send each echo command to a cluster and the output should be 1..10 presumably in random order.
So far I can:
1. send jobs with sbatch in parallel, but I'm not sure how to get the output to stdout
2. send jobs with srun, but then they run sequentially, one by one
Any suggestions?
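A minimal sketch of making option 2 concurrent: srun already streams each job's output back to the calling terminal, and backgrounding the invocations lets the ten jobs run in parallel:
for n in $(seq 1 10); do
    srun -N1 -n1 bash -c "echo $n" &
done
wait    # output arrives on stdout in whatever order the jobs finish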
LazyCat
(188 rep)
Mar 6, 2024, 02:01 AM
• Last activity: Mar 8, 2024, 04:04 PM
0
votes
2
answers
524
views
Unable to install Slurm on PC
I am trying to install slurm on Ubuntu PC. Therefore, I followed the instructions given over here
I did the following -
1. sudo apt update -y
2. sudo apt install slurmd slurmctld -y
3. sudo mkdir /etc/slurm-llnl
FYI... I came up with step 3 by myself
4. sudo chmod 777 /etc/slurm-llnl
5. sudo cat > /etc/slurm-llnl/slurm.conf << EOF
ClusterName=localcluster
SlurmctldHost=localhost
MpiDefault=none
ProctrackType=proctrack/linuxproc
ReturnToService=2
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmctldPort=6817
SlurmdPidFile=/var/run/slurmd.pid
SlurmdPort=6818
SlurmdSpoolDir=/var/lib/slurm-llnl/slurmd
SlurmUser=slurm
StateSaveLocation=/var/lib/slurm-llnl/slurmctld
SwitchType=switch/none
TaskPlugin=task/none
#
# TIMERS
InactiveLimit=0
KillWait=30
MinJobAge=300
SlurmctldTimeout=120
SlurmdTimeout=300
Waittime=0
# SCHEDULING
SchedulerType=sched/backfill
SelectType=select/cons_tres
SelectTypeParameters=CR_Core
#
#AccountingStoragePort=
AccountingStorageType=accounting_storage/none
JobCompType=jobcomp/none
JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/none
SlurmctldDebug=info
SlurmctldLogFile=/var/log/slurm-llnl/slurmctld.log
SlurmdDebug=info
SlurmdLogFile=/var/log/slurm-llnl/slurmd.log
#
# COMPUTE NODES
NodeName=localhost CPUs=12 RealMemory=8000 State=UNKNOWN
PartitionName=LocalQ Nodes=ALL Default=YES MaxTime=INFINITE State=UP
EOF
6. sudo systemctl start slurmctld
7. sudo systemctl start slurmd
Now, when I do this -
8. sudo scontrol update nodename=localhost state=idle
I get the error -
scontrol: error: resolve_ctls_from_dns_srv: res_nsearch error: Unknown host
scontrol: error: fetch_config: DNS SRV lookup failed
scontrol: error: _establish_config_source: failed to fetch config
scontrol: fatal: Could not establish a configuration source
**Edit 1** -
I followed the instructions given by Pau. Now, I get the following outputs -
(base) thoma@thoma-Lenovo-Legion-5-15IMH05H:/$ systemctl status slurmctld
● slurmctld.service - Slurm controller daemon
Loaded: loaded (/lib/systemd/system/slurmctld.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2024-03-05 05:57:17 CST; 2h 42min ago
Docs: man:slurmctld(8)
Main PID: 6509 (slurmctld)
Tasks: 10
Memory: 4.3M
CPU: 2.378s
CGroup: /system.slice/slurmctld.service
├─6509 /usr/sbin/slurmctld -D -s
└─6517 "slurmctld: slurmscriptd" "" ""
Mar 05 05:58:27 thoma-Lenovo-Legion-5-15IMH05H slurmctld: slurmctld: Invalid node state transition requested for node localhost from=INVAL to=IDLE
Mar 05 05:58:27 thoma-Lenovo-Legion-5-15IMH05H slurmctld: slurmctld: _slurm_rpc_update_node for localhost: Invalid node state specified
Mar 05 06:00:07 thoma-Lenovo-Legion-5-15IMH05H slurmctld: slurmctld: Invalid node state transition requested for node localhost from=INVAL to=IDLE
Mar 05 06:00:07 thoma-Lenovo-Legion-5-15IMH05H slurmctld: slurmctld: _slurm_rpc_update_node for localhost: Invalid node state specified
Mar 05 06:01:30 thoma-Lenovo-Legion-5-15IMH05H slurmctld: slurmctld: Invalid node state transition requested for node localhost from=INVAL to=RESUME
Mar 05 06:01:30 thoma-Lenovo-Legion-5-15IMH05H slurmctld: slurmctld: _slurm_rpc_update_node for localhost: Invalid node state specified
Mar 05 06:02:13 thoma-Lenovo-Legion-5-15IMH05H slurmctld: slurmctld: Invalid node state transition requested for node localhost from=INVAL to=RESUME
Mar 05 06:02:13 thoma-Lenovo-Legion-5-15IMH05H slurmctld: slurmctld: _slurm_rpc_update_node for localhost: Invalid node state specified
Mar 05 06:02:20 thoma-Lenovo-Legion-5-15IMH05H slurmctld: slurmctld: Invalid node state transition requested for node localhost from=INVAL to=IDLE
Mar 05 06:02:20 thoma-Lenovo-Legion-5-15IMH05H slurmctld: slurmctld: _slurm_rpc_update_node for localhost: Invalid node state specified
(base) thoma@thoma-Lenovo-Legion-5-15IMH05H:/$ systemctl status slurmd
● slurmd.service - Slurm node daemon
Loaded: loaded (/lib/systemd/system/slurmd.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2024-03-05 05:57:17 CST; 2h 42min ago
Docs: man:slurmd(8)
Main PID: 6514 (slurmd)
Tasks: 1
Memory: 316.0K
CPU: 22ms
CGroup: /system.slice/slurmd.service
└─6514 /usr/sbin/slurmd -D -s
Mar 05 05:57:17 thoma-Lenovo-Legion-5-15IMH05H systemd[1] : Started Slurm node daemon.
Mar 05 05:57:17 thoma-Lenovo-Legion-5-15IMH05H slurmd: slurmd: error: Node configuration differs from hardware: CPUs=12:12(hw) Boards=1:1(hw) SocketsPerBoard=12:1(hw) CoresPerSocket=1:6(hw) ThreadsPerCore>
Mar 05 05:57:17 thoma-Lenovo-Legion-5-15IMH05H slurmd: slurmd: slurmd version 21.08.5 started
Mar 05 05:57:17 thoma-Lenovo-Legion-5-15IMH05H slurmd: slurmd: slurmd started on Tue, 05 Mar 2024 05:57:17 -0600
Mar 05 05:57:17 thoma-Lenovo-Legion-5-15IMH05H slurmd: slurmd: CPUs=12 Boards=1 Sockets=12 Cores=1 Threads=1 Memory=7838 TmpDisk=1252975 Uptime=372 CPUSpecList=(null) FeaturesAvail=(null) FeaturesActive=(>
lines 1-16/16 (END)
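The "Node configuration differs from hardware" line and the INVAL node state suggest the NodeName definition doesn't match the real topology, which the slurmd log reports as 1 socket, 6 cores, 2 threads per core and about 7838 MB of RAM. A sketch of a matching definition (adjust to the actual machine), followed by a restart of both daemons:
NodeName=localhost CPUs=12 Sockets=1 CoresPerSocket=6 ThreadsPerCore=2 RealMemory=7800 State=UNKNOWN
sudo systemctl restart slurmctld slurmd
sudo scontrol update nodename=localhost state=resume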
desert_ranger
(103 rep)
Mar 5, 2024, 01:51 AM
• Last activity: Mar 5, 2024, 05:14 PM
1
votes
0
answers
54
views
Shell script looking for a missing module
I want to run a shell script on a compute cluster, but I get an error because at some point it looks for a module that no longer exists since a major update on the cluster a few months ago. This module is not loaded in my script, so my script is not the direct cause of the problem. One hypothesis is that the Lmod cache is out of date, but I have no idea where this cache is. Another is that a file is sourced in which the module in question, "intel/2018a", is loaded.
Here is the complete error message:
> # User specific environment and startup programs
> PATH=$PATH:$HOME/.local/bin:$HOME/bin
> + PATH=/node/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/cluster/bin:/cluster/home/sbarthelemy/.local/bin:/cluster/home/sbarthelemy/bin:/cluster/home/sbarthelemy/.local/bin:/cluster/home/sbarthelemy/bin
> export PATH
> + export PATH
> # NIRD settings
> if [ `uname -n | head -3c` == 'tos' ]
> then
> # intel compiler
> source /opt/intel/compilers_and_libraries/linux/bin/compilervars.sh -arch intel64 -platform linux
> # NCL
> export NCARG_ROOT=/opt/ncl64
> export PATH=/opt/ncl64/bin/:${PATH}
> fi
> ++ uname -n
> ++ head -3c
> + '[' log == tos ']'
> module --force purge
> + module --force purge
> + '[' -z '' ']'
> + case "$-" in
> + __lmod_sh_dbg=vx
> + '[' -n vx ']'
> + set +vx
> Shell debugging temporarily silenced: export LMOD_SH_DBG_ON=1 for Lmod's output
> Shell debugging restarted
> + unset __lmod_sh_dbg
> + return 0
> module load StdEnv
> + module load StdEnv
> + '[' -z '' ']'
> + case "$-" in
> + __lmod_sh_dbg=vx
> + '[' -n vx ']'
> + set +vx
> Shell debugging temporarily silenced: export LMOD_SH_DBG_ON=1 for Lmod's output
> Shell debugging restarted
> + unset __lmod_sh_dbg
> + return 0
> module load intel/2018a
> + module load intel/2018a
> + '[' -z '' ']'
> + case "$-" in
> + __lmod_sh_dbg=vx
> + '[' -n vx ']'
> + set +vx
> Shell debugging temporarily silenced: export LMOD_SH_DBG_ON=1 for Lmod's output
Lmod has detected the following error: The following module(s) are unknown: "intel/2018a"
> Please check the spelling or version number. Also try "module spider ..."
> It is also possible your cache file is out-of-date; it may help to try:
> $ module --ignore_cache load "intel/2018a"
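A sketch of how to track down where the load happens and to rule out a stale cache (the per-user Lmod cache location varies by version, commonly ~/.lmod.d/.cache or ~/.cache/lmod):
# find which startup file still loads the retired module
grep -rn "intel/2018a" ~/.bashrc ~/.bash_profile ~/.profile /etc/profile.d/ 2>/dev/null
# check what the module system itself knows, bypassing the cache
module --ignore_cache avail intel
module --ignore_cache spider intel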
Seb
(11 rep)
Feb 5, 2024, 10:53 AM
• Last activity: Feb 5, 2024, 11:53 AM
0
votes
1
answers
52
views
Running arbitrary binary program with cluster computers
I have 3 VPS. Let's say master, slave1, slave2.
Their specifications are identical.
* Processor: 1CPU
* Memory: 1GB
* Disk: 10GB
* Network: running on LAN each other
I expect any arbitrary binary program (process) that runs on the master VPS to see the cluster as a single machine, meaning the workload of the master VPS would be shared with its slaves over the network.
So what the program would see is that it's running on a computer with this specification:
* Processor: 3CPU
* Memory: 3GB
* Disk: 30GB
The question is, is there any protocol or service that can combine computing power like that, or combine memory (RAM) or storage (SSD)?
I was thinking of mounting a ramfs at a mountpoint on every slave, then running an NFS server on every slave and making it available so that the NFS client on the master can mount it locally. On the master, I would combine the mountpoints /mnt/shm_from_slave1 and /mnt/shm_from_slave2 into a single mountpoint /mnt/shm_combined.
Then, in the combined shared memory on the master, I would create a swapfile /mnt/shm_combined/swapfile and configure the master OS to use that swapfile aggressively. Note that this is my plan for combining the memory. To combine processor power, I have no idea.
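For the memory-pooling half, a rough sketch of the tmpfs-over-NFS idea above (the subnet, sizes and hostnames are illustrative). Be aware that mainline Linux generally refuses to swap to a file that lives on NFS, so the swapfile step is the part most likely to fail; and as far as I know there is no current general-purpose service that lets an unmodified binary treat the three VPSes as one big machine - that needs a single-system-image kernel or an application-level framework such as MPI.
# on each slave: export a RAM-backed directory
mount -t tmpfs -o size=512m tmpfs /export/shm
echo '/export/shm 192.168.0.0/24(rw,sync,no_subtree_check)' >> /etc/exports
exportfs -ra
# on the master: mount each slave's export
mount -t nfs slave1:/export/shm /mnt/shm_from_slave1
mount -t nfs slave2:/export/shm /mnt/shm_from_slave2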
Muhammad Ikhwan Perwira
(319 rep)
Nov 2, 2023, 07:12 PM
• Last activity: Nov 2, 2023, 09:33 PM
6
votes
1
answers
11261
views
About mem and vmem
I am working with a cluster machine running under linux.
I have a shell script that uses mpirun to submit my jobs to the cluster machine. In that same script, I can choose the number of nodes that will be assigned to the job. So far, so good.
My issue arises afterwards: when I submit a few jobs, all works well; however, when I fill the capacity of the nodes, some of the submitted jobs won't complete. I therefore suspect that the available memory on the cluster is not sufficient to handle all of my jobs at the same time.
This is why I want to check the memory usage of each job over time. I use the qstat -f command, but it displays a lot of things, most of which I cannot understand.
**So here is my question:** In the sample output of the qstat -f command below, we can see two types of memory: mem and vmem. I would like to know what the difference between these two is, and what the real amount of memory used is.
resources_used.cput = 00:21:04
resources_used.mem = 2099860kb
resources_used.vmem = 40505676kb
resources_used.walltime = 00:21:08
Additionally, I would appreciate any reference where the output of this command is detailed. I tried man qstat but it doesn't go into the details of each returned line.
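For a rough sense of scale from the sample above, under the usual Torque/PBS meaning (mem being resident physical memory and vmem the reserved virtual address space):
resources_used.mem  = 2099860 kB  ≈ 2099860 / 1024 / 1024  ≈  2.0 GiB actually resident in RAM
resources_used.vmem = 40505676 kB ≈ 40505676 / 1024 / 1024 ≈ 38.6 GiB of address space reserved (mappings, not necessarily backed by RAM)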
Mary
(61 rep)
Nov 14, 2014, 03:18 AM
• Last activity: Nov 2, 2023, 04:16 PM
0
votes
2
answers
2897
views
RHEL High-Availability Cluster using pcs, configuring service as a resource
I have a 2-node cluster on RHEL 6.9. Everything is configured, except that I'm having difficulty with an application launched via a shell script that I turned into a service (in /etc/init.d/myApplication), which I'll just call "myApp". For that application, I did pcs resource create myApp lsb:myApp op monitor interval=30s op start on-fail=standby. I am new to using this suite of software, but it's for work. What I need is for this application to be launched on both nodes simultaneously, since it has to be started manually: if the first node fails, it would need intervention if it were not already active on the passive node.
I have two other services:
- VirtIP (ocf:heartbeat:IPaddr2), providing a service IP for the application server
- Cron (lsb:crond), to synchronize the application files (we are not using shared storage)
I have VirtIP and Cron as dependents of myApp via colocation.
I've tried master/slave as well as cloning, but I must be missing something in their configuration. If I take the application offline, Pacemaker does not detect that the service has gone down, and pcs status reports that myApp is still running on the node (or nodes, depending on my config). I'm also sometimes seeing the service that runs the app being stopped by Pacemaker on the passive node.
How do I need to configure this? I've gone through the RHEL documentation but I'm still stuck. How do I get Pacemaker to initiate failover if the myApp service goes down? I don't know why it's not detecting that the service has stopped in some cases.
EDIT: So for testing purposes, I removed the password requirement for starting/restarting and the service starts/restarts fine as expected and the colocation dependent resources stop/start as expected. But stopping the myApp service does not reflect as a stopped resource but simply stays at Started node1. Likewise, simulating a failover via putting node1 into standby simply stops all resources on node1.
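Two things may be worth checking, sketched under the assumption that myApp is a plain LSB-style init script: the lsb: monitor relies entirely on the script's status exit codes (0 while running, 3 once stopped), and running the application on both nodes at the same time is what a clone is for:
# the lsb: monitor is only as good as the script's "status" action
/etc/init.d/myApplication status; echo $?   # should print 0 while the app is running
/etc/init.d/myApplication status; echo $?   # and 3 after the app has been stopped or killed
# run the resource on both nodes simultaneously
pcs resource clone myApp
If status keeps returning 0 after the application dies, Pacemaker has no way of noticing the failure, which would explain the resource staying at "Started node1".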
Greg
(187 rep)
Sep 29, 2017, 07:52 AM
• Last activity: Sep 6, 2023, 09:56 PM
1
votes
1
answers
160
views
Can I fully utilize HDR Infiniband network throughput between servers and NFS volume?
I'm working on a project building a CPU cluster, and those servers and NFS storage (not a parallel file system) are going to be connected through HDR InfiniBand cables. In this architecture, can I get proper storage I/O performance through the InfiniBand network, and does NFS support InfiniBand communication? Or should I build a 200G Ethernet network (not an IB network) fabric to write and read from storage? If possible, are there any things I should configure?
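NFS can run directly over InfiniBand via RDMA (NFSoRDMA), so a separate Ethernet fabric isn't strictly required just for storage traffic. A minimal sketch, assuming a Linux NFS server and client with the RDMA NFS modules available (module naming varies by kernel: svcrdma/xprtrdma on older kernels, rpcrdma on newer ones); nfsserver:/export and /mnt/scratch are placeholders:
# server: register the RDMA transport on the standard NFSoRDMA port
modprobe svcrdma
echo "rdma 20049" > /proc/fs/nfsd/portlist
# client: mount over RDMA instead of TCP
mount -t nfs -o proto=rdma,port=20049 nfsserver:/export /mnt/scratch
If RDMA proves troublesome, plain NFS over IPoIB on the same HDR fabric also works, at the cost of higher CPU overhead.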
Antenna_
(35 rep)
Aug 17, 2023, 10:20 AM
• Last activity: Aug 17, 2023, 03:12 PM
1
votes
1
answers
51
views
Unable to run linpack on head node of cluster
I recently set up my own home cluster - 4 units of Raspberry Pi. But I am having problems trying to benchmark all 4 units using Linpack.
One node is the head node, called rpislave1; it connects to the Internet and my local Wi-Fi network using the wlan0 interface, while it uses eth0 to connect to the internal LAN that forms the cluster.
The other 3 nodes are rpislave2, rpislave3 and rpislave4. Each is connected to the head node rpislave1 and gets its Internet access through it. To keep things simple, these 3 nodes network-boot off a flash drive connected to rpislave1.
All units have been allocated their own IP address via their MAC address through DHCP.
Here is the /etc/hosts file for the head node
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
127.0.1.1 cluster
192.168.50.1 rpislave1 cluster
192.168.50.11 rpislave2
192.168.50.12 rpislave3
192.168.50.13 rpislave4
All units can be accessed via ssh without passwords from rpislave1 and share an NFS drive at /sharezone, which is backed by a thumb drive mounted on rpislave1.
I am pretty happy with the learning experience and decided to benchmark the total processing power of the cluster - rpislave1, rpislave2, rpislave3 and rpislave4 - using HPL, or Linpack: https://www.netlib.org/benchmark/hpl/
I started off installing OpenMPI on the head node, rpislave1, and it worked on its own, clocking in at 15 GFlops - nothing to boast about, of course, but it was fun.
I then proceeded to set up Linpack and OpenMPI on rpislave2 and did a standalone test, and so on with the remaining units, rpislave3 and rpislave4.
So I decided to try and run it across 2 nodes - rpislave1 and rpislave2.
Here is the HPL.dat I am using for 2 nodes but I don't think the issue is the HPL.dat I am using.
HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out output file name (if any)
6 device out (6=stdout,7=stderr,file)
1 # of problems sizes (N)
40704 Ns
1 # of NBs
192 NBs
0 PMAP process mapping (0=Row-,1=Column-major)
1 # of process grids (P x Q)
2 Ps
4 Qs
16.0 threshold
1 # of panel fact
2 PFACTs (0=left, 1=Crout, 2=Right)
1 # of recursive stopping criterium
4 NBMINs (>= 1)
1 # of panels in recursion
2 NDIVs
1 # of recursive panel fact.
1 RFACTs (0=left, 1=Crout, 2=Right)
1 # of broadcast
1 BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1 # of lookahead depth
1 DEPTHs (>=0)
2 SWAP (0=bin-exch,1=long,2=mix)
64 swapping threshold
0 L1 in (0=transposed,1=no-transposed) form
0 U in (0=transposed,1=no-transposed) form
1 Equilibration (0=no,1=yes)
8 memory alignment in double (> 0)
##### This line (no. 32) is ignored (it serves as a separator). ######
0 Number of additional problem sizes for PTRANS
1200 10000 30000 values of N
0 number of additional blocking sizes for PTRANS
40 9 8 13 13 20 16 32 64 values of NB
I even made a host file to use with it:
user@rpislave1:/sharezone/hpl $ cat host_file
rpislave1 slots=4
rpislave2 slots=4
Here is the command I used:
time mpirun -hostfile host_file -np 8 /sharezone/xhpl/bin/xhpl
But the output I got was this
user@rpislave1:/sharezone/hpl $ time mpirun -hostfile host_file -np 8 /sharezone/xhpl/bin/xhpl
================================================================================
HPLinpack 2.3 -- High-Performance Linpack benchmark -- December 2, 2018
Written by A. Petitet and R. Clint Whaley, Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
================================================================================
An explanation of the input/output parameters follows:
T/V : Wall time / encoded variant.
N : The order of the coefficient matrix A.
NB : The partitioning blocking factor.
P : The number of process rows.
Q : The number of process columns.
Time : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.
The following parameter values will be used:
N : 40704
NB : 192
PMAP : Row-major process mapping
P : 2
Q : 4
PFACT : Right
NBMIN : 4
NDIV : 2
RFACT : Crout
BCAST : 1ringM
DEPTH : 1
SWAP : Mix (threshold = 64)
L1 : transposed form
U : transposed form
EQUIL : yes
ALIGN : 8 double precision words
--------------------------------------------------------------------------------
- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be 1.110223e-16
- Computational tests pass if scaled residuals are less than 16.0
--------------------------------------------------------------------------
Open MPI detected an inbound MPI TCP connection request from a peer
that appears to be part of this MPI job (i.e., it identified itself as
part of this Open MPI job), but it is from an IP address that is
unexpected. This is highly unusual.
The inbound connection has been dropped, and the peer should simply
try again with a different IP interface (i.e., the job should
hopefully be able to continue).
Local host: rpislave2
Local PID: 1574
Peer hostname: rpislave1 ([[58941,1],2])
Source IP of socket: 192.168.50.1
Known IPs of peer:
169.254.131.47
--------------------------------------------------------------------------
I have no idea what is causing this issue, but I have noticed that if I run the Linpack test on rpislave2, rpislave3 or rpislave4, or any combination of two of them, it works without issue.
It is as if I cannot run it on the head node, rpislave1.
I have been looking around for days trying all sorts of steps. I suspect Open MPI is picking up the wlan0 interface the head node uses to connect to the local Wi-Fi network, so I tried "--mca btl_tcp_if_exclude wlan0" and various other MCA options, but nothing worked. I even went through the GitHub issues, but all of them seem to have been fixed already and I should have the latest patches. Here are the OpenMPI versions I have:
user@rpislave1:/sharezone/hpl $ sudo apt install openmpi-bin openmpi-common libopenmpi3 libopenmpi-dev
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
libopenmpi-dev is already the newest version (4.1.0-10).
libopenmpi3 is already the newest version (4.1.0-10).
openmpi-bin is already the newest version (4.1.0-10).
openmpi-common is already the newest version (4.1.0-10).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
user@rpislave1:/sharezone/hpl $
Does anyone have any idea what is causing the "Open MPI detected an inbound MPI TCP connection request from a peer that appears to be part of this MPI job" error? I suspect it may be related to the wlan0 interface, since it shows this:
Known IPs of peer:
169.254.131.47
a traceroute shows this result
user@rpislave1:/sharezone/hpl $ traceroute 169.254.131.47
traceroute to 169.254.131.47 (169.254.131.47), 30 hops max, 60 byte packets
1 rpislave1.local (169.254.131.47) 0.192 ms 0.107 ms 0.096 ms
user@rpislave1:/sharezone/hpl $
Here is the ifconfig for rpislave1/head node
user@rpislave1:/sharezone/hpl $ ifconfig
eth0: flags=4163 mtu 1500
inet 192.168.50.1 netmask 255.255.255.0 broadcast 192.168.50.255
inet6 fe80::d314:681c:2e82:d5bc prefixlen 64 scopeid 0x20
ether d8:3a:dd:1d:92:15 txqueuelen 1000 (Ethernet)
RX packets 962575 bytes 911745808 (869.5 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 590397 bytes 382892062 (365.1 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73 mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10
loop txqueuelen 1000 (Local Loopback)
RX packets 3831 bytes 488990 (477.5 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 3831 bytes 488990 (477.5 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
wlan0: flags=4163 mtu 1500
inet 192.168.101.15 netmask 255.255.255.0 broadcast 192.168.101.255
inet6 2001:f40:950:b164:806e:1571:b836:23a4 prefixlen 64 scopeid 0x0
inet6 fe80::1636:9990:bd05:dd05 prefixlen 64 scopeid 0x20
ether d8:3a:dd:1d:92:16 txqueuelen 1000 (Ethernet)
RX packets 44632 bytes 12764596 (12.1 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 74151 bytes 13143926 (12.5 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
user@rpislave1:/sharezone/hpl $
I would really appreciate any help on solving this issue.
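One hedged thing to try, given that the peer's only known address is the link-local 169.254.131.47: pin both the MPI byte-transfer layer and the out-of-band wire-up to the cluster LAN, so that neither wlan0 nor a link-local address is ever considered (both MCA parameters exist in Open MPI 4.1):
time mpirun --mca btl_tcp_if_include eth0 --mca oob_tcp_if_include eth0 \
     -hostfile host_file -np 8 /sharezone/xhpl/bin/xhpl
It may also be worth checking what resolves rpislave1.local to 169.254.131.47 on the head node (the traceroute suggests an Avahi/mDNS or link-local fallback), since that is exactly the "known IP of peer" that the TCP BTL on rpislave2 is rejecting.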
AlexChan
(21 rep)
Jul 26, 2023, 10:37 AM
• Last activity: Jul 28, 2023, 09:41 AM
Showing page 1 of 20 total questions