Very new to Slurm. How do I get Slurm to run multi-core jobs on my Linux cluster?
2 votes, 1 answer, 364 views
I've been trying to move some existing processes to a revamped Linux cluster that now runs Slurm. I thought I had it working, but my problem now is getting a job to run on multiple cores.
Here is my submission script.
#!/bin/bash
#
#SBATCH --job-name=test_mpi
#SBATCH --output=res_mpi.txt
#
#SBATCH -n 4
#SBATCH --time=10:00
srun mkdir -p /tmp/tedhyu/new
srun cp Ru13.in /tmp/tedhyu/new/lcao.in
srun cp ~tedhyu/atom_pbe/* /tmp/tedhyu/new
srun cd /tmp/tedhyu/new
srun -N 1 -n 4 --chdir=/tmp/tedhyu/new mpiexec ~tedhyu/bin/origin1_centos6.4_mpich2_quest_265c.x
When I run "qstat -n", it shows only one core:
Job id               Username Queue    Name                 SessID NDS   TSK   Memory Time Use S Time
-------------------- -------- -------- -------------------- ------ ----- ----- ------ ----- - -----
11778                tedhyu   atom     test_mpi             --     1     4     --     00:10 C 00:00
node3-5/4
Here are the first few lines of my output, which show that only 1 core is running:
srun: error: node3-5: tasks 0-3: Exited with exit code 1
MPINFO::: Global Communicator :::
MPINFO::: Global Context = **** :::
MPINFO::: Global Size = 1 :::
MPINFO::: Global Root = 0 :::
MPINFO::: Global Rank = 0 :::
DEV: VDW development version
Global Size should equal 4, not 1.
If anyone can point me in the right direction... Thanks!!!
Asked by ted y
(21 rep)
Jan 24, 2021, 05:10 AM
Last activity: Feb 14, 2021, 10:06 AM
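One possible direction, offered as a sketch rather than a confirmed fix: "Global Size = 1" alongside "tasks 0-3" is consistent with srun starting four separate copies of mpiexec, each of which forms its own one-rank MPI job. The version below calls a single launcher and does the file setup once in the batch shell instead of once per task through srun. The file name test_mpi.sbatch is hypothetical, and the script assumes this MPICH2 build's mpiexec can start ranks inside a Slurm allocation; if it cannot, launching the executable directly with srun (possibly with a PMI plugin selected through srun's --mpi option) may be the route your site supports.

```shell
#!/bin/bash
# Write a revised job script to a hypothetical file name, then syntax-check it.
cat > test_mpi.sbatch <<'EOF'
#!/bin/bash
#SBATCH --job-name=test_mpi
#SBATCH --output=res_mpi.txt
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --time=10:00

# Setup runs once in the batch shell. A `cd` run through srun would only
# change directory inside that job step, not for the lines that follow.
workdir=/tmp/tedhyu/new
mkdir -p "$workdir"
cp Ru13.in "$workdir/lcao.in"
cp ~tedhyu/atom_pbe/* "$workdir"
cd "$workdir" || exit 1

# Use one launcher, not srun wrapped around mpiexec.
# Assumption: this MPICH2 mpiexec can start ranks inside a Slurm allocation.
mpiexec -n "$SLURM_NTASKS" ~tedhyu/bin/origin1_centos6.4_mpich2_quest_265c.x
EOF

# Parse without executing; this catches shell syntax errors only.
bash -n test_mpi.sbatch && echo "syntax ok"
```

After writing the file, submit it from the directory that contains Ru13.in with "sbatch test_mpi.sbatch" and check whether the output now reports Global Size = 4.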