
Very new to Slurm. How do I get Slurm to run multi-core jobs on my Linux cluster?

2 votes
1 answer
364 views
I've been trying to move some existing processes to a revamped Linux cluster that now runs on Slurm. I thought I had it working, but my problem now is getting a job to run on multiple cores. Here is my submission script:
#!/bin/bash
#
#SBATCH --job-name=test_mpi
#SBATCH --output=res_mpi.txt
#
#SBATCH -n 4
#SBATCH --time=10:00
srun mkdir -p /tmp/tedhyu/new
srun cp Ru13.in /tmp/tedhyu/new/lcao.in
srun cp ~tedhyu/atom_pbe/* /tmp/tedhyu/new
srun cd /tmp/tedhyu/new
srun -N 1 -n 4 --chdir=/tmp/tedhyu/new mpiexec ~tedhyu/bin/origin1_centos6.4_mpich2_quest_265c.x
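A side note on the `srun cd` line in the script above: as far as I understand, each `srun` starts its own job step in separate processes, so a directory change made there would not carry over to the batch script's shell (the final `srun` already passes `--chdir`, so that line is probably redundant anyway). The sketch below illustrates the general behavior without Slurm, using a child `bash -c` as a stand-in for a job step; the temporary directory and variable names are hypothetical.

```shell
#!/bin/bash
# Demonstration (no Slurm needed): a `cd` done in a child process does not
# change the working directory of the parent shell.
workdir=$(mktemp -d)      # hypothetical scratch directory for this demo
start_dir=$PWD
child_cd_result=""
own_cd_result=""

# Stand-in for a separate job step: a child shell changes directory, then exits.
bash -c 'cd "$1"' _ "$workdir"

if [ "$PWD" = "$start_dir" ]; then
    child_cd_result="parent directory unchanged"
fi
echo "after child cd: $child_cd_result"

# A plain `cd` in the script's own shell does persist for later commands.
cd "$workdir" || exit 1
if [ "$PWD" = "$workdir" ]; then
    own_cd_result="own shell moved"
fi
echo "after own cd: $own_cd_result"

# Clean up the demo directory.
cd "$start_dir" && rmdir "$workdir"
```

If the later commands need to run in the scratch directory, a plain `cd` in the batch script (or `--chdir` on the step that needs it) is the usual way to get there.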
When I run "qstat -n", it only shows one core:
Job id               Username Queue    Name                 SessID NDS   TSK   Memory Time Use S Time
-------------------- -------- -------- -------------------- ------ ----- ----- ------ ----- - -----
11778                tedhyu   atom     test_mpi             --     1     4     --     00:10 C 00:00
   node3-5/4
Here are the first few lines of my output, which show that only 1 core is running:
srun: error: node3-5: tasks 0-3: Exited with exit code 1
     MPINFO::: Global Communicator        :::
     MPINFO::: Global Context = ****      :::
     MPINFO::: Global Size =       1      :::
     MPINFO::: Global Root =       0      :::
     MPINFO::: Global Rank =       0      :::
     DEV: VDW development version
Global Size should equal 4. If anyone can point me in the right direction... Thanks!!!
Asked by ted y (21 rep)
Jan 24, 2021, 05:10 AM
Last activity: Feb 14, 2021, 10:06 AM
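One possible direction, offered as a sketch rather than a confirmed fix: with `-n 4`, every `srun` line in the script runs its command four times, so `srun ... mpiexec ...` would start four separate `mpiexec` launchers, each of which may form its own MPI job of size 1. That would be consistent with the reported `Global Size = 1`. The version below does the file staging once in the batch shell and starts the MPI program with a single launcher. It assumes that this MPICH2 build's `mpiexec` can start ranks inside a Slurm allocation, which has not been verified for this cluster.

```shell
#!/bin/bash
#
#SBATCH --job-name=test_mpi
#SBATCH --output=res_mpi.txt
#
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --time=10:00

# Staging runs once in the batch shell (no srun), so each file is copied a single time.
workdir=/tmp/tedhyu/new
mkdir -p "$workdir"
cp Ru13.in "$workdir/lcao.in"
cp ~tedhyu/atom_pbe/* "$workdir"
cd "$workdir" || exit 1

# One launcher for all ranks. SLURM_NTASKS is set by Slurm from --ntasks.
# Assumption: this mpiexec works inside the allocation on this cluster.
mpiexec -n "$SLURM_NTASKS" ~tedhyu/bin/origin1_centos6.4_mpich2_quest_265c.x

# Alternative, assuming the MPI library was built with Slurm PMI support:
# srun -n "$SLURM_NTASKS" ~tedhyu/bin/origin1_centos6.4_mpich2_quest_265c.x
```

Either launcher should be used on its own, not wrapped in the other. If the program still reports a global size of 1, the MPI library and the launcher are probably not talking to each other, and the cluster's MPI documentation would be the next place to look.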