I am trying to install slurm on Ubuntu PC. Therefore, I followed the instructions given over here
I did the following -
1.
sudo apt update -y
2. sudo apt install slurmd slurmctld -y
3. mkdir sudo /etc/slurm-llnl
FYI... I came up with step 3. by myself
4. sudo chmod 777 /etc/slurm-llnl
sudo cat /etc/slurm-llnl/slurm.conf
ClusterName=localcluster
SlurmctldHost=localhost
MpiDefault=none
ProctrackType=proctrack/linuxproc
ReturnToService=2
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmctldPort=6817
SlurmdPidFile=/var/run/slurmd.pid
SlurmdPort=6818
SlurmdSpoolDir=/var/lib/slurm-llnl/slurmd
SlurmUser=slurm
StateSaveLocation=/var/lib/slurm-llnl/slurmctld
SwitchType=switch/none
TaskPlugin=task/none
#
# TIMERS
InactiveLimit=0
KillWait=30
MinJobAge=300
SlurmctldTimeout=120
SlurmdTimeout=300
Waittime=0
# SCHEDULING
SchedulerType=sched/backfill
SelectType=select/cons_tres
SelectTypeParameters=CR_Core
#
#AccountingStoragePort=
AccountingStorageType=accounting_storage/none
JobCompType=jobcomp/none
JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/none
SlurmctldDebug=info
SlurmctldLogFile=/var/log/slurm-llnl/slurmctld.log
SlurmdDebug=info
SlurmdLogFile=/var/log/slurm-llnl/slurmd.log
#
# COMPUTE NODES
NodeName=localhost CPUs=12 RealMemory=8000 State=UNKNOWN
PartitionName=LocalQ Nodes=ALL Default=YES MaxTime=INFINITE State=UP
EOF
6. sudo systemctl start slurmctld
7. sudo systemctl start slurmd
Now, when I do this -
8. sudo scontrol update nodename=localhost state=idle
I get the error -
scontrol: error: resolve_ctls_from_dns_srv: res_nsearch error: Unknown host
scontrol: error: fetch_config: DNS SRV lookup failed
scontrol: error: _establish_config_source: failed to fetch config
scontrol: fatal: Could not establish a configuration source
**Edit 1** -
I followed the instructions given by Pau. Now, I get the following outputs -
(base) thoma@thoma-Lenovo-Legion-5-15IMH05H:/$ systemctl status slurmctld
● slurmctld.service - Slurm controller daemon
Loaded: loaded (/lib/systemd/system/slurmctld.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2024-03-05 05:57:17 CST; 2h 42min ago
Docs: man:slurmctld(8)
Main PID: 6509 (slurmctld)
Tasks: 10
Memory: 4.3M
CPU: 2.378s
CGroup: /system.slice/slurmctld.service
├─6509 /usr/sbin/slurmctld -D -s
└─6517 "slurmctld: slurmscriptd" "" ""
Mar 05 05:58:27 thoma-Lenovo-Legion-5-15IMH05H slurmctld: slurmctld: Invalid node state transition requested for node localhost from=INVAL to=IDLE
Mar 05 05:58:27 thoma-Lenovo-Legion-5-15IMH05H slurmctld: slurmctld: _slurm_rpc_update_node for localhost: Invalid node state specified
Mar 05 06:00:07 thoma-Lenovo-Legion-5-15IMH05H slurmctld: slurmctld: Invalid node state transition requested for node localhost from=INVAL to=IDLE
Mar 05 06:00:07 thoma-Lenovo-Legion-5-15IMH05H slurmctld: slurmctld: _slurm_rpc_update_node for localhost: Invalid node state specified
Mar 05 06:01:30 thoma-Lenovo-Legion-5-15IMH05H slurmctld: slurmctld: Invalid node state transition requested for node localhost from=INVAL to=RESUME
Mar 05 06:01:30 thoma-Lenovo-Legion-5-15IMH05H slurmctld: slurmctld: _slurm_rpc_update_node for localhost: Invalid node state specified
Mar 05 06:02:13 thoma-Lenovo-Legion-5-15IMH05H slurmctld: slurmctld: Invalid node state transition requested for node localhost from=INVAL to=RESUME
Mar 05 06:02:13 thoma-Lenovo-Legion-5-15IMH05H slurmctld: slurmctld: _slurm_rpc_update_node for localhost: Invalid node state specified
Mar 05 06:02:20 thoma-Lenovo-Legion-5-15IMH05H slurmctld: slurmctld: Invalid node state transition requested for node localhost from=INVAL to=IDLE
Mar 05 06:02:20 thoma-Lenovo-Legion-5-15IMH05H slurmctld: slurmctld: _slurm_rpc_update_node for localhost: Invalid node state specified
(base) thoma@thoma-Lenovo-Legion-5-15IMH05H:/$ systemctl status slurmd
● slurmd.service - Slurm node daemon
Loaded: loaded (/lib/systemd/system/slurmd.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2024-03-05 05:57:17 CST; 2h 42min ago
Docs: man:slurmd(8)
Main PID: 6514 (slurmd)
Tasks: 1
Memory: 316.0K
CPU: 22ms
CGroup: /system.slice/slurmd.service
└─6514 /usr/sbin/slurmd -D -s
Mar 05 05:57:17 thoma-Lenovo-Legion-5-15IMH05H systemd[1] : Started Slurm node daemon.
Mar 05 05:57:17 thoma-Lenovo-Legion-5-15IMH05H slurmd: slurmd: error: Node configuration differs from hardware: CPUs=12:12(hw) Boards=1:1(hw) SocketsPerBoard=12:1(hw) CoresPerSocket=1:6(hw) ThreadsPerCore>
Mar 05 05:57:17 thoma-Lenovo-Legion-5-15IMH05H slurmd: slurmd: slurmd version 21.08.5 started
Mar 05 05:57:17 thoma-Lenovo-Legion-5-15IMH05H slurmd: slurmd: slurmd started on Tue, 05 Mar 2024 05:57:17 -0600
Mar 05 05:57:17 thoma-Lenovo-Legion-5-15IMH05H slurmd: slurmd: CPUs=12 Boards=1 Sockets=12 Cores=1 Threads=1 Memory=7838 TmpDisk=1252975 Uptime=372 CPUSpecList=(null) FeaturesAvail=(null) FeaturesActive=(>
lines 1-16/16 (END)
Asked by desert_ranger
(103 rep)
Mar 5, 2024, 01:51 AM
Last activity: Mar 5, 2024, 05:14 PM
Last activity: Mar 5, 2024, 05:14 PM