Installing SLURM in your machine¶
To install SLURM you need admin access to the machine. Please follow this instructions to start up running the workload manager, in the controller as well in the controlled machines.
sudo apt-get -y install slurm-wlm
sudo nano /etc/slurm-llnl/slurm.conf
sudo chown -R slurm:slurm /var/run/slurm-llnl/
sudo chown -R slurm:slurm /var/lib/slurm-llnl/
sudo chown -R slurm:slurm /var/log/slurm-llnl/
sudo mkdir /var/spool/slurmd
sudo chown -R slurm:slurm /var/spool/slurmd
sudo systemctl start slurmd
Replace $HOST_NAME
with your machine name that is going to act as the
controller. If you have multiple machines, this configuration file must be
identical and in all machines in the queue.
### slurm.conf - Slurm config file.
#ClusterName=$HOST_NAME
ControlMachine=$HOST_NAME
SlurmUser=slurm
AuthType=auth/munge
SlurmctldPidFile=/var/run/slurm-llnl/slurmctld.pid
SlurmdPidFile=/var/run/slurm-llnl/slurmd.pid
SlurmdSpoolDir=/var/lib/slurm-llnl/slurmd
StateSaveLocation=/var/lib/slurm-llnl/slurmctld
SwitchType=switch/none
ProctrackType=proctrack/pgid
TaskPlugin=task/none
MpiDefault=none
MaxJobCount=100000
MaxArraySize=64000
# TIMERS
SlurmdTimeout=300
InactiveLimit=0
MinJobAge=300
KillWait=30
Waittime=0
# SCHEDULING
SchedulerType=sched/backfill
SelectType=select/cons_res
SelectTypeParameters=CR_Core
FastSchedule=1
# LOGGING
SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurm-llnl/slurmctld.log
SlurmdDebug=3
SlurmdLogFile=/var/log/slurm-llnl/slurmd.log
# COMPUTE NODES
# Here you add the machine hardware configurations
NodeName=$HOST_NAME Procs=8 Boards=1 SocketsPerBoard=1 CoresPerSocket=4 ThreadsPerCore=2 State=idle
# Here you add the machine(s) to a Partition
PartitionName=MyCluster Nodes=$HOST_NAME Default=yes MaxTime=INFINITE State=up
Note
Please refer to SLURM for advance configuration like limiting time, CPUs and RAM for users or groups, to balance load in your cluster.