Job Scheduling with SLURM
In addition to the shared disk space there is also shared computation power in the form of a computer cluster, i.e. many computers connected to allow parallel and distributed processing. Put simply, this speeds up calculations by spreading the compute load across multiple computers.
Using the cluster is simple, and it is controlled by a queuing system, in our case SLURM. A user simply writes a short script specifying the programs, parameters and files, together with a control script, and passes this information to the queuing system, which in turn distributes the tasks on the cluster, provided there are enough free slots. If too many people are using the cluster at the same time, some tasks will remain in the queue until the next slot becomes available. Below you will find a quick overview of how to use SLURM. The following video clips (sorry for the poor quality) may help:
- How2Slurm1: Overview of the ApplBio Computer system
- How2Slurm2: Introducing SLURM and the concept of control scripts
- How2Slurm3: Control scripts in detail
Architecture
Slurm has a centralized manager, slurmctld, to monitor resources and work. There may also be a backup manager to assume those responsibilities in the event of failure. Each compute server (node) has a slurmd daemon, which can be compared to a remote shell: it waits for work, executes that work, returns status, and waits for more work. The slurmd daemons provide fault-tolerant hierarchical communications. There is an optional slurmdbd (Slurm DataBase Daemon) which can be used to record accounting information for multiple Slurm-managed clusters in a single database.
User tools include sbatch to initiate jobs, scancel to terminate queued or running jobs, sinfo to report system status, squeue to report the status of jobs, and sacct to get information about jobs and job steps that are running or have completed. The smap and sview commands graphically report system and job status including network topology. There is an administrative tool scontrol available to monitor and/or modify configuration and state information on the cluster. The administrative tool used to manage the database is sacctmgr. It can be used to identify the clusters, valid users, valid bank accounts, etc. APIs are available for all functions.
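For orientation, here is a minimal sketch of a typical session built from these tools; the script name my_job.slurm, the job ID 12345 and your_username are only placeholders:
sbatch my_job.slurm                                  # submit a control script; SLURM replies with a job ID
squeue -u your_username                              # list your pending and running jobs
sacct -j 12345 --format=JobID,State,Elapsed,MaxRSS   # accounting information for a running or finished job
scancel 12345                                        # cancel the job if it is no longer needed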
Creating a script file
The script file tells SLURM what should be done. The example below serves as a general guideline. Please do not simply copy it, but instead try to understand what information is provided to SLURM, and then modify the information to match your requirements. This applies, in particular, to the options --cpus-per-task and --mem-per-cpu. If you set the values for these options too high, your job will either remain pending until the requested resources are free on the cluster, or your job will run and then block the requested resources even though they are not used.
Example
This example uses a simple seed file that provides input for the program calls, e.g. file names, parameter values and the like.
#!/bin/bash
#SBATCH --partition=pool           # specifies the queue you are submitting to
#SBATCH --nodelist=pool18          # specifies the computer you want to use.
                                   # CAREFUL: Use this option only when necessary,
                                   # because SLURM will have to wait until the specified computer is free
#SBATCH --account=intern           # specifies your role in the system. Most likely, you will be praktikant
#SBATCH --cpus-per-task=1          # how many cpus will you require per task (be reasonable here)
#SBATCH --mem-per-cpu=1mb          # how much RAM will you need (be reasonable here)
#SBATCH --profile=task
#SBATCH --job-name="Test"          # self-explanatory
#SBATCH --output=Test_%A_%a.o.out  # specifies the name of the output file. Feel free to replace
                                   # the 'Test' with something more informative
#SBATCH --error=Test_%A_%a.e.out   # specifies the name of the error log file. Feel free to replace
                                   # the 'Test' with something more informative
#SBATCH --array=1-8%4              # specifies how many jobs you are submitting (here 8).
                                   # The %4 specifies that never more than 4 jobs run in parallel

echo This is task $SLURM_ARRAY_TASK_ID                # executes the command "echo", which prints some information
                                                      # about your job
SEED=$(awk "FNR==$SLURM_ARRAY_TASK_ID" seedfile.txt)  # reads in the seed file and places each line
                                                      # into the variable $SEED
echo "Hello $SEED"                                    # executes the command "echo", which will now print each line of
                                                      # the seed file. Place here any command that you want to run on the
                                                      # remote computer
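Once a test job of your array has finished, it can help to check what it actually consumed before fixing the values of --cpus-per-task and --mem-per-cpu. A minimal sketch using sacct; the job ID 12345 is only a placeholder:
sacct -j 12345 --format=JobID,Elapsed,MaxRSS,AllocCPUS,State
MaxRSS reports the peak memory a task used, which you can compare against the value you requested with --mem-per-cpu.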
You can now start using SLURM by simply changing the program call to the task you want to perform. Below you will find a number of settings that are specific to the Applied Bioinformatics system.
- partition
- all (compute01-compute10)
- pool (pool00-pool22)
- inteli7 (compute11-compute14)
- wks (16 workstations)
- teaching (compute14-compute15)
- account
- intern
- praktikant
- cpus-per-task
- min: 1
- max: 64
- mem-per-cpu
- max: 1000gb (only compute17 is equipped with 1 TB of RAM!)
- array
- min: 1
- max: There is currently an upper limit of 500 jobs in an array that can be submitted.
Create a Seedfile
A seedfile is no more than a text file that lists the paths to the files you'd like to process with your job array, one path per line. This is easy to accomplish if all your files are located in a single directory called, let's say, 'my_dir':
cd my_dir; ls -d $PWD/* > seedfile.txt
You can generate an example seedfile by copying the information below and pasting it into a file called seedfile.txt.
ANOGA@7165@1.fa
AQUAE@224324@1.fa
ARATH@3702@1.fa
ASPFU@330879@1.fa
BACSU@224308@1.fa
BACTN@226186@1.fa
BATDJ@684364@1.fa
BOVIN@9913@1.fa
BRADU@224911@1.fa
BRAFL@7739@1.fa
CAEEL@6239@1.fa
CANAL@237561@1.fa
CANLF@9615@1.fa
CHICK@9031@1.fa
CHLAA@324602@1.fa
Job submission
Use the command sbatch to submit the job to SLURM
sbatch /path/to/script.slurm
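If the number of lines in your seedfile changes from run to run, you can also set the --array range at submission time instead of editing the script, since options given on the command line override the corresponding #SBATCH lines. A small sketch, assuming the seedfile sits in the directory you submit from:
sbatch --array=1-$(wc -l < seedfile.txt)%4 /path/to/script.slurm   # one task per seedfile line, at most 4 in parallel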
Job status
Check job and task status using the command squeue
squeue -u your_username
Note: if you omit the option -u your_username, you will get information about all jobs currently managed by SLURM.
See this page, for example, for an overview of the job status abbreviations.
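squeue can also be restricted to jobs in a particular state, which is handy if you only want to see what is still waiting; your_username is a placeholder:
squeue -u your_username -t PENDING   # show only your pending jobs
squeue -u your_username -t RUNNING   # show only your running jobs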
Cancel jobs
Use the command scancel to stop pending and running jobs
scancel jobid
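A few common variants; the job ID 12345 and your_username are placeholders. A single task of an array job can be addressed as jobid_taskid:
scancel 12345              # cancel job 12345 (including all of its array tasks)
scancel 12345_3            # cancel only task 3 of array job 12345
scancel -u your_username   # cancel all of your own jobs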
Information about SLURM settings on our system
Use the command sinfo to see the available queues/partitions and the assigned computers
sinfo
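sinfo can be narrowed down to a single partition or expanded to a per-node listing, for example:
sinfo --partition=pool   # state of the pool partition only
sinfo -N -l              # one line per node, including CPU and memory details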
Common mistakes
This is a non-exhaustive list of issues and mistakes in the context of SLURM
- you specify a non-existing queue
- the number of cpus that you reserve per task is higher or lower than the number of cpus the task is actually using. It is a bad idea to reserve more cpus than needed, because you block resources that are needed by other people in the group!
- the amount of memory you reserve per task is higher or lower than the amount needed by your task. It is a bad idea to reserve more memory than needed, because you block resources that are needed by other people in the group!
- a program is not running because it is not installed on the remote server. Solution: Report to the tutor or the systems administrator
- your job is in pending state forever → it might be that none of the computers in the queue has the capacity that you specified. The example below shows how to ask SLURM why a job is pending.
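To find out why a job is stuck in the pending state, you can ask SLURM for the reason; the job ID 12345 and your_username are only placeholders:
squeue -u your_username -t PENDING -o "%.10i %.9P %.8T %R"   # the last column shows the reason, e.g. Resources or Priority
scontrol show job 12345                                      # detailed view of the job, including its Reason field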