In the following you will find a series of examples showing the simplest uses of the cluster. At the end, a full sequence of commands for a cluster job is given.

Each example is compressed in a ZIP archive. You can extract it in your labsrv7/labsrv8 home directory and it will create a directory Examples/XX-Name, where XX is the example number and Name is the example name (for instance Examples/01-Sieve).

Some of the examples need to be compiled (C, Java, etc.); for those a Makefile is provided. In a terminal, change to the example directory and run the make command. The job definition file, which has the suffix '.slurm', is set up to be run from the compilation directory with the sbatch command.
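For reference, a job definition file is an ordinary shell script whose #SBATCH comment lines carry the resource requests. A minimal sketch is shown below; the real sieve.slurm shipped with the example may differ, and the partition and resource values here are only placeholders borrowed from the full example further down:

#!/bin/sh
#SBATCH --job-name=Sieve          # name shown in the queue
#SBATCH --output=sieve.out        # standard output file
#SBATCH --partition=allgroups     # partition used in the full example below
#SBATCH --ntasks=1                # a single task
#SBATCH --mem=1G                  # requested memory
#SBATCH --time=00:10:00           # wall-time limit

./sieve                           # run the compiled program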

Example

Connect to labsrv7 and unzip, compile and run an example with the following commands (here for the 01-Sieve example):

 [labsrv7]:~$ unzip Examples/Archives/01-Sieve.zip 
 Archive:  Examples/Archives/01-Sieve.zip
    creating: Examples/01-Sieve/
   inflating: Examples/01-Sieve/Makefile  
   inflating: Examples/01-Sieve/sieve.c  
   inflating: Examples/01-Sieve/sieve.slurm
 [labsrv7]:~$ cd Examples/01-Sieve
 [labsrv7]:~/Examples/01-Sieve$ make
 cc -Wall -g -c sieve.c
 gcc -o sieve sieve.o 
 [labsrv7]:~/Examples/01-Sieve$ sbatch sieve.slurm
 Submitted batch job 132
 [labsrv7]:~/Examples/01-Sieve$  
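Once sbatch has returned the job id (132 here), you can verify that the job is queued or running and, when it ends, look at its output. Assuming the standard SLURM client tools are available on the login node and that sieve.slurm does not override the default output file name:

 [labsrv7]:~/Examples/01-Sieve$ squeue -j 132         # is the job still pending or running?
 [labsrv7]:~/Examples/01-Sieve$ cat slurm-132.out     # default output file once the job has finished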

A full sequence example
Here you will find a complete example with:

  1. General setup
  2. Build of a CUDA-enabled container
  3. Setup of a job that uses the container
  4. Run of the job

1 – General setup
Connect to labsrv7 or labsrv8 and create a new directory (Job0 in our example) for the task:

user@myhost:~$ ssh username@labsrv7.math.unipd.it
....
username@labsrv7:~$ mkdir Job0
username@labsrv7:~$ cd Job0
username@labsrv7:~/Job0$

2 – Build of a CUDA-enabled container
The first step is to provide a Singularity container definition: a text file with a defined syntax. You can find the complete documentation on the Singularity site.
For this example we will use one of the basic containers provided in the common area of the cluster. With the following commands we will copy the definition file and launch the remote build of the container.

username@labsrv7:~/Job0$ cp /conf/shared-software/Singularity/CUDA/containers/cuda-11.2-ubuntu2004-tensorflow-pytorch/container.def .
username@labsrv7:~/Job0$ singularity-remote-build container.sif container.def container-build.log
Running with parameters:
 -  Container file: /home/username/Job0/container.sif
 - Definition file: /home/username/Job0/container.def
 -        Log file: /home/username/Job0/container-build.log
build started...
username@labsrv7:~/Job0$

Now you have to wait until the log file container-build.log becomes readable and the container file container.sif appears (assuming no errors occur).
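For example, you can follow the log and check for the image from the same directory (plain shell, nothing cluster specific):

username@labsrv7:~/Job0$ tail -f container-build.log   # follow the build log once it is readable (Ctrl-C to stop)
username@labsrv7:~/Job0$ ls -lh container.sif          # the image appears here when the build has finished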

3 – Setup of a job that uses the container
Now we need to provide (at least) two files: the job description file and the actual program to run. The job description file (myjob.slurm in this example) looks like this:


#!/bin/sh

#SBATCH --job-name=GPU-Shared        # job name shown in the queue
#SBATCH --error=myjob.err            # file receiving the standard error
#SBATCH --output=myjob.out           # file receiving the standard output
#SBATCH --partition=allgroups        # partition to submit to
#SBATCH --ntasks=1                   # a single task
#SBATCH --mem=12G                    # requested memory
#SBATCH --time=01:00:00              # wall-time limit (1 hour)
#SBATCH --constraint=dellcuda1       # run on the dellcuda1 host
#SBATCH --gres=mps:1                 # one MPS share of the GPU

SHAREDIR="/conf/shared-software/Singularity/CUDA/"

# Bind the host NVIDIA driver directory into the container at /nvdriver
# and run the test program inside the container image.
singularity exec -B \
	    ${SHAREDIR}/driver/`hostname`:/nvdriver \
	    ./container.sif \
	    ./container.test.py
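The second file is the program the job runs, ./container.test.py, whose content is not shown here. As a quick stand-in to verify that the GPU is visible from inside the container (assuming python3 and PyTorch are available in the image, as the container name suggests), the last line of the script could be replaced with a one-line check:

singularity exec -B \
	    ${SHAREDIR}/driver/`hostname`:/nvdriver \
	    ./container.sif \
	    python3 -c 'import torch; print("CUDA available:", torch.cuda.is_available())'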

4 – Run of the job
The command to run the job (on host dellcuda1, as can be seen in the job description file) will be:

username@labsrv7:~/Job0$ sbatch myjob.slurm
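While the job runs you can check its state with the standard SLURM commands, and when it completes the output and error files named in the job description appear in the submission directory:

username@labsrv7:~/Job0$ squeue -u $USER     # list your pending/running jobs
username@labsrv7:~/Job0$ cat myjob.out       # standard output (--output)
username@labsrv7:~/Job0$ cat myjob.err       # standard error (--error)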