Submitting jobs and compiling programs on Magnolia cluster

From HPC Wiki

Description

Topics in this workshop:

  1. Introduction to Magnolia cluster
  2. Using Modules to select programs
  3. Submitting/running jobs using SLURM scheduler
  4. Compiling MPI programs
  5. Compiling CUDA programs


Questions? Email Brian Olson

Workshop Notes

The following is a summary of the topics covered in the workshop.

Warning
This page is a work in progress by bgo (talk | contribs). Treat its contents with caution.

Check hostname

Connecting to magnolia.usm.edu places you on one of two login nodes, magnolia01 or magnolia02, chosen at random each time a connection is requested. The hostname command shows the host name of the system you landed on:

user $hostname
magnolia01

Create a Script/Program

To create a very simple Bash script, open a text editor. To use the nano text editor, type:

user $nano myscript.sh

Type the following, and save it.

FILE myscript.sh: First script
#!/bin/sh

hostname


Mark as Executable, and Run

The script now exists, but it is still only a plain text file. Use the chmod command to mark it as executable, then run it:

user $chmod a+x myscript.sh
user $./myscript.sh
magnolia01
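
The effect of chmod can be checked from the shell itself. The following is a minimal sketch, assuming a POSIX sh and the mktemp utility. It works on a throwaway copy of the script in a temporary directory, and the names workdir, before, and after are illustrative only.

```shell
#!/bin/sh
# Work in a throwaway directory so the real myscript.sh is left alone.
workdir=$(mktemp -d)

# Recreate the one-line script from this page.
cat > "$workdir/myscript.sh" <<'EOF'
#!/bin/sh

hostname
EOF

# A newly created file normally has no execute permission.
if [ -x "$workdir/myscript.sh" ]; then before=yes; else before=no; fi

# a+x adds execute permission for the owner, the group, and everyone else.
chmod a+x "$workdir/myscript.sh"
if [ -x "$workdir/myscript.sh" ]; then after=yes; else after=no; fi

echo "executable before chmod: $before"
echo "executable after chmod:  $after"

# Remove the temporary copy.
rm -rf "$workdir"
```

Running ls -l myscript.sh before and after chmod shows the same change as added x flags in the permission column.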

Comments

Bash scripts can contain comments, which are introduced with a '#'.

FILE myscript.sh: First script with a comment
#!/bin/sh

# This is my first script, and a comment.

hostname

The script has previously been marked as executable, so it can be run directly.

user $./myscript.sh
magnolia01

The output is the same as before, since comments are ignored by Bash.

Slurm

The Slurm Workload Manager is used to submit jobs to the Magnolia cluster and to monitor them.

Partitions

The nodes of the Magnolia cluster are grouped into partitions. These partitions can be viewed with the sinfo command.

user $sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
node*        up   infinite     69  alloc node[001-026,032-036,038-041,043-076]
node*        up   infinite      7   idle node[027-031,037,042]
gpu          up   infinite      2   idle gpu[001-002]
himem        up   infinite      4   idle himem[001-004]
phi          up   infinite      4   idle phi[001-004]
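
To see at a glance how many nodes are free in each partition, the sinfo table can be summarized with awk. This is a sketch only: it assumes the six-column layout shown above, and it reads a copy of that sample output so it can be tried anywhere. On the cluster, sinfo would be piped into the same awk command instead.

```shell
#!/bin/sh
# Sample sinfo output copied from this page; on the cluster, pipe sinfo itself.
sample=$(cat <<'EOF'
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
node*        up   infinite     69  alloc node[001-026,032-036,038-041,043-076]
node*        up   infinite      7   idle node[027-031,037,042]
gpu          up   infinite      2   idle gpu[001-002]
himem        up   infinite      4   idle himem[001-004]
phi          up   infinite      4   idle phi[001-004]
EOF
)

# Skip the header, keep rows whose STATE (column 5) is idle,
# and add up NODES (column 4) for each PARTITION (column 1).
idle_summary=$(printf '%s\n' "$sample" |
    awk 'NR > 1 && $5 == "idle" { idle[$1] += $4 }
         END { for (p in idle) print p, idle[p] }' |
    sort)

echo "$idle_summary"
```

The asterisk after node marks the default partition, which is the one used when no partition is requested.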

Job Information

The squeue command is used to show information about jobs currently in the Slurm queue.

user $squeue
   JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
    2151      node concorde     cpan  R 2-06:33:11     23 node[001-011,016,018,020,032-036,038-041]
    2162      node concorde     cpan  R 1-14:58:40     23 node[011-015,043-060]
    2208      node concorde     cpan  R    4:51:02     24 node[017,019,021-026,061-076]
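
The queue listing can be condensed in the same way, for example to count the running jobs and nodes held by each user. The sketch below assumes the eight-column layout shown above and works on a copy of that sample output. On the cluster, squeue would be piped into awk instead.

```shell
#!/bin/sh
# Sample squeue output copied from this page; on the cluster, pipe squeue itself.
sample=$(cat <<'EOF'
   JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
    2151      node concorde     cpan  R 2-06:33:11     23 node[001-011,016,018,020,032-036,038-041]
    2162      node concorde     cpan  R 1-14:58:40     23 node[011-015,043-060]
    2208      node concorde     cpan  R    4:51:02     24 node[017,019,021-026,061-076]
EOF
)

# For running jobs (ST column is R), count jobs and sum the NODES column per USER.
usage=$(printf '%s\n' "$sample" |
    awk 'NR > 1 && $5 == "R" { jobs[$4]++; nodes[$4] += $7 }
         END { for (u in jobs) print u, jobs[u], nodes[u] }' |
    sort)

echo "$usage"
```

Each output line gives a user name, the number of running jobs, and the total of the NODES column for those jobs.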

Submit a Job

To submit a job, the sbatch command is used. The script made previously can be sent to the cluster for running:

user $sbatch myscript.sh
Submitted batch job 2210

When the job finishes, its output is placed by default in a file named slurm-####.out, where '####' is the batch job number reported by sbatch when the script was submitted.

user $cat slurm-2210.out
node028.cluster
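
Because the output file name is derived from the job number, a script can work it out from the line that sbatch prints. The sketch below uses the message from the example above as a fixed string so it can be run anywhere. The variable names are illustrative, and on the cluster the string would come from the sbatch command itself.

```shell
#!/bin/sh
# Line printed by sbatch, copied from the example on this page.
# On the cluster: submit_line=$(sbatch myscript.sh)
submit_line='Submitted batch job 2210'

# The job number is the last space-separated word of that line.
jobid=${submit_line##* }

# Default output file name for a batch job submitted from the current directory.
outfile="slurm-${jobid}.out"

echo "$outfile"
```

The file appears only once the job has started writing output, so a script that reads it should first check that it exists.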
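
Requesting two tasks (-n 2) on two nodes (-N 2) does not change the output, because the batch script itself still runs only once, on the first allocated node:

user $sbatch -n 2 -N 2 myscript.sh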
Submitted batch job 2211
user $cat slurm-2211.out
node028.cluster

To run the command once for every task, launch it with srun inside the script:

FILE myscript.sh: Script using srun
#!/bin/sh

# This is my first script

srun hostname
user $sbatch -n 2 -N 2 myscript.sh
Submitted batch job 2213
user $cat slurm-2213.out
node028.cluster
node037.cluster
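
Job options can also be set inside the script with #SBATCH directives. A line beginning with ##SBATCH is treated as an ordinary comment, so the mail-type option below is disabled. With two nodes and two tasks per node, srun starts four tasks:

FILE myscript.sh: Script with #SBATCH directives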
#!/bin/sh

#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --mail-user=myemail@example.com
##SBATCH --mail-type=END
#SBATCH --time=0-00:10:00
#SBATCH --job-name=hostname

# This is my first script

srun hostname
user $sbatch myscript.sh
Submitted batch job 2214
user $cat slurm-2214.out
node028.cluster
node028.cluster
node037.cluster
node037.cluster
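
Inside a running job, Slurm exports details of the allocation as environment variables such as SLURM_JOB_ID, SLURM_JOB_NODELIST, and SLURM_NTASKS. The following sketch prints them. The fallback values are an assumption added here so that the script also runs outside a job, where those variables are unset.

```shell
#!/bin/sh
# Slurm sets SLURM_* variables inside a job; outside a job they are unset,
# so each one falls back to a placeholder value here.
jobid=${SLURM_JOB_ID:-none}
nodelist=${SLURM_JOB_NODELIST:-$(uname -n)}
ntasks=${SLURM_NTASKS:-1}

summary="job=$jobid nodes=$nodelist tasks=$ntasks"
echo "$summary"
```

Placing these lines before srun hostname in myscript.sh would record the allocation at the top of the slurm-####.out file.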


Modules

The module avail command lists the software modules that can be loaded:

user $module avail
------------------------ /usr/share/Modules/modulefiles ------------------------
dot         module-git  module-info modules     null        use.own

------------------------------- /act/modulefiles -------------------------------
impi               mpich/intel        openmpi-1.6/gcc    openmpi-1.8/intel
intel              mvapich2-2.2/gcc   openmpi-1.6/intel  openmpi-2.0/gcc
mpich/gcc          mvapich2-2.2/intel openmpi-1.8/gcc    openmpi-2.0/intel

----------------------------- /modules/modulefiles -----------------------------
atlas/3.10.3     hdf5/1.8.19      mkl/2017.0.3     python/3.6.2
cmake/3.9.1      lammps/20170811  molpro/2012.1.52 qe/6.0
ffmpeg/3.3.3     lapack/3.7.1     netcdf/4.4.1.1   qe/6.1
fftw/3.3.6       libxc/4.0.1      python/3.5.4     scalapack/2.0.2

A program provided by a module cannot be run until that module is loaded:

user $python3.5 --version
bash: python3.5: command not found...
user $module load python/3.5.4
user $python3.5 --version
Python 3.5.4

The module help command prints a short description of a module:

user $module help python/3.5.4
----------- Module Specific Help for 'python/3.5.4' ---------------

Description - Python is a widely used high-level programming language for general-purpose programming.
Docs        - https://www.python.org/

When the program is no longer needed, the module can be unloaded:

user $module unload python/3.5.4
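
A job script can check that a program is actually on the PATH before using it, which catches a forgotten module load early. This is a minimal sketch in POSIX sh. The helper name require_cmd is hypothetical and not part of the module system.

```shell
#!/bin/sh
# require_cmd NAME: succeed if NAME is found on the PATH,
# otherwise print a hint to standard error and fail.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "$1 not found; a 'module load' may be needed first" >&2
    return 1
}

# Only call python3.5 when it is available.
if require_cmd python3.5; then
    python3.5 --version
else
    echo "skipping the Python step"
fi
```

In a batch script, the module load line would normally come before such a check, since a batch job does not necessarily inherit the modules loaded in the login shell.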

Example Job using LAMMPS

Using the Lustre Storage

Array Jobs

Compiling MPI Programs

Compiling CUDA Programs

Compiling a Package

Download and Extract Package

Interactive Job

Configure Package

Build and Install Package

Python Modules

pip

virtualenv