Submitting jobs and compiling programs on Magnolia cluster: Difference between revisions

From HPC Wiki
(Created page with "{| |'''When:''' |''November 3rd, 2017. 1.00 pm — 3 pm'' |- |'''Where:''' |''[https://map.usm.edu/campus_map.php?id=26221 TEC 202]'' |- |colspan="2" style="text-align:cen...")
 
Questions? Email [mailto:Brian.Olson@usm.edu?Subject=Linux%20Workshop%20Part%20I Brian Olson]


= Workshop Notes =
The following is a summary of the topics covered in the workshop.
{{WIP|author=bgo}}
== Check hostname ==
Connecting to {{C|magnolia.usm.edu}} gives access to one of two login nodes, {{C|magnolia01}} or {{C|magnolia02}}, chosen at random each time a connection is requested. The {{C|hostname}} command shows the host name of the current system:
{{Cmd|hostname|output=<pre>magnolia01</pre>}}
== Create a Script/Program ==
To create a very simple script for {{C|bash}}, open a text editor. To use the {{C|nano}} text editor, type:
{{Cmd|nano myscript.sh}}
Type the following, and save it.
{{FileBox|title=First script|filename=myscript.sh|lang=bash|1=#!/bin/sh
hostname}}
=== Mark as Executable, and Run ===
The script has been created, but it is currently just a plain text file. To mark it as executable, use the {{C|chmod}} command, then run it:
{{Cmd|chmod a+x myscript.sh|./myscript.sh|output=<pre>magnolia01</pre>}}
=== Comments ===
Comments can be placed in Bash scripts by preceding them with a '#'.
{{FileBox|title=First script with a comment|filename=myscript.sh|lang=bash|1=#!/bin/sh
# This is my first script, and a comment.
hostname}}
The script has previously been marked as executable, so it can be run directly.
{{Cmd|./myscript.sh|output=<pre>magnolia01</pre>}}
The output is the same as before, since comments are ignored by Bash.
== Slurm ==
The Slurm Workload Manager is used to submit jobs to the {{C|Magnolia}} cluster and to monitor them.
=== Partitions ===
The nodes of the {{C|Magnolia}} cluster are separated into partitions. These partitions can be viewed with the {{C|sinfo}} command.
{{Cmd|sinfo|output=<pre>PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
node*        up   infinite     69  alloc node[001-026,032-036,038-041,043-076]
node*        up   infinite      7   idle node[027-031,037,042]
gpu          up   infinite      2   idle gpu[001-002]
himem        up   infinite      4   idle himem[001-004]
phi          up   infinite      4   idle phi[001-004]</pre>}}
=== Job Information ===
The {{C|squeue}} command is used to show information about jobs currently in the Slurm queue.
{{Cmd|squeue|output=<pre>   JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
    2151      node concorde     cpan  R 2-06:33:11     23 node[001-011,016,018,020,032-036,038-041]
    2162      node concorde     cpan  R 1-14:58:40     23 node[011-015,043-060]
    2208      node concorde     cpan  R    4:51:02     24 node[017,019,021-026,061-076]</pre>}}
=== Submit a Job ===
To submit a job, the {{C|sbatch}} command is used. The script made previously can be sent to the cluster for running:
{{Cmd|sbatch myscript.sh|output=<pre>Submitted batch job 2210</pre>}}
When the job is finished, its output is placed by default in a {{Path|slurm-####.out}} file, where '####' is the batch job number shown after the script was submitted with {{C|sbatch}}.
{{Cmd|cat slurm-2210.out|output=<pre>node028.cluster</pre>}}
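Requesting more resources does not change how often the script runs. With {{C|-n 2 -N 2}} (two tasks on two nodes), the batch script itself still runs once, on the first allocated node:
{{Cmd|sbatch -n 2 -N 2 myscript.sh|output=<pre>Submitted batch job 2211</pre>}}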
{{Cmd|cat slurm-2211.out|output=<pre>node028.cluster</pre>}}
To run a command once per task, start it with {{C|srun}} inside the script:
{{FileBox|title=Script using srun|filename=myscript.sh|lang=bash|1=#!/bin/sh
# This is my first script
srun hostname}}
{{Cmd|sbatch -n 2 -N 2 myscript.sh|output=<pre>Submitted batch job 2213</pre>}}
{{Cmd|cat slurm-2213.out|output=<pre>node028.cluster
node037.cluster</pre>}}
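The same options can be set inside the script with {{C|#SBATCH}} lines, so they need not be repeated on the command line. A line beginning with {{C|##SBATCH}} is treated as an ordinary comment, which is a convenient way to disable an option:
{{FileBox|title=Script with SBATCH directives|filename=myscript.sh|lang=bash|1=#!/bin/sh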
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --mail-user=myemail@example.com
##SBATCH --mail-type=END
#SBATCH --time=0-00:10:00
#SBATCH --job-name=hostname
# This is my first script
srun hostname}}
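Two nodes with two tasks each give four tasks in total, so {{C|srun}} runs {{C|hostname}} four times:
{{Cmd|sbatch myscript.sh|output=<pre>Submitted batch job 2214</pre>}}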
{{Cmd|cat slurm-2214.out|output=<pre>node028.cluster
node028.cluster
node037.cluster
node037.cluster</pre>}}
== Modules ==
Software on the cluster is made available through modules. To list the modules that can be loaded:
{{Cmd|module avail|output=<pre>
------------------------ /usr/share/Modules/modulefiles ------------------------
dot         module-git  module-info modules     null        use.own
------------------------------- /act/modulefiles -------------------------------
impi               mpich/intel        openmpi-1.6/gcc    openmpi-1.8/intel
intel              mvapich2-2.2/gcc   openmpi-1.6/intel  openmpi-2.0/gcc
mpich/gcc          mvapich2-2.2/intel openmpi-1.8/gcc    openmpi-2.0/intel
----------------------------- /modules/modulefiles -----------------------------
atlas/3.10.3     hdf5/1.8.19      mkl/2017.0.3     python/3.6.2
cmake/3.9.1      lammps/20170811  molpro/2012.1.52 qe/6.0
ffmpeg/3.3.3     lapack/3.7.1     netcdf/4.4.1.1   qe/6.1
fftw/3.3.6       libxc/4.0.1      python/3.5.4     scalapack/2.0.2
</pre>}}
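Before the module is loaded, its programs are not on the search path:
{{Cmd|python3.5 --version|output=<pre>bash: python3.5: command not found...</pre>}}
Loading the module makes them available:
{{Cmd|module load python/3.5.4}}
{{Cmd|python3.5 --version|output=<pre>Python 3.5.4</pre>}}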
{{Cmd|module help python/3.5.4|output=<pre>----------- Module Specific Help for 'python/3.5.4' ---------------
Description - Python is a widely used high-level programming language for general-purpose programming.
Docs        - https://www.python.org/
</pre>}}
{{Cmd|module unload python/3.5.4}}




[[Category:Workshops]]

Revision as of 14:04, 10 November 2017

When: November 3rd, 2017, 1:00 pm to 3:00 pm
Where: TEC 202

Description

Topics in this workshop:

  1. Introduction to Magnolia cluster
  2. Using Modules to select programs
  3. Submitting/running jobs using SLURM scheduler
  4. Compiling MPI programs
  5. Compiling CUDA programs


Questions? Email Brian Olson

Workshop Notes

The following is a summary of the topics covered in the workshop.

Warning
This page is a work in progress by bgo. Treat its contents with caution.

Check hostname

Connecting to magnolia.usm.edu gives access to one of two login nodes, magnolia01 or magnolia02, chosen at random each time a connection is requested. The hostname command shows the host name of the current system:

user $ hostname
magnolia01
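
Because the login node changes between connections, a script may want to know where it is running. The following is a minimal sketch, assuming the login nodes report exactly the names magnolia01 and magnolia02 as in the output above. The function is_login_node is a hypothetical helper, not a command provided by the cluster.

```shell
#!/bin/sh
# is_login_node: hypothetical helper, not part of the cluster software.
# Succeeds when the given host name is one of the two login nodes
# named in this page (assumed to be reported without a domain suffix).
is_login_node() {
    case "$1" in
        magnolia01|magnolia02) return 0 ;;
        *) return 1 ;;
    esac
}

# Ask the system for its name; fall back to uname if hostname is missing.
current_host=$(hostname 2>/dev/null || uname -n)
# Strip any domain suffix, e.g. "node028.cluster" -> "node028".
current_host=${current_host%%.*}

if is_login_node "$current_host"; then
    echo "Running on a login node: $current_host"
else
    echo "Not running on a login node: $current_host"
fi
```

Matching on the short name keeps the check independent of whatever domain suffix a node appends.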

Create a Script/Program

To create a very simple script for bash, open a text editor. To use the nano text editor, type:

user $ nano myscript.sh

Type the following, and save it.

FILE myscript.sh: First script
#!/bin/sh

hostname


Mark as Executable, and Run

The script has been created, but it is currently just a plain text file. To mark it as executable, use the chmod command, then run it:

user $ chmod a+x myscript.sh
user $ ./myscript.sh
magnolia01
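
The effect of chmod can be checked without running the script. The sketch below creates a copy of the script in a temporary directory (an arbitrary location chosen for illustration) and tests the execute permission before and after chmod a+x. The helper is_executable is hypothetical, and the sketch assumes a typical umask, under which new files are not executable.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing in the home directory is touched.
workdir=$(mktemp -d)
script="$workdir/myscript.sh"

# Same two-line script as above.
printf '#!/bin/sh\nhostname\n' > "$script"

# is_executable: hypothetical helper that prints yes or no
# depending on whether the file has execute permission.
is_executable() {
    if [ -x "$1" ]; then echo yes; else echo no; fi
}

before=$(is_executable "$script")
chmod a+x "$script"            # grant execute permission to all users
after=$(is_executable "$script")

echo "executable before chmod: $before"
echo "executable after chmod:  $after"

rm -rf "$workdir"
```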

Comments

Comments can be placed in Bash scripts by preceding them with a '#'.

FILE myscript.sh: First script with a comment
#!/bin/sh

# This is my first script, and a comment.

hostname

The script has previously been marked as executable, so it can be run directly.

user $ ./myscript.sh
magnolia01

The output is the same as before, since comments are ignored by Bash.

Slurm

The Slurm Workload Manager is used to submit jobs to the Magnolia cluster and to monitor them.

Partitions

The nodes of the Magnolia cluster are separated into partitions. These partitions can be viewed with the sinfo command.

user $ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
node*        up   infinite     69  alloc node[001-026,032-036,038-041,043-076]
node*        up   infinite      7   idle node[027-031,037,042]
gpu          up   infinite      2   idle gpu[001-002]
himem        up   infinite      4   idle himem[001-004]
phi          up   infinite      4   idle phi[001-004]
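
The sinfo table is plain text in columns, so it can be summarized with standard tools. The sketch below is an illustration only: it embeds the sample output shown above instead of calling sinfo, and count_nodes_in_state is a hypothetical helper. It assumes the default column order, with NODES in the fourth column and STATE in the fifth.

```shell
#!/bin/sh
# Sample copied from the sinfo output above. On the cluster, the real
# command could be piped into the helper instead.
sinfo_sample='PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
node*        up   infinite     69  alloc node[001-026,032-036,038-041,043-076]
node*        up   infinite      7   idle node[027-031,037,042]
gpu          up   infinite      2   idle gpu[001-002]
himem        up   infinite      4   idle himem[001-004]
phi          up   infinite      4   idle phi[001-004]'

# count_nodes_in_state: hypothetical helper. Reads sinfo-style text on
# standard input, skips the header, and sums the NODES column (field 4)
# for rows whose STATE column (field 5) equals the requested state.
count_nodes_in_state() {
    awk -v state="$1" 'NR > 1 && $5 == state { total += $4 } END { print total + 0 }'
}

idle_total=$(printf '%s\n' "$sinfo_sample" | count_nodes_in_state idle)
alloc_total=$(printf '%s\n' "$sinfo_sample" | count_nodes_in_state alloc)

echo "idle nodes:      $idle_total"
echo "allocated nodes: $alloc_total"
```

For the sample above this reports 17 idle nodes (7 + 2 + 4 + 4) and 69 allocated nodes.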

Job Information

The squeue command is used to show information about jobs currently in the Slurm queue.

user $ squeue
   JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
    2151      node concorde     cpan  R 2-06:33:11     23 node[001-011,016,018,020,032-036,038-041]
    2162      node concorde     cpan  R 1-14:58:40     23 node[011-015,043-060]
    2208      node concorde     cpan  R    4:51:02     24 node[017,019,021-026,061-076]
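
The queue listing can be summarized in the same way. This sketch embeds the sample squeue output shown above, so it runs without Slurm. The helper nodes_for_user is hypothetical and assumes the default column order (USER in the fourth column, ST in the fifth, NODES in the seventh).

```shell
#!/bin/sh
# Sample copied from the squeue output above.
squeue_sample='   JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
    2151      node concorde     cpan  R 2-06:33:11     23 node[001-011,016,018,020,032-036,038-041]
    2162      node concorde     cpan  R 1-14:58:40     23 node[011-015,043-060]
    2208      node concorde     cpan  R    4:51:02     24 node[017,019,021-026,061-076]'

# nodes_for_user: hypothetical helper. Reads squeue-style text on standard
# input and sums the NODES column for running (ST == "R") jobs of one user.
nodes_for_user() {
    awk -v user="$1" 'NR > 1 && $4 == user && $5 == "R" { total += $7 } END { print total + 0 }'
}

cpan_nodes=$(printf '%s\n' "$squeue_sample" | nodes_for_user cpan)
echo "nodes requested by running jobs of cpan: $cpan_nodes"
```

The result is the sum of the NODES column (23 + 23 + 24 = 70), which counts a node once for every job that lists it.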

Submit a Job

To submit a job, the sbatch command is used. The script made previously can be sent to the cluster for running:

user $ sbatch myscript.sh
Submitted batch job 2210

When the job is finished, its output is placed by default in a slurm-####.out file, where '####' is the batch job number shown after the script was submitted with sbatch.

user $ cat slurm-2210.out
node028.cluster
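
The link between the confirmation message and the output file name can be sketched in a few lines of shell. The example works on the literal message shown above instead of calling sbatch, and both helper functions are hypothetical. It assumes the default output file name pattern described in this section.

```shell
#!/bin/sh
# job_id_from_sbatch: hypothetical helper. Takes sbatch's confirmation
# line and prints its last word, which is the job number.
job_id_from_sbatch() {
    # "Submitted batch job 2210" -> "2210"
    printf '%s\n' "${1##* }"
}

# output_file_for_job: hypothetical helper. Builds the default output
# file name for a given job number.
output_file_for_job() {
    printf 'slurm-%s.out\n' "$1"
}

message="Submitted batch job 2210"   # as printed by sbatch above
job_id=$(job_id_from_sbatch "$message")
output_file=$(output_file_for_job "$job_id")

echo "job id:      $job_id"
echo "output file: $output_file"
```

On the cluster, the message would come from the sbatch command itself, for example by capturing its output in a variable.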
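
Requesting more resources does not change how often the script runs. With -n 2 -N 2 (two tasks on two nodes), the batch script itself still runs once, on the first allocated node:

user $ sbatch -n 2 -N 2 myscript.sh
Submitted batch job 2211
user $ cat slurm-2211.out
node028.cluster

To run a command once per task, start it with srun inside the script:

FILE myscript.sh: Script using srun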
#!/bin/sh

# This is my first script

srun hostname
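user $ sbatch -n 2 -N 2 myscript.sh
Submitted batch job 2213
user $ cat slurm-2213.out
node028.cluster
node037.cluster

The same options can be set inside the script with #SBATCH lines, so they need not be repeated on the command line. A line beginning with ##SBATCH is treated as an ordinary comment, which is a convenient way to disable an option:

FILE myscript.sh: Script with SBATCH directives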
#!/bin/sh

#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --mail-user=myemail@example.com
##SBATCH --mail-type=END
#SBATCH --time=0-00:10:00
#SBATCH --job-name=hostname

# This is my first script

srun hostname

Two nodes with two tasks each give four tasks in total, so srun runs hostname four times:

user $ sbatch myscript.sh
Submitted batch job 2214
user $ cat slurm-2214.out
node028.cluster
node028.cluster
node037.cluster
node037.cluster
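
To see which options in such a script are active, the #SBATCH lines can be listed with grep. The sketch below is an illustration, not a Slurm tool: active_directives is a hypothetical helper that keeps only lines beginning with exactly "#SBATCH ", so the ##SBATCH line is skipped as a comment. It then multiplies the node count by the tasks per node to show why four host names appear in the output above.

```shell
#!/bin/sh
# Write a copy of the script from this section to a temporary file.
script=$(mktemp)
cat > "$script" <<'EOF'
#!/bin/sh
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --mail-user=myemail@example.com
##SBATCH --mail-type=END
#SBATCH --time=0-00:10:00
#SBATCH --job-name=hostname
# This is my first script
srun hostname
EOF

# active_directives: hypothetical helper. Prints the options from lines
# that start with exactly "#SBATCH "; "##SBATCH" lines do not match.
active_directives() {
    grep '^#SBATCH ' "$1" | sed 's/^#SBATCH //'
}

active_directives "$script"

# Pull out the two numbers that determine how many tasks srun starts.
nodes=$(active_directives "$script" | sed -n 's/^--nodes=//p')
per_node=$(active_directives "$script" | sed -n 's/^--ntasks-per-node=//p')
total_tasks=$((nodes * per_node))
directive_count=$(active_directives "$script" | grep -c .)

echo "total tasks: $total_tasks"
rm -f "$script"
```

The listing contains five options, and the disabled --mail-type=END is not among them.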


Modules

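Software on the cluster is made available through modules. To list the modules that can be loaded:

user $ module avail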
------------------------ /usr/share/Modules/modulefiles ------------------------
dot         module-git  module-info modules     null        use.own

------------------------------- /act/modulefiles -------------------------------
impi               mpich/intel        openmpi-1.6/gcc    openmpi-1.8/intel
intel              mvapich2-2.2/gcc   openmpi-1.6/intel  openmpi-2.0/gcc
mpich/gcc          mvapich2-2.2/intel openmpi-1.8/gcc    openmpi-2.0/intel

----------------------------- /modules/modulefiles -----------------------------
atlas/3.10.3     hdf5/1.8.19      mkl/2017.0.3     python/3.6.2
cmake/3.9.1      lammps/20170811  molpro/2012.1.52 qe/6.0
ffmpeg/3.3.3     lapack/3.7.1     netcdf/4.4.1.1   qe/6.1
fftw/3.3.6       libxc/4.0.1      python/3.5.4     scalapack/2.0.2
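
Before the module is loaded, its programs are not on the search path:

user $ python3.5 --version
bash: python3.5: command not found...

Loading the module makes them available:

user $ module load python/3.5.4
user $ python3.5 --version
Python 3.5.4
user $ module help python/3.5.4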
----------- Module Specific Help for 'python/3.5.4' ---------------

Description - Python is a widely used high-level programming language for general-purpose programming.
Docs        - https://www.python.org/
user $ module unload python/3.5.4
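
The "command not found" message before loading, and the working command after, are consistent with the module adjusting environment variables such as PATH. The sketch below imitates that effect by hand and does not use the module command at all. The program demo-tool-example and the temporary directory are made up for illustration, and what a given module actually changes on Magnolia is an assumption here.

```shell
#!/bin/sh
# Create a made-up program in a temporary directory (hypothetical stand-in
# for the directory a module would add to the search path).
tooldir=$(mktemp -d)
printf '#!/bin/sh\necho "demo-tool-example 1.0"\n' > "$tooldir/demo-tool-example"
chmod a+x "$tooldir/demo-tool-example"

# is_on_path: hypothetical helper that prints yes or no depending on
# whether the shell can find the named command.
is_on_path() {
    if command -v "$1" >/dev/null 2>&1; then echo yes; else echo no; fi
}

before=$(is_on_path demo-tool-example)

saved_path=$PATH
PATH="$tooldir:$PATH"          # imitates the effect of loading a module
loaded=$(is_on_path demo-tool-example)

PATH=$saved_path               # imitates the effect of unloading it
after=$(is_on_path demo-tool-example)

echo "found before load:  $before"
echo "found while loaded: $loaded"
echo "found after unload: $after"

rm -rf "$tooldir"
```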