Update 202405
This update moves the cluster to Rocky Linux 9. CentOS has been discontinued, and CentOS 7 reaches end of life (EOL) on June 30, 2024.
Roadmap
Topic | Completed? | Description |
---|---|---|
Install new compute node hardware - Part I | Yes | Vendor Installed |
Install new compute node hardware - Part II | No | To be shipped |
Install operating system on new compute nodes | Yes | Using Rocky Linux 9 because CentOS 7 reaches end of life (EOL) on June 30, 2024. Integrating the new Rocky Linux compute nodes with the much older CentOS nodes took considerable time, but it is now done. |
Update backup and cloning system | Yes | The backup and cloning software previously used on the cluster does not work with Rocky Linux, so a package that works with both Rocky Linux and CentOS had to be designed. |
Update Slurm | Yes | Updated to a version that works on both Rocky Linux 9 and CentOS 7. This required only a very short downtime. |
Add new compute nodes to Slurm partitions | Yes | The first two partitions are currently available only to those who purchased the nodes. All partitions will be available to all users once preemption (see below) is enabled. |
Update module setup | No | WIP |
Update modules for Rocky Linux 9 | No | WIP |
Install CUDA drivers for new GPU compute nodes | Yes | |
Preemption partitions | No | TODO |
Update CentOS 7 compute nodes to Rocky Linux 9 | No | TODO |
Update login nodes to Rocky Linux 9 | No | TODO |
Slurm Partitions
user $ coresavail
Number of nodes in partition with N available cores and RAM.
Nodes | Partition | Available Cores | Available RAM (MiB) |
---|---|---|---|
6 | kisame | 64 | 515134 |
3 | suliaoma | 32 | 257094 |
4 | himem | 32 | 2063430 |
62 | node | 20 | 128000 |
4 | himem | 20 | 512000 |
2 | gpu | 20 | 128000 |
17 | lomem | 16 | 64000 |
The himem partition has 4 compute nodes with 32 cores available and just under 2 TiB of RAM free (2063430 MiB). Another 4 himem compute nodes are available, but these have 20 cores and just under 0.5 TiB of RAM free (512000 MiB).
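The RAM column in the coresavail output is in MiB, which is awkward to compare by eye. The following is a minimal sketch of the conversion to GiB and TiB using awk; the function name `mib_to_units` is made up for illustration and is not a tool installed on the cluster.

```shell
#!/bin/bash
# Convert a RAM figure in MiB (as printed by coresavail) to GiB and TiB.
# 1 GiB = 1024 MiB, 1 TiB = 1024 * 1024 MiB.
# mib_to_units is a hypothetical helper, not part of the cluster tooling.
mib_to_units() {
  awk -v mib="$1" 'BEGIN {
    printf "%d MiB = %.1f GiB = %.2f TiB\n", mib, mib / 1024, mib / (1024 * 1024)
  }'
}

mib_to_units 2063430   # himem nodes with 32 cores
mib_to_units 512000    # himem nodes with 20 cores
```

With the two himem values from the table, this prints 2015.1 GiB (1.97 TiB) and 500.0 GiB (0.49 TiB) respectively.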
Modules
The modules are being moved to a new layout to make future upgrades easier.
New Layout
user $ module load newsetup
user $ module avail
------------------------ /usr/share/Modules/modulefiles ------------------------
dot  module-git  module-info  modules  null  use.own
--------------------------- /modules/node/common/MPI ---------------------------
impi/2017.4.196
------------------------ /modules/node/common/Programs -------------------------
bamtools/2.5.1         ffmpeg/4.4
bedtools/2.31.0        gaussian/16B.01-avx2
bowtie/1.2.2           gaussian/16C.01-avx2
bowtie2/2.3.4.3        gaussian/16C.01-LINDA-avx2
bowtie2/2.5.1          nco/4.7.6
cmake/3.15.3           nco/4.9.3
cmake/3.24.2(default)  openssl/1.0.2k
cmake/3.9.1            openssl/3.0.9
fastx-toolkit/0.0.14   salmon/0.12.0
ffmpeg/3.3.3           salmon/1.1.0
------------------------ /modules/node/common/Libraries ------------------------
isl/0.22               mpfr/4.0.1      trimmomatic/0.39
lapack/3.7.1           ncurses/5.9     x264/20171213
libjpeg-turbo/2.1.5.1  newsetup/0.1    zstd/1.5.5
libpng/1.5.30          openssl/1.0.2k
mkl/2017.0.3           openssl/3.0.9
------------------------ /modules/node/common/Languages ------------------------
cuda-toolkit/10.1.243(default)  gcc/6.4.0
cuda-toolkit/11.6.2             gcc/7.3.0
cuda-toolkit/8.0.61             gcc/8.3.0
gcc/11.4.0                      gcc/9.2.0
gcc/12.3.0                      intel/2017.4.196
gcc/13.2.0
----------------------------- /modules/node/Types ------------------------------
centos-7-63/0.1  centos-7-79/0.1  rocky-9.3-143/0.1  thisnode/0.1
These are the common modules, which work on both Rocky Linux 9 and CentOS 7.
Rocky Linux Modules
user $ module load rocky-9.3-143
user $ module avail
------------------------ /usr/share/Modules/modulefiles ------------------------
dot  module-git  module-info  modules  null  use.own
--------------------------- /modules/node/common/MPI ---------------------------
impi/2017.4.196
------------------------ /modules/node/common/Programs -------------------------
bamtools/2.5.1         ffmpeg/4.4
bedtools/2.31.0        gaussian/16B.01-avx2
bowtie/1.2.2           gaussian/16C.01-avx2
bowtie2/2.3.4.3        gaussian/16C.01-LINDA-avx2
bowtie2/2.5.1          nco/4.7.6
cmake/3.15.3           nco/4.9.3
cmake/3.24.2(default)  openssl/1.0.2k
cmake/3.9.1            openssl/3.0.9
fastx-toolkit/0.0.14   salmon/0.12.0
ffmpeg/3.3.3           salmon/1.1.0
------------------------ /modules/node/common/Libraries ------------------------
isl/0.22               mpfr/4.0.1      trimmomatic/0.39
lapack/3.7.1           ncurses/5.9     x264/20171213
libjpeg-turbo/2.1.5.1  newsetup/0.1    zstd/1.5.5
libpng/1.5.30          openssl/1.0.2k
mkl/2017.0.3           openssl/3.0.9
------------------------ /modules/node/common/Languages ------------------------
cuda-toolkit/10.1.243(default)  gcc/6.4.0
cuda-toolkit/11.6.2             gcc/7.3.0
cuda-toolkit/8.0.61             gcc/8.3.0
gcc/11.4.0                      gcc/9.2.0
gcc/12.3.0                      intel/2017.4.196
gcc/13.2.0
----------------------------- /modules/node/Types ------------------------------
centos-7-63/0.1  centos-7-79/0.1  rocky-9.3-143/0.1  thisnode/0.1
----------------------- /modules/node/rocky-9.3-143/MPI ------------------------
openmpi-gcc/4.1.5
--------------------- /modules/node/rocky-9.3-143/Programs ---------------------
lammps/20220623
-------------------- /modules/node/rocky-9.3-143/Libraries ---------------------
hdf5/1.12.1  hdf5/1.8.19  netcdf/4.4.1.1  netcdf/4.8.1
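The same load order can be used inside a batch job. The sketch below prints a Slurm batch script that loads `newsetup` first, then a node-type module, then an application module. The helper name `make_job_script` is hypothetical, and several things here are assumptions rather than documented cluster behaviour: that `newsetup` must be loaded before the node-type module, that the `module` command is available inside batch jobs, and that the partition passed in contains Rocky Linux 9 nodes.

```shell
#!/bin/bash
# Sketch: print a Slurm batch script that switches to the new module layout
# before loading anything else. make_job_script is a hypothetical helper;
# the partition and module names passed below are illustrative assumptions.
make_job_script() {
  local partition="$1"    # partition to submit to (assumed to have Rocky Linux 9 nodes)
  local type_module="$2"  # node-type module, e.g. rocky-9.3-143
  local app_module="$3"   # module to load once the node type is selected
  cat <<EOF
#!/bin/bash
#SBATCH --partition=${partition}
#SBATCH --ntasks=1

module load newsetup
module load ${type_module}
module load ${app_module}
module list
EOF
}

make_job_script kisame rocky-9.3-143 openmpi-gcc/4.1.5
```

Redirecting the output to a file (for example `make_job_script kisame rocky-9.3-143 openmpi-gcc/4.1.5 > job.sh`) gives a script that could then be submitted with `sbatch job.sh`. The final `module list` line is there only so the job log records which modules were actually loaded.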