Quick start guide

"Nothing is better than the wordless teaching and the advantage of non-action"

 

Basics

The cluster consists of a front-end and a number of computational nodes. The user submits a job from the front-end, and the queueing system takes care of reserving the requested resources and running the job when the time is optimal. Advanced job scheduling ensures fair access for everybody and optimizes overall system efficiency.

Please note that the cluster is built almost entirely from grant funds. Scientists at CAMK agreed to contribute to the shared computational cluster instead of individually managing hardware purchased from grants. Please consider such a contribution when applying for grants.

 

Access

To use the cluster one has to 'ssh' to its frontend node, chuck (see the login example below). It is accessible to every user from the internal network at Batycka; there are no separate accounts on the cluster. However, to use the cluster you have to contact cluster@camk.edu.pl in order to

  • be assigned to a proper accounting group (camk employee or guest)
  • set up a /work/chuck/<user> directory and, optionally, a quota.

(please indicate whether you are an employee, guest, student or member of some group, and how much space you need; by default employees get 1 TB).
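
A minimal login example, assuming the frontend is reachable simply as `chuck` from the internal network (replace <username> with your CAMK login):

ssh <username>@chuck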

 

The frontend can be used to:

  • submit and monitor jobs,
  • develop code (compilation and short test runs).

The frontend is equipped with 20 CPU cores and 128 GB of memory. Please read the messages displayed after login - they contain current, important announcements. Do not run long or memory-hungry codes on the frontend. Certain limits are set on the frontend, like 4 GB RAM per user and 3 h of CPU time per process (see 'ulimit -a'), and the system will kill processes which violate them. Please remember that this machine is shared by many users - be kind to others.

 

Storage

There is a special high performance cluster filesystem (BeeGFS) attached to the cluster:

  • /work/chuck - 837 TB volume (a previous-day mirror backup exists)

It should be used for all data-intensive activities on the cluster because it is much faster than the other 'work' filesystems at CAMK. It is also visible to all workstations (at lower throughput).

Please note that performance has priority over data safety on /work/chuck! Use it to store simulation/analysis results, but not as the only copy of codes, papers, etc.

Please take into account:

  • /work/chuck is optimized for large files (ideally every single write should be larger than a few MB)
  • please avoid storing a large number of small files (say, more than 1 million)
  • there is plenty of space, you can always ask to increase your quota but please try to reduce the number of files:
    • delete unnecessary files
    • older projects can be tar-ed (no need to compress them); see the sketch after this list
    • data which are not needed on the cluster anymore should be moved, e.g. to the CAMK central work filesystem (please ask adm to create /work/<username>)
  • when a user account at CAMK is closed, the data in /work/chuck/<username> will be deleted permanently
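
A minimal sketch of archiving an old project into a single, uncompressed tar file (the directory name is illustrative):

# pack the directory into one file, check that the archive is readable, then remove the original
tar cf old_project.tar old_project/
tar tf old_project.tar > /dev/null && rm -rf old_project/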

 

Software

Aside from the standard Linux packages, additional software is available in a few ways:

  • Software collections contain newer gcc toolsets (the system provides gcc/gfortran 11, toolsets provide newer versions)
    • to see the list of toolsets type `scl list-collections`, and see `man scl` for other options
    • to start e.g. a new bash shell with gcc 13, type `scl enable gcc-toolset-13 bash`
  • Environment modules provide an easy way to set up the environment for various software like the Intel compiler, MPI and other libraries (see the short example after this list)
    • see `man module` for details
    • `module av` lists available modules
    • `module add <module name>` loads a specific module
    • `module purge` removes all modules from the current shell
  • Other, locally compiled software/libraries will be added to the /opt directory (but usually there will be a corresponding module provided)
  • Python - please read this page
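
A short, illustrative module session (the module name 'intel' is hypothetical; check `module av` for the actual names):

module av          # list available modules
module add intel   # load a module (name is illustrative)
module list        # show currently loaded modules
module purge       # unload everything again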

 

Important remarks:

  • for MPI codes mvapich2 is strongly recommended since it can exploit our cluster's fast InfiniBand network
  • if you compile code in an environment with some modules or software collections enabled, please remember to enable the same environment in the job script (see the sketch below)
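
A hypothetical MPI workflow illustrating this point (the module name and file names are assumptions, not necessarily the exact ones on chuck):

# on the frontend: load the MPI module and compile
module add mvapich2
mpicc -O2 my_mpi_code.c -o my_mpi_code

# in the job script: load the same module before launching the program
module add mvapich2
mpirun ./my_mpi_code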

 

Using the queueing system (running jobs)

In the simplest terms, the cluster consists of the frontend, which is used to submit jobs, and computational nodes on which the jobs run. The software facilitating job and resource management is called a queueing system; chuck uses the SLURM queueing system. Its role is to queue jobs, manage resources and schedule jobs for execution. The basic principle is that a job gets exclusive access to the requested resources. This means that users have to specify how many CPUs, how much memory and how much time they need, and the scheduler then decides when and where to run the job. There are many rules and factors determining scheduling, but in general they should assure equal access to the available computational resources. Jobs are submitted to partitions, which already provide coarse-grained limits on jobs. The currently existing partitions are:

Name        | Max. mem | Default mem | Max. time | Default time | Notes
short       | 8 GB     | 1 GB        | 2 days    | 2 days       | only for serial jobs (1 CPU), default partition
long        | 3 GB     | 1 GB        | 14 days   | 7 days       |
bigmem      | 60 GB    | 4 GB        | 7 days    | 7 days       | max. 126 GB used by all running jobs of a given user
para        | NONE     | 1 GB        | 7 days    | 7 days       | only for parallel jobs (>1 CPU, can use multiple nodes)
gpu         | NONE     | 8 GB        | 7 days    | 7 days       | only for jobs using GPUs
interactive | 3 GB     | 2 GB        | 2 days    | 2 days       | for interactive, serial jobs (e.g. compilation/debugging); interactive jobs can, however, be started on all other partitions

You can get more parameters of a partition using the command `scontrol show partition <partition name>`.

There are also certain global limits, like a maximum of 256 CPU cores in total per user for all running jobs (32 cores for guests; the limit does not apply to jobs waiting in the queue).

 

The standard workflow on the cluster consists of compiling the code on the frontend and then submitting a job.

 

Warning! Be careful when compiling on chuck (the frontend) with the option -march=native. The frontend is newer than most of the nodes, and code compiled this way will fail on them. To be able to use all nodes, use -march=sandybridge. You can optimize for newer architectures (see the hardware inventory table), e.g. -march=broadwell, but then you have to request appropriate nodes in the job script using the option '-C broadwell'. If you use neither the -march nor the -mtune option, the code will run everywhere.
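
For instance, a portable build versus a Broadwell-optimized one (the source file name is illustrative):

# runs on every node of the cluster
gcc -O2 -march=sandybridge my_code.c -o my_code

# optimized for Broadwell nodes only; add '#SBATCH -C broadwell' to the job script
gcc -O2 -march=broadwell my_code.c -o my_code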

 

Here is an example job script:

 

#! /bin/bash -l
## job name
#SBATCH -J testjob
## number of nodes
#SBATCH -N 1
## number of tasks (CPU cores) per node
#SBATCH --ntasks-per-node=1
## memory per allocated CPU core
#SBATCH --mem-per-cpu=1GB
## wall-time limit (hh:mm:ss)
#SBATCH --time=01:00:00
## partition to use
#SBATCH -p short
#SBATCH --output="stdout.txt"
#SBATCH --error="stderr.txt"

## commands/steps to execute
# go to the submission directory
cd $SLURM_SUBMIT_DIR
hostname > out.txt
# run the user's program ('my_code' is a placeholder for the actual executable)
my_code >> out.txt

and it is submitted using the command `sbatch <script_name>`. Please note that:

 

  • the job script should start with a shell (shebang) line, e.g. #! /bin/bash -l
  • all #SBATCH lines are comments for a shell but they are interpreted by sbatch as options
  • there is a large number of options to sbatch, every user should read `man sbatch`
  • ALL environment variables from the submission shell are passed to the job by default.

 

After submission you will get a job ID - this is an important number used as a parameter for many other SLURM commands, e.g. to cancel a job. Please give this number when asking for support.
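
For instance (the script name and the job ID below are illustrative):

sbatch job.sh
# sbatch replies with the ID, e.g.: Submitted batch job 123456
scancel 123456   # cancel that job using its ID, if needed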

 

Some other important SLURM commands (for details see relevant man pages):

  • sinfo - list available partitions and their status
  • squeue - list all jobs; `squeue -u <username>` will list only jobs of a given user
  • scancel <job id> - remove a job
  • sacct - accounting info about completed and running jobs
  • scontrol show partition - details of partitions
  • sshare -la - fairshare records per account and per user
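
A couple of typical invocations (the job ID is illustrative):

squeue -u $USER   # your own jobs currently queued or running
sacct -j 123456   # accounting info for a particular job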


 

Support

Email: cluster@camk.edu.pl
Please provide the job ID and job script location when applicable.