User environment and modules

Default environment

The operating system on all Darwin nodes is currently Scientific Linux release 7 (a rebuild of Red Hat Enterprise Linux 7, aka RHEL7).

The user environment is set up using environment modules. A default module is loaded automatically upon login to a CSD3 node; this in turn autoloads a collection of other modules which configure the environment for compiling applications and submitting jobs to the particular cluster, by providing access to the required utilities and the recommended development and MPI software (these differ between Peta4 and Wilkes2).

It is possible to change the environment loaded at login by editing the shell initialisation file ~/.bashrc. Note that this will affect all future login sessions and all batch jobs not yet started. Since some modules are required for proper operation of the account, take care before removing any autoloaded modules. Changes to the shell initialisation file take effect at the next login, or when a new shell is created. At any time the currently loaded modules can be checked via:

module list

which will currently produce the following output by default on a login-cpu (Peta4-Skylake) node:

Currently Loaded Modulefiles:
  1) dot                       10) intel/impi/2017.4/intel
  2) slurm                     11) intel/libs/idb/2017.4
  3) java/jdk1.8.0_45          12) intel/libs/tbb/2017.4
  4) turbovnc/2.0.1            13) intel/libs/ipp/2017.4
  5) vgl/2.5.1/64              14) intel/libs/daal/2017.4
  6) singularity/current       15) intel/bundles/complib/2017.4
  7) rhel7/global              16) rhel7/default-peta4
  8) intel/compilers/2017.4    17) cluster-tools/2.0.5
  9) intel/mkl/2017.4          18) bacula/5.2.13

From this list it is possible to see, for example, that the version of the Intel development software currently available is 2017.4.
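A quick way to confirm this from the shell (a sketch; icc is the Intel C compiler driver provided by the intel/compilers module):

which icc          # should resolve to a path under the 2017.4 Intel installation
icc --version      # the 2017.4 suite typically reports itself as compiler version 17.0.x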

Please note that modules work by setting environment variables such as PATH and LD_LIBRARY_PATH. Therefore, if you need to modify these variables directly, it is essential to retain their existing values to avoid breaking loaded modules (and potentially rendering essential software "not found") - e.g. do

export PATH=$PATH:/home/abc123/custom_bin_directory
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/abc123/custom_lib_directory

and not

export PATH=/home/abc123/custom_bin_directory
export LD_LIBRARY_PATH=/home/abc123/custom_lib_directory

Modules

On a complex computer system, on which it is necessary to make available a wide choice of software packages in multiple versions, it can be quite difficult to set up the user environment so as to always find the required executables and libraries. (This is particularly true where different implementations or versions use the same names for files). Environment modules provide a way to selectively activate and deactivate modifications to the user environment which allow particular packages and versions to be found.

The basic command to use is module:

module 		
   (no arguments)              print usage instructions
   avail or av                 list available software modules 
   whatis                      as above with brief descriptions
   load <modulename>           add a module to your environment
   unload <modulename>         remove a module
   purge                       remove all modules

Enter the command module avail to see the entire collection of currently available modules.

Some modules refer to administrative software and are not of interest to users, and some modules load other modules. Different versions of the Intel compilers and parallel libraries can be used by explicitly loading the corresponding modules. By default the login environment loads several modules required for the normal operation of the account: e.g. the default versions of the Intel development software, the batch scheduling system and the recommended MPI for the particular flavour of compute node hardware. The modules actually loaded can be listed with the command module list.
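For example, to switch the compiler version (the 2018.1 version number below is purely illustrative; check what is actually installed with module avail):

module avail intel/compilers           # list the Intel compiler versions installed
module unload intel/compilers/2017.4   # remove the default version from the environment
module load intel/compilers/2018.1     # load an alternative version (illustrative name)
echo $PATH                             # the new compiler's bin directory has been prepended to PATH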

For historical reasons there may be modules with sandybridge, westmere and nehalem in their names. These were originally created to provide optimised versions of software specifically for older CPU types. Please check that there is not a more recent version of the same module without these labels before using one of them. In particular, recent modules built specifically for CSD3 were created using Spack and can be identified by the hash string at the end of the module name, e.g. -4qrgkot in the case of

cfitsio-3.410-intel-17.0.4-4qrgkot 

Since the home directory is shared by all nodes, the same default environment is inherited when a user's job commences via the queueing system. It is good practice to explicitly set up the module state in the submission script to remove ambiguity - the default template scripts under /usr/local/Cluster-Docs/SLURM/ contain a section dealing with this.
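A typical module section in a submission script looks something like the following (a sketch based on the Peta4-Skylake defaults shown earlier; substitute the base and application modules appropriate to your job):

. /etc/profile.d/modules.sh          # enables the module command in the batch shell
module purge                         # clear inherited modules to start from a known state
module load rhel7/default-peta4      # recommended base environment for Peta4-Skylake
# module load <application modules>  # then load whatever else the job requires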

Making your own modules

It is possible to create new modules and add them to your environment. For example, after installing an application in your own filespace, create a personal module directory called ~/privatemodules (recall that ~ is UNIX shorthand for your home directory). Then enable this directory by adding the following line to your ~/.bashrc file (this will take effect for future shells; issue the same command on the command line to affect the current shell):

module load use.own    

Now module files created under ~/privatemodules will be recognised as modules.
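In concrete terms the steps look like this (a sketch; myapp is a hypothetical application name):

mkdir -p ~/privatemodules/myapp           # personal module tree, one subdirectory per application
echo 'module load use.own' >> ~/.bashrc   # enable ~/privatemodules for future shells
module load use.own                       # ...and for the current shell
module avail myapp                        # shows the new module once a modulefile exists under ~/privatemodules/myapp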

The contents of a module file look like this (see /usr/local/Cluster-Config/modulefiles for more examples):

#%Module -*- tcl -*-
##
## modulefile
##
proc ModulesHelp { } {
  puts stderr "\tAdds Intel C/C++ compilers (11.0.081) to your environment."
}

module-whatis "adds Intel C/C++ compilers (11.0.081) to your environment"

set               root                 /usr/local/Cluster-Apps/intel/cce/11.0.081
prepend-path      PATH                 $root/bin/intel64
prepend-path      MANPATH              $root/man
prepend-path      LD_LIBRARY_PATH      $root/lib/intel64

...

Typically, one would set at least PATH, MANPATH and LD_LIBRARY_PATH to pick up the appropriate application directories. For more information, please load the modules module (module load modules), and refer to the module and modulefile man pages (i.e. man module and man modulefile).

A good approach is to create a subdirectory within the modules directory for each separate application, and to place the new module file in that directory, named to reflect the version number of the application. For example, the name of the Intel C compiler module file above is /usr/local/Cluster-Config/modulefiles/intel/cce/11.0.081. The command module load intel/cce will automatically try to load the highest version number it finds under /usr/local/Cluster-Config/modulefiles/intel/cce (NB highest in lexical ordering, which is often, but not necessarily, the same as numerical ordering: e.g. 8 is lexically higher than 1, and therefore also higher than 10).
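The same layout works for personal modules. As a sketch (myapp again being hypothetical), note how the lexical rule can surprise:

~/privatemodules/myapp/9.1     # modulefile for version 9.1
~/privatemodules/myapp/10.2    # modulefile for version 10.2

module load myapp              # loads 9.1, because "9.1" is lexically higher than "10.2"
module load myapp/10.2         # so name the version explicitly when in doubt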

Spack

The old modules visible on earlier clusters are still available, but new modules are now generated by Spack. The module names contain hashes uniquely identifying the build options and dependencies, but the modules can be searched and located using the spack command.

E.g. to find modules providing HDF5 1.10.1 compiled with GCC, list existing builds using

spack find -v hdf5@1.10.1%gcc

which will list matching builds and also show which configure options were enabled (+) or disabled (~). Specifying less version information will return more matches, e.g. one might start off with just

spack find hdf5%gcc

Full details of any matching builds including dependencies (e.g. which MPI was used) can be extracted with

spack find -dvl hdf5@1.10.1%gcc@5.4.0

Existing builds have modules which can be identified with a command such as

spack module find hdf5@1.10.1+mpi %gcc@5.4.0 ^openmpi@1.10.7

which returns the name of the corresponding module (hdf5-1.10.1-gcc-5.4.0-i52euam). Note that it is sometimes quicker to search using the short hash identifying each build (printed at the start of each line of output from the spack find -dvl command). E.g.

spack module find /i52euam

(notice the / introducing the hash).
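Putting these steps together, a typical workflow might be (a sketch; the hash and module name are the examples quoted above and will differ for other builds):

spack find -dvl hdf5@1.10.1%gcc@5.4.0        # note the short hash at the start of the line, e.g. i52euam
spack module find /i52euam                   # translate the hash into the corresponding module name
module load hdf5-1.10.1-gcc-5.4.0-i52euam    # load that module in your shell or job script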

Missing builds can be requested. (For more information about Spack see http://spack.readthedocs.io/en/latest/index.html.)