---
---
Installing own software {.title}
This topic is about installing your own software on the CSC servers.
The one-slide lecture
It is possible for users to install software on the CSC supercomputers
If you don't know how or run into problems while trying, contact the CSC Service Desk
For LUMI-related queries, contact the LUMI User Support Team
Code categories
Start by reading the software documentation
The installation method depends on code type (or category)
Instructions found online rarely work as "copy/paste" in an HPC environment
Before doing a lot of work, check whether suitable alternative software is already available in the CSC application list
Also check with module spider
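A minimal sketch of that check, assuming the `module` command (environment modules) is available in your shell, as it is on the supercomputers. The software name `gromacs` is only a placeholder.

```shell
#!/usr/bin/env bash
# Placeholder software name; replace with what you are looking for.
sw="gromacs"

if type module >/dev/null 2>&1; then
    # List the versions of the software known to the module system.
    module spider "$sw"
else
    # Not on a system with environment modules (e.g. your own laptop).
    echo "module command not found; run this on the supercomputer"
fi
```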
Binaries
If you have ready-made binaries, you can simply try to run them
The problem with ready-made binaries is that they are hardly ever optimal for the computer they are used on
Especially MPI codes should always be re-compiled for best performance
Ready binaries can be considered if
The source code is not available
The software is compiled on an identical computer
The software is for relatively light (serial or threaded) computation
Interpreted languages
Examples of high-level interpreted languages are
Python, Java, Perl, R, etc.
These languages do not need to be compiled, but they often can be
Running these programs usually requires loading a suitable module for the language
Loading a module also ensures that the software runs the same way on the compute nodes
High-performance computing languages
Programming languages that need to be compiled
Typical examples are C, C++ and Fortran
Most resource-intensive software has been written in these languages
As a researcher, you typically only need to compile the software (unless it is already available pre-installed)
Compiling can sometimes be complicated
If you run into problems, contact the CSC Service Desk
About compilers and profiling
A compiler is a special program that reads, analyses and translates human-readable source code into machine-readable object code
It performs four steps: lexical analysis, syntactic and semantic analysis, optimization, and output code generation
Compilers target specific operating systems and computer architectures and are usually programming language-specific
Code profiling: Analysis of an application (memory, CPU, network utilized) to understand its performance
Checking how much time is spent in different software routines is important to identify performance bottlenecks (don't optimize before this!)
Some general notes
No sudo available for users on the CSC supercomputers
You can't use package managers (apt, yum, etc.)
You can't install into "standard" locations (/usr/bin, /usr/lib, etc.)
Set the installation directory to /projappl or similar
Start by loading a suitable compiler suite or language module
Many commonly used HPC libraries (e.g. OpenMPI, ScaLAPACK, FFTW) are available as modules (search with module spider)
Compile on the fast local disk ($TMPDIR) to avoid stressing Lustre
New software is not automatically added to $PATH
Include the full path or add with export PATH="/path/to/my/sw:$PATH"
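The two ways of reaching newly installed software can be sketched as follows. This is an illustration only: `my-tool` is a hypothetical program, and a temporary directory stands in for a real installation directory under /projappl so that the sketch runs anywhere.

```shell
#!/usr/bin/env bash
# Hypothetical install location. On the CSC supercomputers this would be a
# directory under /projappl; a temporary directory is used as a stand-in.
PREFIX="${PREFIX:-$(mktemp -d)}"
mkdir -p "$PREFIX/bin"

# Stand-in for an installed program (hypothetical name "my-tool").
cat > "$PREFIX/bin/my-tool" <<'EOF'
#!/usr/bin/env bash
echo "my-tool runs"
EOF
chmod +x "$PREFIX/bin/my-tool"

# Option 1: call the program with its full path.
"$PREFIX/bin/my-tool"

# Option 2: prepend the directory to PATH for the current shell session.
export PATH="$PREFIX/bin:$PATH"
command -v my-tool   # shows which file the shell now resolves
my-tool
```

The export only affects the current shell session, so it has to be repeated in each new session and in batch job scripts that use the software.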
Installation methods: Native installations
Installing directly to the system
Usually the preferred way for software with few or no dependencies
Installation methods: Containers
Containerization is an efficient method to install software and its dependencies
Very easy if a ready-made container is available
Recommended particularly for software with complex dependencies
Installation methods: Conda
Conda is a common installation system, but it is very problematic on HPC systems
Creates a huge number of files and leads to poor performance on the Lustre parallel file system
Installations easily break when the system changes
Containerization is required if you intend to use Conda environments on CSC supercomputers (see usage policy)
Wrapping Conda installations into a container alleviates these problems, since the number of files is dramatically decreased from the file system's point of view
CSC has created a tool called Tykky which does the containerization automatically and transparently
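A sketch of wrapping a Conda environment with Tykky, assuming the `conda-containerize` command that the tykky module provides according to the CSC documentation. The environment name, package list and installation directory are placeholders, and the build step is skipped when the command is not available.

```shell
#!/usr/bin/env bash
# Describe the environment in a standard Conda environment file.
cat > env.yml <<'EOF'
name: my-env
channels:
  - conda-forge
dependencies:
  - python=3.11
  - numpy
EOF

# Placeholder install directory; on the CSC supercomputers use a directory
# under /projappl.
INSTALL_DIR="${INSTALL_DIR:-$PWD/my-env-install}"

if command -v conda-containerize >/dev/null 2>&1; then
    mkdir -p "$INSTALL_DIR"
    # Build the containerized environment (run "module load tykky" first).
    conda-containerize new --prefix "$INSTALL_DIR" env.yml
    # The wrapped executables are then found under the bin directory.
    export PATH="$INSTALL_DIR/bin:$PATH"
else
    echo "conda-containerize not found; load the tykky module first"
fi
```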
Testing -- it's important to test first
Construct a batch job script for a short and simple test run
Use known example/benchmark data provided, e.g., by the code developer (if you did not develop the code yourself)
Run a tutorial provided with the software
Run your test in the test queue or in an interactive session directly from the command line
Compare performance and results to existing data (your old data, online references, etc.)
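A short test job could look roughly like the sketch below, assuming a Slurm batch system with a partition named test. The project name, program path, input file and resource requests are placeholders to be replaced with your own.

```shell
#!/usr/bin/env bash
# Write a short test job script. All names inside it are placeholders.
cat > test_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=sw-test
#SBATCH --account=project_XXXXXXX
#SBATCH --partition=test
#SBATCH --time=00:05:00
#SBATCH --ntasks=1
#SBATCH --mem-per-cpu=1G

# Run the newly installed program on a small example input.
srun /path/to/my/sw/my-tool example_input.dat
EOF

# Check the script for shell syntax errors without running it.
bash -n test_job.sh

echo "Submit with: sbatch test_job.sh"
```

Keeping the time limit and resource requests small helps the test start quickly and fail fast if something is wrong with the installation.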
More information
See the tutorials for each category for more detailed instructions
Check the Docs CSC pages:
Lots of information online
Try searching for any error messages you come across