Book a Demo!
CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutPoliciesSign UpSign In
csc-training
GitHub Repository: csc-training/csc-env-eff
Path: blob/master/_slides/SRTFiles/04_Modules_SRT_English.srt
696 views
1
00:00:24,683 --> 00:00:27,300
Applications on the supercomputer are handled a bit 

2
00:00:27,300 --> 00:00:30,033
differently compared to a regular PC. 

3
00:00:30,750 --> 00:00:34,283
A supercomputer must fulfill the needs of different applications and 

4
00:00:34,283 --> 00:00:37,283
software used in different scientific purposes. 

5
00:00:39,149 --> 00:00:44,116
All of these software have also different dependencies and libraries they are linked to. 

6
00:00:44,700 --> 00:00:49,950
Some of the dependencies may also be conflicting which can create a lot of problems. 

7
00:00:50,633 --> 00:00:54,366
On a supercomputer you cannot load all the applications in the start-up

8
00:00:54,366 --> 00:00:56,299
as a normal PC does.

9
00:00:56,633 --> 00:01:00,116
Instead we separate the applications into modules. 

10
00:01:00,916 --> 00:01:04,849
Loading the module environment for a certain application sets up everything

11
00:01:04,849 --> 00:01:07,333
that is needed for that software. 

12
00:01:08,116 --> 00:01:10,616
A module loads the relevant libraries, 

13
00:01:10,616 --> 00:01:15,566
adjust path and other environment variables needed for the software.

14
00:01:16,866 --> 00:01:22,416
CSC supercomputers use Lmod framework for managing environment modules. 

15
00:01:31,133 --> 00:01:37,766
The application list in docs.csc.fi lists all installed  software on CSC supercomputers. 

16
00:01:38,383 --> 00:01:41,983
Check the documentation both when searching whether a spesific application

17
00:01:42,233 --> 00:01:44,783
is installed, or how to use one.

18
00:01:45,383 --> 00:01:48,266
All modules are loaded with command module load, 

19
00:01:48,266 --> 00:01:52,183
but what happens after that depends on the included software.

20
00:01:54,783 --> 00:01:57,900
The command module spider followed by a module name

21
00:01:57,900 --> 00:02:00,233
finds whether that module exists.

22
00:02:01,049 --> 00:02:03,583
For example to check if program called Gromacs 

23
00:02:03,583 --> 00:02:06,583
for molecular dynamics simulations exists,

24
00:02:06,583 --> 00:02:10,000
you can type module spider gromacs-env. 

25
00:02:17,250 --> 00:02:20,400
The general syntax for using modules is quite simple: 

26
00:02:20,416 --> 00:02:23,783
module, then the command and the module name.

27
00:02:24,550 --> 00:02:29,750
Please find a list of most common commands for using modules in the documentation. 

28
00:02:30,449 --> 00:02:35,683
There is commands like module load, module help, module list and so on. 

29
00:02:36,199 --> 00:02:39,666
Whether you are running an interactive job or submitting a batch job 

30
00:02:39,666 --> 00:02:43,616
you always need to load the appropriate modules to use your program.

31
00:02:44,283 --> 00:02:48,199
Always include the module commands in your batch script.

32
00:02:48,483 --> 00:02:53,333
That way you get always the same set of modules in consecutive batch jobs.

33
00:02:54,033 --> 00:02:58,283
It is a good practice to start with module purge - to have a clean slate -

34
00:02:58,283 --> 00:03:00,816
and then issue the needed module loads.

35
00:03:02,633 --> 00:03:07,983
The main idea in using modules is that you will not load all the modules simultaneously.

36
00:03:08,566 --> 00:03:12,050
That would cancel the whole idea of avoiding conflicting dependencies

37
00:03:22,766 --> 00:03:25,966
If you want to list which modules you have loaded at the moment, 

38
00:03:25,966 --> 00:03:28,050
you can type module list.

39
00:03:28,849 --> 00:03:33,099
To check which modules are available to load right now try module avail.

40
00:03:33,866 --> 00:03:36,966
It will not show modules which you can not load. 

41
00:03:37,566 --> 00:03:40,616
For example there might still be some missing dependencies 

42
00:03:40,616 --> 00:03:43,516
before you are able to load a particular module.

43
00:03:44,333 --> 00:03:47,533
To search for a certain application and its versions use 

44
00:03:47,533 --> 00:03:50,400
module spider and the name of the application.

45
00:03:51,199 --> 00:03:56,383
Also with module spider you can type the application name / the application version 

46
00:03:56,383 --> 00:04:00,949
to find more information on how to load a specific version of the application. 

47
00:04:01,683 --> 00:04:04,500
The system is designed so that it will give an error message if 

48
00:04:04,500 --> 00:04:07,366
you try to load a module which is not available.

49
00:04:15,216 --> 00:04:20,100
Some software have a graphical user interface, for example Molden.

50
00:04:20,883 --> 00:04:24,466
To use those you need a graphical session in Puhti web interface, 

51
00:04:24,466 --> 00:04:27,416
or some other way to access remote graphics.

52
00:04:28,933 --> 00:04:34,366
Some software work in the command line interface, for example zstd.

53
00:04:35,033 --> 00:04:39,666
You can start using them in the same command line after loading the module.

54
00:04:41,383 --> 00:04:45,066
Some applications on the supercomputer have their own modules,

55
00:04:45,066 --> 00:04:48,633
and some applications are combined in larger modules. 

56
00:04:49,350 --> 00:04:53,399
Then some applications can be found in multiple modules. 

57
00:04:54,983 --> 00:04:59,899
Some modules contain packages - for example python or R packages.

58
00:05:00,699 --> 00:05:04,750
You can use them with the appropriate software after loading the module.

59
00:05:05,649 --> 00:05:10,066
Avoid trial-and-errors by consulting the documentation first to find out if 

60
00:05:10,066 --> 00:05:13,949
the application you want to use is already included in a module.

61
00:05:20,783 --> 00:05:23,649
Conda is a generally used package management tool

62
00:05:23,649 --> 00:05:26,149
for distributing and installing software. 

63
00:05:27,016 --> 00:05:31,066
Some applications are installed and used in a Conda environment.

64
00:05:31,866 --> 00:05:37,033
Unfortunately Conda has a really poor performance in Lustre parallel file system.

65
00:05:37,883 --> 00:05:40,366
Therefore you should always check if the software you need 

66
00:05:40,366 --> 00:05:42,500
is already included in a module.

67
00:05:43,433 --> 00:05:47,483
If you need to build an own Conda environment, put it in a container.

68
00:05:48,383 --> 00:05:52,633
More information on installing software and containers in their own lectures.

69
00:05:53,566 --> 00:05:57,600
Python-related modules have their own Docs CSC pages.

70
00:05:58,399 --> 00:06:03,066
Consult those before starting because they have different instructions on usage.

71
00:06:03,866 --> 00:06:06,750
Plan your workflows such that you do not need to load

72
00:06:06,750 --> 00:06:10,100
multiple modules with Python packages simultaneously.

73
00:06:19,116 --> 00:06:23,399
Eventually you as a supercomputer user will develop your own workflows, 

74
00:06:23,399 --> 00:06:27,550
such as always loading the same modules that you use in your research.

75
00:06:28,399 --> 00:06:32,600
Usually in such cases people use a file called dot bashrc,

76
00:06:32,600 --> 00:06:34,633
which is executed in every login. 

77
00:06:35,633 --> 00:06:39,000
But we do not recommend this for loading the modules automatically, 

78
00:06:39,000 --> 00:06:42,783
because this might cause some hard-to-find issues later on.

79
00:06:43,583 --> 00:06:46,433
You should always be aware of what modules you are loading

80
00:06:46,433 --> 00:06:49,116
to avoid conflicts with the dependencies.

81
00:06:49,916 --> 00:06:53,233
If you already have done this and suspect some conflicts, 

82
00:06:53,250 --> 00:06:56,666
you can check out the csc-env command.

83
00:06:57,333 --> 00:07:03,416
That allows for example resetting to the basic environment which resets the bashrc file.

84
00:07:04,183 --> 00:07:08,983
However you can still use .bashrc to miminize the tedious typing.

85
00:07:09,766 --> 00:07:14,649
Automatize module loading by creating an alias in the .bashrc.

86
00:07:15,300 --> 00:07:19,500
Executing an alias will execute all the user-defined commands there.

87
00:07:20,316 --> 00:07:24,199
You can come up with your own command like for example setmyenv, 

88
00:07:24,199 --> 00:07:27,366
and then define all the module loads you usually need.

89
00:07:28,083 --> 00:07:31,733
Please keep in mind, that aliases might not work in a same way

90
00:07:31,733 --> 00:07:33,949
in a sub shell or batch job

91
00:07:40,183 --> 00:07:43,916
If you like a certain set of modules you can also save your current

92
00:07:43,933 --> 00:07:48,383
module configuration and then use module restore to load your set. 

93
00:07:48,949 --> 00:07:53,633
You can also write your own module files and put them in your home directory. 

94
00:07:54,300 --> 00:07:58,466
Remember also to add the path of your modulefiles to the module search path 

95
00:07:58,466 --> 00:08:01,550
so that you can easily use those own modules. 

96
00:08:02,333 --> 00:08:05,350
To study the existing module files you use the command 

97
00:08:05,350 --> 00:08:07,833
module show and the module name.

98
00:08:08,466 --> 00:08:13,833
That will show for example the filename of the module file which is a .lua file. 

99
00:08:14,483 --> 00:08:17,716
The tutorials about modules continue from here. 

100
00:08:18,449 --> 00:08:22,500
They cover the basic use cases with easy-to-follow examples.

101
00:08:23,083 --> 00:08:28,633
Module documentation in docs.csc.fi covers also the more technical details. 

102
00:08:29,383 --> 00:08:31,583
The links are in the description.