Book a Demo!
CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutPoliciesSign UpSign In
csc-training
GitHub Repository: csc-training/csc-env-eff
Path: blob/master/_slides/SRTFiles/04_Modules_SRT_English_mac.srt
696 views
1
00:00:23,149 --> 00:00:25,766
Applications on the supercomputer are handled a bit 

2
00:00:25,766 --> 00:00:28,500
differently compared to a regular PC. 

3
00:00:29,216 --> 00:00:32,750
A supercomputer must fulfill the needs of different applications and 

4
00:00:32,750 --> 00:00:35,750
software used in different scientific purposes. 

5
00:00:37,616 --> 00:00:42,383
All of these software have also different dependencies and libraries they are linked to. 

6
00:00:43,166 --> 00:00:48,416
Some of the dependencies may also be conflicting which can create a lot of problems. 

7
00:00:49,100 --> 00:00:52,466
On a supercomputer you cannot load all the applications in the boot 

8
00:00:52,466 --> 00:00:54,399
as with a normal PC. 

9
00:00:55,100 --> 00:00:58,583
Instead we separate the applications into modules. 

10
00:00:59,383 --> 00:01:03,666
Loading the module environment for a certain application sets up everything

11
00:01:03,666 --> 00:01:05,799
that is needed for that software. 

12
00:01:06,583 --> 00:01:09,083
A module loads the relevant libraries, 

13
00:01:09,083 --> 00:01:14,033
adjust path and other environment variables needed for the software to work. 

14
00:01:15,616 --> 00:01:21,166
CSC supercomputers use Lmod framework for managing environment modules. 

15
00:01:30,283 --> 00:01:36,916
The application list in docs.csc.fi lists all installed  software on CSC supercomputers. 

16
00:01:37,783 --> 00:01:41,383
Check the documentation when searching whether a spesific application

17
00:01:41,383 --> 00:01:43,933
is installed, or how to use one.

18
00:01:44,533 --> 00:01:47,416
All modules are loaded with command module load, 

19
00:01:47,416 --> 00:01:51,333
but what happens after that depends on the included software.

20
00:01:53,933 --> 00:01:57,049
The command module spider followed by a module name

21
00:01:57,049 --> 00:01:59,383
finds whether that module exists.

22
00:02:00,200 --> 00:02:02,733
For example to check if program called Gromacs 

23
00:02:02,733 --> 00:02:05,733
for molecular dynamics simulations exists,

24
00:02:05,733 --> 00:02:09,150
you can type module spider gromacs-env. 

25
00:02:16,400 --> 00:02:19,550
The general syntax for using modules is quite simple: 

26
00:02:19,566 --> 00:02:22,933
module, then the command and the module name.

27
00:02:23,699 --> 00:02:28,900
Please find a list of most common commands for using modules in the documentation. 

28
00:02:29,599 --> 00:02:34,833
There is commands like module load, module help, module list and so on. 

29
00:02:35,349 --> 00:02:38,816
Whether you are running an interactive job or submitting a batch job 

30
00:02:38,816 --> 00:02:42,766
you always need to load the appropriate modules to use your program.

31
00:02:43,433 --> 00:02:47,349
Always include the module commands in your batch script.

32
00:02:47,633 --> 00:02:52,483
That way you get always the same set of modules in consecutive batch jobs.

33
00:02:53,183 --> 00:02:57,433
It is a good practice to start with module purge - to have a clean slate -

34
00:02:57,433 --> 00:02:59,966
and then issue the needed module loads.

35
00:03:01,783 --> 00:03:07,133
The main idea in using modules is that you will not load all the modules simultaneously.

36
00:03:07,716 --> 00:03:11,199
That would cancel the whole idea of avoiding conflicting dependencies

37
00:03:11,199 --> 00:03:14,449
by separating the applications into modules. 

38
00:03:21,916 --> 00:03:25,116
If you want to list which modules you have loaded at the moment, 

39
00:03:25,116 --> 00:03:27,199
you can type module list.

40
00:03:28,000 --> 00:03:32,250
To check which modules are available to load right now try module avail.

41
00:03:33,016 --> 00:03:36,116
It will not show modules which you can not load. 

42
00:03:36,716 --> 00:03:39,766
For example there might still be some missing dependencies 

43
00:03:39,766 --> 00:03:42,666
before you are able to load a particular module.

44
00:03:43,483 --> 00:03:46,683
To search for a certain application and its versions use 

45
00:03:46,683 --> 00:03:49,550
module spider and the name of the application.

46
00:03:50,349 --> 00:03:55,533
Also with module spider you can type the application name / the application version 

47
00:03:55,533 --> 00:04:00,099
to find more information on how to load a specific version of the application. 

48
00:04:00,833 --> 00:04:03,650
The system is designed so that it will give an error message if 

49
00:04:03,650 --> 00:04:06,516
you try to load a module which is not available.

50
00:04:14,366 --> 00:04:19,250
Some software have a graphical user interface, for example Molden.

51
00:04:20,033 --> 00:04:23,616
To use those you need a graphical session in Puhti web interface, 

52
00:04:23,616 --> 00:04:26,566
or some other way to access remote graphics.

53
00:04:28,083 --> 00:04:33,516
Some software work in the command line interface, for example zstd.

54
00:04:34,183 --> 00:04:38,816
You can start using them in the same command line after loading the module.

55
00:04:40,533 --> 00:04:44,216
Some applications on the supercomputer have their own modules,

56
00:04:44,216 --> 00:04:47,783
and some applications are combined in larger modules. 

57
00:04:48,500 --> 00:04:52,550
Then some applications can be found in multiple modules. 

58
00:04:54,133 --> 00:04:59,050
Some modules contain packages - for example python or R packages.

59
00:04:59,850 --> 00:05:03,899
You can use them with the appropriate software after loading the module.

60
00:05:04,800 --> 00:05:09,216
Avoid trial-and-errors by consulting the documentation first to find out if 

61
00:05:09,216 --> 00:05:13,100
the application you want to use is already included in a module.

62
00:05:19,933 --> 00:05:22,800
Conda is a generally used package management tool

63
00:05:22,800 --> 00:05:25,300
for distributing and installing software. 

64
00:05:26,166 --> 00:05:30,216
Some applications are installed and used in a Conda environment.

65
00:05:31,016 --> 00:05:36,183
Unfortunately Conda has a really poor performance in Lustre parallel file system.

66
00:05:37,033 --> 00:05:39,516
Therefore you should always check if the software you need 

67
00:05:39,516 --> 00:05:41,649
is already included in a module.

68
00:05:42,583 --> 00:05:46,633
If you need to build an own Conda environment, put it in a container.

69
00:05:47,533 --> 00:05:51,783
More information on installing software and containers in their own lectures.

70
00:05:52,716 --> 00:05:56,750
Python-related modules have their own Docs CSC pages.

71
00:05:57,550 --> 00:06:02,216
Consult those before starting because they have different instructions on usage.

72
00:06:03,016 --> 00:06:05,899
Plan your workflows such that you do not need to load

73
00:06:05,899 --> 00:06:09,250
multiple modules with Python packages simultaneously.

74
00:06:15,966 --> 00:06:20,250
Eventually you as a supercomputer user will develop your own workflows, 

75
00:06:20,250 --> 00:06:24,399
such as always loading the same modules that you use in your research.

76
00:06:25,250 --> 00:06:29,449
Usually in such cases people use a file called dot bashrc,

77
00:06:29,449 --> 00:06:31,483
which is executed in every login. 

78
00:06:32,483 --> 00:06:35,850
But we do not recommend this for loading the modules automatically, 

79
00:06:35,850 --> 00:06:39,633
because this might cause some hard-to-find issues later on.

80
00:06:40,433 --> 00:06:43,283
You should always be aware of what modules you are loading

81
00:06:43,283 --> 00:06:45,966
to avoid conflicts with the dependencies.

82
00:06:46,766 --> 00:06:50,083
If you already have done this and suspect some conflicts, 

83
00:06:50,100 --> 00:06:53,516
you can check out the csc-env command.

84
00:06:54,183 --> 00:07:00,266
That allows for example resetting to the basic environment which resets the bashrc file.

85
00:07:01,033 --> 00:07:05,833
However you can still use .bashrc to miminize the tedious typing.

86
00:07:06,616 --> 00:07:11,500
Automatize module loading by creating an alias in the .bashrc.

87
00:07:12,149 --> 00:07:16,350
Executing an alias will execute all the user-defined commands there.

88
00:07:17,166 --> 00:07:21,050
You can come up with your own command like for example setmyenv, 

89
00:07:21,050 --> 00:07:24,216
and then define all the module loads you usually need.

90
00:07:29,949 --> 00:07:33,683
If you like a certain set of modules you can also save your current

91
00:07:33,699 --> 00:07:38,149
module configuration and then use module restore to load your set. 

92
00:07:38,716 --> 00:07:43,399
You can also write your own module files and put them in your home directory. 

93
00:07:44,066 --> 00:07:48,233
Remember also to add the path of your modulefiles to the module search path 

94
00:07:48,233 --> 00:07:51,316
so that you can easily use those own modules. 

95
00:07:52,100 --> 00:07:55,116
To study the existing module files you use the command 

96
00:07:55,116 --> 00:07:57,600
module show and the module name.

97
00:07:58,233 --> 00:08:03,600
That will show for example the filename of the module file which is a .lua file. 

98
00:08:04,250 --> 00:08:07,483
The tutorials about modules continue from here. 

99
00:08:08,216 --> 00:08:12,266
They cover the basic use cases with easy-to-follow examples.

100
00:08:12,850 --> 00:08:18,399
Module documentation in docs.csc.fi covers also the more technical details. 

101
00:08:19,149 --> 00:08:21,350
The links are in the description.