1
00:00:23,649 --> 00:00:29,366
CSC supercomputers have different disk areas, each with its own specific purpose.

2
00:00:29,833 --> 00:00:36,299
This lecture gives insight into where to keep your files and how to move data between Mahti and Puhti.

3
00:00:36,833 --> 00:00:39,733
The disk areas have different quotas, which means that

4
00:00:39,733 --> 00:00:43,583
users cannot just create files anywhere and forget them. 

5
00:00:44,366 --> 00:00:50,899
The additional fast disk areas are for applications that require lots of file input and output. 

6
00:00:55,133 --> 00:00:59,283
Here is a visual layout of the two supercomputers and their file storage.

7
00:01:00,266 --> 00:01:04,900
The box on the left represents Puhti and the one on the right Mahti.

8
00:01:05,516 --> 00:01:08,900
Down below is a cylinder that represents Allas.

9
00:01:09,733 --> 00:01:12,283
There are bidirectional arrows between these three objects 

10
00:01:12,283 --> 00:01:16,583
illustrating that data can be transferred between any of the services.

11
00:01:17,299 --> 00:01:20,583
Both supercomputers have their own Lustre filesystem 

12
00:01:20,583 --> 00:01:24,533
that contains the disk areas home, projappl and scratch.

13
00:01:25,150 --> 00:01:30,183
You can move files inside a supercomputer using the mv and cp commands.

14
00:01:30,816 --> 00:01:35,849
For file transfers between Puhti and Mahti you can use rsync or scp.
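
As a minimal sketch, assuming a hypothetical project directory project_12345 and made-up file names (only mv, cp, rsync and scp themselves come from this lecture):

    # inside one supercomputer: move or copy within its Lustre filesystem
    mv results.csv /scratch/project_12345/results/
    cp -r analysis/ /projappl/project_12345/

    # from Puhti to Mahti over the network (run on Puhti)
    rsync -azP /scratch/project_12345/dataset/ \
          yourusername@mahti.csc.fi:/scratch/project_12345/dataset/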

15
00:01:36,433 --> 00:01:40,916
And for using Allas object storage there is a separate lecture and tutorial.

16
00:01:41,799 --> 00:01:47,183
It is important to realize that if you have your files, for example, in the Puhti scratch area,

17
00:01:47,183 --> 00:01:51,433
they will not show up anywhere in Mahti unless you manually copy them there.

18
00:01:56,783 --> 00:02:01,450
As pictured in the previous slide, the main disk areas in both Puhti and Mahti

19
00:02:01,450 --> 00:02:05,433
are the user's home directory, scratch and projappl. 

20
00:02:05,883 --> 00:02:09,416
The home directory is personal, but projappl and scratch are 

21
00:02:09,416 --> 00:02:13,233
for the project's use and they are shared with your project members. 

22
00:02:13,400 --> 00:02:18,266
In the scratch disk area there will be an automatic file removal system.

23
00:02:18,633 --> 00:02:22,383
To keep your data and results safe, move them, for example, to Allas

24
00:02:22,383 --> 00:02:24,650
during the lifetime of your project.

25
00:02:25,566 --> 00:02:29,316
These disk areas reside in the Lustre parallel file system 

26
00:02:29,316 --> 00:02:33,550
which makes them visible in both the login nodes and the compute nodes. 

27
00:02:33,849 --> 00:02:36,933
If you want to know how the Lustre file system works,

28
00:02:36,933 --> 00:02:40,783
you can find the link to Lustre documentation in the slides. 

29
00:02:41,483 --> 00:02:46,083
Lustre is good at handling lots of data at once, but not well suited for

30
00:02:46,083 --> 00:02:49,983
repeatedly searching and reading small chunks of files.

31
00:02:50,500 --> 00:02:56,516
The documentation at docs.csc.fi also covers the default quotas for the disk areas.

32
00:02:56,516 --> 00:03:00,566
There you can find more information about the default quotas.

33
00:03:01,516 --> 00:03:06,349
For example, in the home directory you cannot have more than 10 GiB of data,

34
00:03:06,349 --> 00:03:09,949
and you can have at most a hundred thousand files there.

35
00:03:15,833 --> 00:03:19,866
Puhti and Mahti do not share any disk space. 

36
00:03:19,900 --> 00:03:25,733
You can copy your files to the other supercomputer using the rsync or scp commands.

37
00:03:25,833 --> 00:03:30,783
You can also put your data in Allas and download it from there to other supercomputers. 

38
00:03:31,733 --> 00:03:35,766
You will learn more about Allas in the later presentations and tutorials.

39
00:03:41,666 --> 00:03:47,016
You can check the status of your disk areas with the command csc-workspaces.

40
00:03:47,866 --> 00:03:53,283
It prints a list of your disk areas with the current capacity and the number of files used.

41
00:03:54,000 --> 00:03:59,233
Each project has its own projappl and scratch areas, and they will all be printed out for you.

42
00:04:00,133 --> 00:04:03,433
The first line is your own personal project.

43
00:04:03,849 --> 00:04:13,849
In this example the user has 8 GB of files against an 11 GB quota, and 17,000 files against the 100,000 that are allowed.
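
As an illustration only, the output could look roughly like this; the real layout may differ, so check docs.csc.fi (the username and project name are placeholders):

    $ csc-workspaces
    Disk area                 Capacity (used/max)   Files (used/max)
    /users/yourusername       8G/11G                17000/100000
    /scratch/project_12345    ...                   ...
    /projappl/project_12345   ...                   ...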

44
00:04:14,616 --> 00:04:18,183
Be aware that you may reach this 100,000-file limit

45
00:04:18,183 --> 00:04:22,083
if you install, for example, many conda environments.

46
00:04:22,250 --> 00:04:24,716
Please pay attention to the disk usage, 

47
00:04:24,716 --> 00:04:28,566
as well as the number of files that you are using in your disk areas.

48
00:04:28,566 --> 00:04:34,866
Note that with this command you can also check the paths to your projappl and scratch areas.

49
00:04:39,350 --> 00:04:44,666
Here is the visual layout of two supercomputers and their file storage again.

50
00:04:46,149 --> 00:04:51,433
In addition to the main disk areas the supercomputers have fast local disk areas. 

51
00:04:52,233 --> 00:04:55,633
These are the temp directories on the login nodes and

52
00:04:55,633 --> 00:04:59,333
the NVMe directories on some compute nodes in Puhti.

53
00:05:03,933 --> 00:05:09,050
Remember that the Lustre filesystem is not good at reading or writing a lot of small files?

54
00:05:09,899 --> 00:05:13,566
That is what the local fast disk areas are for.

55
00:05:14,033 --> 00:05:16,949
On the login nodes it is called the temporary directory, and

56
00:05:16,949 --> 00:05:22,216
you can access it with the command cd $TMPDIR.

57
00:05:23,250 --> 00:05:30,750
The temp directory is meant for preprocessing your files, for example when you have too many files and want to merge them.
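
A minimal sketch of such preprocessing on a login node, assuming a hypothetical project project_12345 with many small text files:

    # work on the fast local disk of the login node
    cd $TMPDIR
    mkdir merge_demo && cd merge_demo

    # copy the many small files from Lustre, then merge them into one file
    cp /scratch/project_12345/small_files/*.txt .
    cat *.txt > merged.txt

    # move only the merged result back to scratch
    mv merged.txt /scratch/project_12345/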

58
00:05:31,683 --> 00:05:35,100
Don't do any heavy computing in the login nodes. 

59
00:05:36,683 --> 00:05:40,133
The fast local disk areas in some compute nodes in Puhti 

60
00:05:40,133 --> 00:05:43,316
are NVMe-based disks, i.e. SSDs.

61
00:05:43,483 --> 00:05:47,516
You can use them as part of an interactive job or a batch job. 

62
00:05:48,233 --> 00:05:52,416
The environment variable used for accessing the NVMe-disks 

63
00:05:52,416 --> 00:05:55,050
during a job is $LOCAL_SCRATCH. 

64
00:05:55,783 --> 00:06:00,399
You must copy the data in and out of the NVMe space during your batch job,

65
00:06:00,399 --> 00:06:04,649
because the fast local disk areas are not part of the Lustre filesystem. 

66
00:06:05,483 --> 00:06:08,833
Otherwise the data and results produced during the batch job

67
00:06:08,833 --> 00:06:12,149
will not be accessible to you after the job has finished.
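
A minimal batch script sketch, assuming a hypothetical project project_12345 and input archive; see docs.csc.fi for the exact way to request local NVMe disk on Puhti:

    #!/bin/bash
    #SBATCH --account=project_12345   # placeholder project
    #SBATCH --partition=small
    #SBATCH --time=00:30:00
    #SBATCH --gres=nvme:10            # request 10 GiB of local NVMe disk

    # copy input data from Lustre to the fast local disk
    cp /scratch/project_12345/input.tar $LOCAL_SCRATCH/
    cd $LOCAL_SCRATCH
    tar xf input.tar

    # ... run the actual analysis here ...

    # copy the results back to Lustre before the job ends
    cp -r results/ /scratch/project_12345/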

68
00:06:13,066 --> 00:06:19,250
You should consider using the NVMe space if your job reads or writes lots of small files.

69
00:06:20,016 --> 00:06:24,783
Depending on the number of files, you might get up to 10 times faster performance

70
00:06:24,783 --> 00:06:30,233
if you use the local fast disk area compared to the Lustre filesystem.

71
00:06:34,966 --> 00:06:39,683
This slide shows a summary of the disk areas and their specific use cases.

72
00:06:40,550 --> 00:06:45,816
Allas is for storing data that is not currently used in any computational analysis.

73
00:06:46,583 --> 00:06:50,533
The home directory is meant especially for storing your personal information

74
00:06:50,533 --> 00:06:53,316
and some scripts that you would like to use later. 

75
00:06:53,616 --> 00:06:57,016
Other users cannot access your home folder.

76
00:06:58,033 --> 00:07:01,783
The scratch area should be your main working space for your scientific work 

77
00:07:01,783 --> 00:07:04,850
and it is also shared with project members. 

78
00:07:05,633 --> 00:07:09,216
In projappl users can store their scientific programs, 

79
00:07:09,216 --> 00:07:12,633
binaries and packages, and share them with colleagues in the same project.

80
00:07:13,366 --> 00:07:16,783
The temp directory on the login nodes is used for compiling code

81
00:07:16,783 --> 00:07:19,800
and fast file I/O in lightweight processing.

82
00:07:20,600 --> 00:07:27,116
NVMe disks are used in heavy computing for fast file I/O in batch jobs or interactive jobs. 

83
00:07:32,550 --> 00:07:35,833
Do not put any databases on the Lustre filesystem.

84
00:07:35,916 --> 00:07:38,083
It is not good to have too many files and

85
00:07:38,083 --> 00:07:41,533
Lustre is inefficient at reading small bits of a single file.

86
00:07:42,516 --> 00:07:45,583
For databases we recommend that you use our services 

87
00:07:45,583 --> 00:07:48,483
like Kaivos and MongoDB in cPouta.

88
00:07:49,133 --> 00:07:52,083
Don't create a lot of files in one folder,

89
00:07:52,083 --> 00:07:56,550
or, more generally, a lot of files anywhere in the Lustre filesystem.

90
00:07:57,383 --> 00:08:00,566
If that seems necessary, you have to rethink your workflow.

91
00:08:01,183 --> 00:08:04,966
Please note that CSC does not back up your data.

92
00:08:05,516 --> 00:08:07,416
The disks are fault tolerant,

93
00:08:07,416 --> 00:08:11,800
but that does not stop you or your colleagues from accidentally deleting everything.

94
00:08:12,616 --> 00:08:17,633
A pro tip: modify file access rights with chmod.
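
For example, to protect finished results from accidental changes (the file and directory names are placeholders):

    # remove write permission from everyone, including yourself
    chmod a-w final_results.csv

    # recursively remove the project group's write permission from a directory
    chmod -R g-w /scratch/project_12345/final/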

95
00:08:19,583 --> 00:08:22,933
If your workflow requires reading or writing a lot of files,

96
00:08:22,933 --> 00:08:27,399
use the fast local disks, which will increase the performance of your computation

97
00:08:27,399 --> 00:08:30,750
as well as decrease the load on the Lustre filesystem.

98
00:08:31,333 --> 00:08:35,500
The documentation at docs.csc.fi contains best practice tips

99
00:08:35,500 --> 00:08:37,850
for performance when using Lustre.

100
00:08:38,566 --> 00:08:41,799
The tutorials about disk areas continue from here. 

101
00:08:41,799 --> 00:08:45,733
They cover the basic use cases with easy-to-follow examples.

102
00:08:46,433 --> 00:08:49,149
Check the links in the description!