Path: blob/master/_slides/SRTFiles/03_DiskAreas_SRT_English_mac.srt
1
00:00:23,649 --> 00:00:29,366
CSC supercomputers have different disk areas that have their own specific purposes.

2
00:00:29,833 --> 00:00:36,299
This lecture gives insight into where to keep your files and how to move data between Mahti and Puhti.

3
00:00:36,833 --> 00:00:39,733
The disk areas have different quotas, which means that

4
00:00:39,733 --> 00:00:43,583
users cannot just create files anywhere and forget them.

5
00:00:44,366 --> 00:00:50,899
The additional fast disk areas are for applications that require lots of file input and output.

6
00:00:55,133 --> 00:00:59,283
Here is a visual layout of the two supercomputers and the file storage.

7
00:01:00,266 --> 00:01:04,900
The box on the left represents Puhti and the one on the right Mahti.

8
00:01:05,516 --> 00:01:08,900
Down below is a cylinder that represents Allas.

9
00:01:09,733 --> 00:01:12,283
There are bidirectional arrows between these three objects,

10
00:01:12,283 --> 00:01:16,583
illustrating that data can be transferred from each service to another.

11
00:01:17,299 --> 00:01:20,583
Both supercomputers have their own Lustre filesystem

12
00:01:20,583 --> 00:01:24,533
that contains disk areas called home, projappl and scratch.

13
00:01:25,150 --> 00:01:30,183
You can move files inside a supercomputer using the commands mv and cp.

14
00:01:30,816 --> 00:01:35,849
For file transfers between Puhti and Mahti you can use rsync or scp.

15
00:01:36,433 --> 00:01:40,916
And for using the Allas object storage there is a separate lecture and tutorial.

16
00:01:41,799 --> 00:01:47,183
It is important to realize that if you have your files, for example, in the Puhti scratch area,

17
00:01:47,183 --> 00:01:51,433
they will not show up anywhere in Mahti unless you manually copy them there.

18
00:01:56,783 --> 00:02:01,450
As pictured in the previous slide, the main disk areas in both Puhti and Mahti

19
00:02:01,450 --> 00:02:05,433
are the user's home directory, scratch and projappl.

20
00:02:05,883 --> 00:02:09,416
The home directory is personal, but projappl and scratch are

21
00:02:09,416 --> 00:02:13,233
for the project's use and are shared with your project members.

22
00:02:13,400 --> 00:02:18,266
The scratch disk area will have an automatic file removal system.

23
00:02:18,633 --> 00:02:22,383
To keep your data and results safe, move them, for example, to Allas

24
00:02:22,383 --> 00:02:24,650
during the lifetime of your project.

25
00:02:25,566 --> 00:02:29,316
These disk areas reside in the Lustre parallel file system,

26
00:02:29,316 --> 00:02:33,550
which makes them visible on both the login nodes and the compute nodes.

27
00:02:33,849 --> 00:02:36,933
If you want to know how the Lustre file system works,

28
00:02:36,933 --> 00:02:40,783
you can find the link to the Lustre documentation in the slides.

29
00:02:41,483 --> 00:02:46,083
Lustre is good at handling lots of data at once, but not well suited for

30
00:02:46,083 --> 00:02:49,983
repeatedly searching and reading small chunks of files.

31
00:02:50,500 --> 00:02:56,516
The documentation in docs.csc.fi also covers the default quotas for the disk areas.

32
00:02:56,516 --> 00:03:00,566
There you can find more information about the default quotas.

33
00:03:01,516 --> 00:03:06,349
For example, in the home directory you cannot have more than 10 GiB of data,

34
00:03:06,349 --> 00:03:09,949
and you can have fewer than one hundred thousand files in there.

35
00:03:15,833 --> 00:03:19,866
Puhti and Mahti do not share any disk space.
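A minimal sketch of the copy and transfer commands mentioned above; the username, hostname and project directory are placeholders, not values from the lecture:

    # Inside one supercomputer, plain cp and mv are enough:
    cp -r /scratch/project_2001234/run1 /scratch/project_2001234/archive/

    # Between Puhti and Mahti, copy over ssh with rsync
    # (myuser and project_2001234 are illustrative):
    rsync -avz /scratch/project_2001234/results/ \
        myuser@mahti.csc.fi:/scratch/project_2001234/results/

    # scp works too, e.g. for a single file:
    scp /scratch/project_2001234/summary.csv \
        myuser@mahti.csc.fi:/scratch/project_2001234/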
36
00:03:19,900 --> 00:03:25,733
You can copy your files to the other supercomputer using the rsync or scp commands.

37
00:03:25,833 --> 00:03:30,783
You can also put your data in Allas and download it from there to other supercomputers.

38
00:03:31,733 --> 00:03:35,766
You will learn more about Allas in the later presentations and tutorials.

39
00:03:41,666 --> 00:03:47,016
You can check the status of your disk areas with the command csc-workspaces.

40
00:03:47,866 --> 00:03:53,283
It prints a list of your disk areas with the current capacity and the number of files used.

41
00:03:54,000 --> 00:03:59,233
Each project has its own projappl and scratch areas, and they will all be printed out for you.

42
00:04:00,133 --> 00:04:03,433
The first line is your own personal project.

43
00:04:03,849 --> 00:04:13,849
In this example the user has 8 GB of files against a quota of 11 GB, and 17,000 files against the 100,000 files that are allowed.

44
00:04:14,616 --> 00:04:18,183
Be aware that you may reach this 100,000-file limit

45
00:04:18,183 --> 00:04:22,083
if you install, for example, many conda environments.

46
00:04:22,250 --> 00:04:24,716
Please pay attention to the disk usage,

47
00:04:24,716 --> 00:04:28,566
as well as the number of files that you are using in your disk areas.

48
00:04:28,566 --> 00:04:34,866
Note that with this command you can also check the paths to your projappl and scratch areas.

49
00:04:39,350 --> 00:04:44,666
Here is the visual layout of the two supercomputers and their file storage again.

50
00:04:46,149 --> 00:04:51,433
In addition to the main disk areas, the supercomputers have fast local disk areas.

51
00:04:52,233 --> 00:04:55,633
These are the temp directories on the login nodes and

52
00:04:55,633 --> 00:04:59,333
the NVMe directories on some compute nodes in Puhti.

53
00:05:03,933 --> 00:05:09,050
Remember that the Lustre filesystem is not good at reading or writing a lot of small files?

54
00:05:09,899 --> 00:05:13,566
That is what the local fast disk areas are for.

55
00:05:14,033 --> 00:05:16,949
On the login nodes it is called the temporary directory, and

56
00:05:16,949 --> 00:05:22,216
you can access it from the login node with the command cd $TMPDIR.

57
00:05:23,250 --> 00:05:30,750
The temp directory is meant for preprocessing your files, for example if you have too many files and you want to merge them.

58
00:05:31,683 --> 00:05:35,100
Don't do any heavy computing on the login nodes.

59
00:05:36,683 --> 00:05:40,133
The fast local disk areas on some compute nodes in Puhti

60
00:05:40,133 --> 00:05:43,316
are NVMe-based disks, i.e. SSDs.

61
00:05:43,483 --> 00:05:47,516
You can use them as part of an interactive job or a batch job.

62
00:05:48,233 --> 00:05:52,416
The environment variable used for accessing the NVMe disks

63
00:05:52,416 --> 00:05:55,050
during a job is $LOCAL_SCRATCH.

64
00:05:55,783 --> 00:06:00,399
You must copy the data in and out of the NVMe space during your batch job,

65
00:06:00,399 --> 00:06:04,649
because the fast local disk areas are not part of the Lustre filesystem.

66
00:06:05,483 --> 00:06:08,833
Otherwise the data and results produced during the batch job

67
00:06:08,833 --> 00:06:12,149
will not be accessible to you after the job has finished.

68
00:06:13,066 --> 00:06:19,250
You should consider using the NVMe space if your job reads or writes lots of small files.
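A short login-node session sketching the csc-workspaces and $TMPDIR usage described above; the file and project names are made up for illustration:

    # List your disk areas with their usage, quotas and paths:
    csc-workspaces

    # Do light preprocessing on the fast local disk of the login node:
    cd $TMPDIR
    cp /scratch/project_2001234/chunks/part_*.csv .
    cat part_*.csv > merged.csv          # merge many small files into one
    mv merged.csv /scratch/project_2001234/
    rm part_*.csv                        # clean up the temporary copies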
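The copy-in, compute, copy-out pattern described above could look roughly like this in a batch script; the account, partition and nvme reservation follow CSC's documented Slurm conventions for Puhti, but treat the exact values as placeholders and check docs.csc.fi:

    #!/bin/bash
    #SBATCH --account=project_2001234   # placeholder project
    #SBATCH --partition=small
    #SBATCH --time=01:00:00
    #SBATCH --gres=nvme:100             # reserve 100 GB of local NVMe disk

    # Copy the input data from Lustre to the fast local disk:
    cp -r /scratch/project_2001234/input $LOCAL_SCRATCH/

    # Run the I/O-heavy work against the local copy
    # (my_program is a placeholder for your own application):
    cd $LOCAL_SCRATCH
    srun my_program --input input/ --output output/

    # Copy the results back to Lustre before the job ends,
    # because the local disk is cleaned after the job finishes:
    cp -r $LOCAL_SCRATCH/output /scratch/project_2001234/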
69
00:06:20,016 --> 00:06:24,783
Depending on the number of files, you might get up to 10 times faster performance

70
00:06:24,783 --> 00:06:30,233
if you use the local fast disk area compared to the Lustre filesystem.

71
00:06:34,966 --> 00:06:39,683
This slide shows a summary of the disk areas and their specific use cases.

72
00:06:40,550 --> 00:06:45,816
Allas is for storing data that is not used in any computational analysis at the moment.

73
00:06:46,583 --> 00:06:50,533
The home directory is especially meant for storing your personal information

74
00:06:50,533 --> 00:06:53,316
and some scripts that you would like to use later.

75
00:06:53,616 --> 00:06:57,016
Other users cannot access your home folder.

76
00:06:58,033 --> 00:07:01,783
The scratch area should be the main working space for your scientific work,

77
00:07:01,783 --> 00:07:04,850
and it is also shared with project members.

78
00:07:05,633 --> 00:07:09,216
In projappl, users can store their scientific programs,

79
00:07:09,216 --> 00:07:12,633
binaries and packages and share them with colleagues in the same project.

80
00:07:13,366 --> 00:07:16,783
The temp directory on the login node is used for compiling code

81
00:07:16,783 --> 00:07:19,800
and fast file I/O in lightweight processing.

82
00:07:20,600 --> 00:07:27,116
NVMe disks are used in heavy computing for fast file I/O in batch jobs or interactive jobs.

83
00:07:32,550 --> 00:07:35,833
Do not put any databases on the Lustre filesystem.

84
00:07:35,916 --> 00:07:38,083
It is not good to have too many files, and

85
00:07:38,083 --> 00:07:41,533
Lustre is inefficient at reading small bits of a single file.

86
00:07:42,516 --> 00:07:45,583
For databases we recommend that you use our services,

87
00:07:45,583 --> 00:07:48,483
like Kaivos and MongoDB in cPouta.

88
00:07:49,133 --> 00:07:52,083
Don't create a lot of files in one folder;

89
00:07:52,083 --> 00:07:56,550
in fact, don't create a lot of files in the Lustre filesystem overall.

90
00:07:57,383 --> 00:08:00,566
If your workflow needs that, you have to rethink it.

91
00:08:01,183 --> 00:08:04,966
Please note that CSC does not back up your data.

92
00:08:05,516 --> 00:08:07,416
The disks are fault tolerant,

93
00:08:07,416 --> 00:08:11,800
but that does not stop you or your colleagues from accidentally deleting everything.

94
00:08:12,616 --> 00:08:17,633
A pro tip: modify the usage rights of files with chmod.

95
00:08:19,583 --> 00:08:22,933
If your workflow requires reading or writing a lot of files,

96
00:08:22,933 --> 00:08:27,399
use the fast local disks, which will increase the performance of your computation

97
00:08:27,399 --> 00:08:30,750
as well as decrease the load on the Lustre filesystem.

98
00:08:31,333 --> 00:08:35,500
The documentation in docs.csc.fi contains best-practice tips

99
00:08:35,500 --> 00:08:37,850
for performance when using Lustre.

100
00:08:38,566 --> 00:08:41,799
The tutorials about disk areas continue from here.

101
00:08:41,799 --> 00:08:45,733
They cover the basic use cases with easy-to-follow examples.

102
00:08:46,433 --> 00:08:49,149
Check the links in the description!
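Finally, a sketch of the chmod pro tip mentioned above: making finished results read-only protects them from accidental deletion. The path is a placeholder:

    # Remove write permission from everyone, including yourself,
    # so files cannot be deleted or overwritten by accident:
    chmod -R a-w /scratch/project_2001234/final_results

    # Give the owner write permission back when needed:
    chmod -R u+w /scratch/project_2001234/final_results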