Where to store files in CSC's computing environment?
In this tutorial you will:
Familiarize yourself with personal and project-specific disk areas and their quotas on CSC supercomputers.
Learn how to share your files, such as software installations and data, with other project members on CSC supercomputers.
💬 Each user of CSC supercomputers (Puhti and Mahti) has access to different disk areas (or directories) for managing their data. Each disk area has its own specific purpose.
💬 Active data files needed for computational simulations and analyses should be stored and shared in directories under /scratch while any software installations and binaries should be shared under the /projappl directory.
Identify your personal and project-specific directories on Puhti and Mahti supercomputers
First, log in to Puhti using SSH (or open a login node shell in the Puhti web interface):
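A minimal sketch of the login step. The username `yourcscusername` is a placeholder, not a real account, and the snippet only prints the command so that you can check it before running it in your own terminal.

```shell
# Placeholder username (an assumption): replace it with your own CSC username.
csc_user="yourcscusername"
puhti_host="puhti.csc.fi"

# Print the login command; copy it to your terminal to actually connect.
echo "ssh ${csc_user}@${puhti_host}"
```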
Get an overview of your projects and directories by running the following commands on the login node:
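The overview commands on CSC supercomputers are `csc-projects` and `csc-workspaces`. The sketch below wraps them in a helper function (the name `show_csc_overview` is made up for this example) that first checks whether each command is available, so the snippet stays harmless if you try it on another machine.

```shell
# Helper name is illustrative; the two csc-* commands exist only on CSC systems.
show_csc_overview() {
    for cmd in csc-projects csc-workspaces; do
        if command -v "$cmd" >/dev/null 2>&1; then
            "$cmd"
        else
            echo "$cmd not found: run this on a Puhti or Mahti login node"
        fi
    done
}

show_csc_overview
```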
Inspect the output information summarizing your directories and their current quotas.
Visit your project's /scratch directory and list its contents:
Visit your project's /projappl directory and list its contents:
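Both visits can be sketched in one loop. The project identifier `project_2001234` is hypothetical; use one that `csc-workspaces` lists for you. The check for the directory is only there so the snippet does not fail outside the supercomputer.

```shell
# Hypothetical project identifier (an assumption): replace it with your own.
project="project_2001234"

for dir in "/scratch/${project}" "/projappl/${project}"; do
    if [ -d "$dir" ]; then
        echo "Contents of ${dir}:"
        ls -l "$dir"
    else
        echo "${dir} does not exist on this machine"
    fi
done
```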
💬 These directories can be briefly summarized as follows:
User-specific directory (i.e. your personal home folder):
Your home directory (path stored in the environment variable $HOME)
The default directory when you log in to Puhti/Mahti
You can store configuration files and other minor data for personal use
Project-specific directories:
The project's /scratch and /projappl directories
Each project has its own /scratch disk space where most computational tasks are performed. The /scratch area is a temporary space not intended for long-term data storage! Please move inactive data to e.g. Allas.
The /projappl directory, on the other hand, is mainly for storing and sharing compiled applications, libraries, etc. with other members of the project.
Sharing binaries and data files
💬 Data transfer between two supercomputers can be done e.g. with rsync.
Download the example files
☝🏻 In this example you will download data from Allas object storage.
Move to your home folder:
💡 If you know the files are large, you should consider downloading them directly to /scratch.
Download an example program package (ggplot2_3.3.3_Rprogramme.tar.gz) and a data file (Merged.fasta) from the Allas object storage
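A sketch of moving to the home folder and downloading the two files with `wget`. The base URL is an assumption about where the course files are published in Allas; check it against the course material. The commands are only printed while `dry_run` is set to `echo`; set it to an empty string to run them.

```shell
# Assumed public Allas address of the example files; verify before use.
base_url="https://a3s.fi/CSC_training"

# "echo" previews the commands; set dry_run="" to really download.
dry_run="echo"

cd "$HOME" || echo "Could not change to the home folder"
for f in ggplot2_3.3.3_Rprogramme.tar.gz Merged.fasta; do
    $dry_run wget "${base_url}/${f}"
done
```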
Let's assume that Merged.fasta is a data file intended for computational use and ggplot2_3.3.3_Rprogramme.tar.gz is a software tool needed for the analysis.
Move the files to Puhti /scratch and /projappl
Create folders with your username (using environment variable $USER) in your project directories under /scratch and /projappl on Puhti.
Copy your ggplot2_3.3.3_Rprogramme.tar.gz file to the /projappl directory
Copy the Merged.fasta file to the /scratch directory
Note that all new files and directories are also fully accessible to other members of the project (including read and write permissions).
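The folder creation and the two copies could look roughly like this. The project identifier is hypothetical, and the sketch assumes that both files were downloaded to your home folder. Nothing is created or copied unless both project directories exist.

```shell
# Hypothetical project identifier (an assumption): replace it with your own.
project="project_2001234"
# Normally $USER is set; fall back to id -un if it is not.
user="${USER:-$(id -un)}"

scratch_dir="/scratch/${project}/${user}"
projappl_dir="/projappl/${project}/${user}"

if [ -d "/scratch/${project}" ] && [ -d "/projappl/${project}" ]; then
    # Create a personal subfolder in both project directories
    mkdir -p "$scratch_dir" "$projappl_dir"
    # The software package goes to /projappl, the data file to /scratch
    cp "$HOME/ggplot2_3.3.3_Rprogramme.tar.gz" "$projappl_dir/"
    cp "$HOME/Merged.fasta" "$scratch_dir/"
else
    echo "Project directories for ${project} not found: run this on Puhti"
fi
```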
Set read-only permissions for your project members for the file Merged.fasta:
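One way to do this is to remove the group's write permission with `chmod g-w`. The path below uses a hypothetical project identifier. If the file is not found (for example outside Puhti), the sketch falls back to a temporary stand-in file so you can still try the permission change safely.

```shell
# Hypothetical project identifier (an assumption): replace it with your own.
project="project_2001234"
user="${USER:-$(id -un)}"
data_file="/scratch/${project}/${user}/Merged.fasta"

# Fall back to a throwaway file that starts out group-writable
if [ ! -f "$data_file" ]; then
    data_file="$(mktemp)"
    chmod g+rw "$data_file"
fi

chmod g-w "$data_file"   # remove write permission from the group
ls -l "$data_file"       # the group permissions should no longer include "w"
```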
Copying files from Puhti to Mahti (optional)
☝🏻 For this part you must ensure you have forwarded your SSH agent to Puhti, otherwise you will not be able to connect to Mahti.
Check if your SSH keys are available on Puhti using the command ssh-add -L.
If they are, the command prints your public key. Proceed to step 4.
If not:
Linux/macOS: Log out and log back in using the ssh -A option.
Windows: Log out. Toggle the option Allow agent forwarding, found under "Session" -> "SSH" -> "Advanced SSH settings" -> "Expert SSH settings" (MobaXterm) or under "Connection" -> "SSH" -> "Auth" (PuTTY), before connecting again.
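The check can also be scripted, since `ssh-add -L` returns a non-zero exit status when no agent is reachable or the agent holds no keys. The variable name `agent_status` and the username in the comment are placeholders.

```shell
# ssh-add -L lists the public keys held by the SSH agent.
if ssh-add -L >/dev/null 2>&1; then
    agent_status="keys available"
else
    agent_status="no keys available"
fi
echo "SSH agent: ${agent_status}"

# If no keys are available, log out and reconnect from your own computer
# with agent forwarding enabled (placeholder username):
#   ssh -A yourcscusername@puhti.csc.fi
```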
Change to the folder where you have the example files
Copy the Merged.fasta file from Puhti to the /scratch drive of Mahti:
Copy the ggplot2_3.3.3_Rprogramme.tar.gz file from Puhti to the /projappl directory on Mahti:
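A sketch of both transfers with `rsync`. The username and project identifier are placeholders, and the sketch assumes that your project also has access to Mahti. The option `-P` shows progress and keeps partially transferred files. While `dry_run` is set to `echo` the commands are only printed; set it to an empty string to run them.

```shell
# Placeholders (assumptions): replace with your own username and project.
csc_user="yourcscusername"
project="project_2001234"
mahti="${csc_user}@mahti.csc.fi"

# "echo" previews the commands; set dry_run="" to really transfer.
dry_run="echo"

$dry_run rsync -P Merged.fasta "${mahti}:/scratch/${project}/${csc_user}/"
$dry_run rsync -P ggplot2_3.3.3_Rprogramme.tar.gz "${mahti}:/projappl/${project}/${csc_user}/"
```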
More information
💡 Hint: You can use your folder under /scratch for the rest of the tutorials. You can save the path using an alias (with cd or echo) or somewhere in your notes.
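Saving the path as an alias might look like this. The alias names `cdscratch` and `myscratch` and the project identifier are made up for this example. Aliases defined this way last only until you log out, unless you add them to a startup file such as `~/.bashrc`.

```shell
# Hypothetical project identifier (an assumption): replace it with your own.
project="project_2001234"
user="${USER:-$(id -un)}"
my_scratch="/scratch/${project}/${user}"

# Alias that moves to the folder
alias cdscratch="cd ${my_scratch}"
# Alias that only prints the path
alias myscratch="echo ${my_scratch}"

# Show both definitions
alias cdscratch myscratch
```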
💡 It is sometimes necessary to export the paths of the /scratch or /projappl directories as environment variables (they remain set until you log out). This can be done with the following commands:
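A minimal sketch of the export step. The variable names `SCRATCH` and `PROJAPPL` are a naming choice for this example, and the project identifier is hypothetical.

```shell
# Hypothetical project identifier (an assumption): replace it with your own.
project="project_2001234"

export SCRATCH="/scratch/${project}"
export PROJAPPL="/projappl/${project}"

# Check the values
echo "SCRATCH=${SCRATCH}"
echo "PROJAPPL=${PROJAPPL}"
```

After this, programs started from the same shell session can read the paths, for example with `cd "$SCRATCH"`.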