---
theme: csc-eurocc-2019
lang: en
---

# Disk areas in CSC's HPC environment {.title}

In this section, you will learn how to work in the different disk areas of CSC's HPC environment

![](https://mirrors.creativecommons.org/presskit/buttons/88x31/png/by-sa.png)
All materials (c) 2020-2025 by CSC – IT Center for Science Ltd. This work is licensed under a **Creative Commons Attribution-ShareAlike** 4.0 International License, [http://creativecommons.org/licenses/by-sa/4.0/](http://creativecommons.org/licenses/by-sa/4.0/)

# Overview of disk areas

# Disk and storage overview

# Main disk areas in Puhti/Mahti

- Home directory (`$HOME`)
  - Other users cannot access your home directory
- ProjAppl directory (`/projappl/project_name`)
  - Shared with project members
  - Possible to limit access (`chmod g-rw`) to subfolders, see the sketch after this list
- Scratch directory (`/scratch/project_name`)
  - Shared with project members
  - Files older than 180 days will be automatically removed
- These directories reside on the Lustre parallel file system
- Default quotas and more information can be found in the Disk areas section of Docs CSC
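
A minimal shell sketch of the access-control and clean-up points above; `project_2001234` and `private_analysis` are made-up names, so substitute your own project and paths.

```bash
# Restrict project members' access to one subfolder of the shared scratch area
cd /scratch/project_2001234
mkdir -p private_analysis
chmod g-rw private_analysis    # group can no longer read or write the folder
chmod g-x  private_analysis    # removing x as well blocks entering it entirely
ls -ld private_analysis        # check the resulting permission bits

# List files not modified in over 180 days, i.e. candidates for automatic removal
find /scratch/project_2001234 -type f -mtime +180
```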

# Moving data between and to/from supercomputers

# Displaying current status of disk areas

- Use the `csc-workspaces` command to show available projects and quotas, as in the example below

*(Figure: example output of the `csc-workspaces` command)*
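
For example, on a login node (the `du` path is made up for illustration):

```bash
# Show the projects you belong to and the quota/usage of their disk areas
csc-workspaces

# Standard tools still work for checking how much space one directory uses
du -sh /scratch/project_2001234/mydata
```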

# Disk and storage overview (revisited)

# Additional fast local disk areas

- `$TMPDIR` on login nodes
  - Each of the login nodes has 2900 GiB of fast local storage in `$TMPDIR`
  - The local disk is meant for temporary storage (e.g. compiling software) and is cleaned frequently
- NVMe disks on some compute nodes on Puhti and Mahti
  - Interactive, I/O and GPU nodes have fast local disks (NVMe) in `$LOCAL_SCRATCH`
  - You must copy data to and from the fast disk during your batch job, since the NVMe is accessible only during your job allocation (see the batch-script sketch after this list)
  - If your job reads and/or writes a lot of small files, using this can give a huge performance boost!
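
A sketch of a batch script that stages data through the fast local disk. The account, partition, NVMe size and program name are placeholders, and the `--gres=nvme:<size>` request follows the convention documented for Puhti; check Docs CSC for the exact options on your system.

```bash
#!/bin/bash
#SBATCH --account=project_2001234   # made-up project number
#SBATCH --partition=small
#SBATCH --time=01:00:00
#SBATCH --ntasks=1
#SBATCH --gres=nvme:100             # request local NVMe space, visible as $LOCAL_SCRATCH

# 1. Stage input data from Lustre onto the fast local disk
cp -r /scratch/project_2001234/input "$LOCAL_SCRATCH"/

# 2. Run the program against the local copy (myprog is a placeholder)
srun myprog "$LOCAL_SCRATCH"/input -o "$LOCAL_SCRATCH"/results

# 3. Copy results back before the job ends - $LOCAL_SCRATCH is cleared when the job finishes
cp -r "$LOCAL_SCRATCH"/results /scratch/project_2001234/
```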

# What are the different disk areas for?

# Best practices

- None of the disk areas are automatically backed up by CSC, so make sure to perform regular backups to, e.g., Allas (see the sketch after this list)
- Don't run databases or Conda on Lustre (`/projappl`, `/scratch`, `$HOME`)
  - Containerize Conda environments with Tykky and use other CSC services like Pukki, cPouta or Rahti for databases
- Don't create a lot of files, especially within a single folder
  - If you're creating 10 000+ files, you should probably rethink your workflow
- Consider using fast local disks when working with many small files
- See also: Lustre best practices and efficient I/O in high-throughput workflows in Docs CSC
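
Hedged sketches of the Allas backup and Tykky recommendations above, using the CSC tools named on this slide; the project number, directories and environment file are made up, and the exact command options may differ, so consult Docs CSC before running them.

```bash
# Back up a results directory to Allas object storage
module load allas
allas-conf project_2001234            # authenticate for the (made-up) project
a-put results/                        # upload the directory to Allas

# Install a Conda environment as a Tykky container instead of directly on Lustre
module load tykky
conda-containerize new --prefix /projappl/project_2001234/myenv env.yml
export PATH="/projappl/project_2001234/myenv/bin:$PATH"   # use the containerized tools
```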