GitHub Repository: csc-training/csc-env-eff
Path: blob/master/_slides/07_allas.md
---
theme: csc-eurocc-2019
lang: en
---

Allas object storage service {.title}

This topic is about using Allas and storing data.

![](https://mirrors.creativecommons.org/presskit/buttons/88x31/png/by-sa.png)
All materials (c) 2020-2025 by CSC – IT Center for Science Ltd. This work is licensed under a **Creative Commons Attribution-ShareAlike** 4.0 International License, [http://creativecommons.org/licenses/by-sa/4.0/](http://creativecommons.org/licenses/by-sa/4.0/)

The Allas object storage: what is it?

  • Allas is a Ceph-based object storage service for all CSC computing and cloud services

  • Possible to upload data from personal laptops or organizational storage systems into Allas

  • Meant for data storage during project lifetime

    • All project members have equal access to the data in Allas

    • Default quota is 10 TB per project

  • Clients available on Puhti and Mahti

The Allas object storage: what it is NOT

  • Allas is not a file system (even though many tools try to fool you into thinking so)

    • It is just a place for static data objects

  • Allas is not a data management environment

    • Tools for search, metadata, version control and access management are minimal

  • Allas is not a proper backup service

    • Project members can delete all the data with just one command

Storing files in Allas

  • An object is stored on multiple servers

    • A disk or server failure does not cause data loss

  • There is no backup, i.e. if a file is accidentally deleted, it cannot be recovered

  • Data cannot be modified in the object storage

    • For computation, the data typically has to be copied to a file system on a computer

  • Some data management features are built on top of Allas

  • Data can be shared publicly on the Internet, which is otherwise not easily possible at CSC

Allas buckets

  • Storage space in Allas is provided per CSC project

  • The project space can have multiple buckets (up to 1000)

    • Some sources refer to buckets as containers

      • Must not be confused with Docker/Apptainer containers!

  • The name of the bucket must be unique within Allas

Allas objects

  • Data is stored as objects within a bucket

    • Objects can contain any type of data (generally, object == file)

    • Objects have metadata that can be enriched

  • In Allas, you can have 500 000 objects per bucket

  • There is only one level of hierarchy of buckets (no buckets within buckets)

    • There is no hierarchical directory structure, although it sometimes looks like that
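The flat namespace described above can be illustrated with a minimal sketch in plain Python. It does not talk to Allas at all; the object names and the helper `list_pseudo_dir` are hypothetical and only show how a client can make a flat list of object names look like directories by splitting on `/`.

```python
def list_pseudo_dir(keys, prefix="", delimiter="/"):
    """Return (subdirs, objects) that appear directly 'under' prefix.

    The bucket itself only holds a flat list of object names; the
    directory view is derived from the delimiter inside those names.
    """
    subdirs, objects = set(), []
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the first delimiter looks like a subdirectory.
            subdirs.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        elif rest:
            objects.append(key)
    return sorted(subdirs), sorted(objects)


# Hypothetical object names stored in one bucket (illustration only).
keys = [
    "results/run1/output.dat",
    "results/run1/log.txt",
    "results/run2/output.dat",
    "results/summary.csv",
    "README.md",
]

print(list_pseudo_dir(keys))              # top level of the bucket
print(list_pseudo_dir(keys, "results/"))  # "inside" the results/ prefix
```

Here `results/run1/` is not a real directory: it exists only because some object names start with that prefix, and it disappears once the last such object is removed.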

Allas supports two protocols

  • S3 (used by s3cmd, rclone, a-tools)

  • Swift (used by swift, rclone, a-tools, cyberduck)

  • Authentication and file handling differ between the protocols

  • Avoid cross-using Swift and S3-based objects!

Allas clients

Allas -- first steps

Allas -- rclone

  • Straightforward power-user tool with a wide range of features

  • Fast and efficient

  • Available for Linux, Mac and Windows

  • Overwrites and removes data without asking!
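Because rclone overwrites and removes data without asking, one cautious habit is to preview a transfer first. The sketch below only builds the command line; it assumes rclone's standard `--dry-run` flag, and the remote name `allas:`, the bucket `my-bucket` and the helper `rclone_command` are illustrative assumptions, not taken from these slides.

```python
import shlex


def rclone_command(action, source, dest, dry_run=True):
    """Build an rclone argument list.

    With dry_run=True the --dry-run flag is appended, so rclone only
    reports what it would copy or delete without changing anything.
    """
    if action not in ("copy", "sync"):
        raise ValueError(f"unsupported action: {action}")
    cmd = ["rclone", action, source, dest]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


# Assumed remote and bucket names, for illustration only.
preview = rclone_command("sync", "results/", "allas:my-bucket/results")
print(shlex.join(preview))
```

The resulting list can be passed to `subprocess.run(preview, check=True)` on a machine where rclone is installed and configured. Note that `sync` makes the destination match the source, which includes deleting destination objects that are missing from the source, so checking the preview is most useful there.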

Allas -- a-tools

  • a-tools provide an easy and safe way for occasional users to work with Allas

  • Default bucket names are based on directories on Puhti/Mahti

  • Unlike rclone, a-tools do not overwrite or remove data without asking!

  • Developed for the CSC supercomputers, but you can install the tools on other Linux and Mac machines as well

  • Automatic packing (compression can be enabled as well if needed)

  • a-commands instructions at Docs CSC

Issues with Allas

  • 8-hour connection limit with swift

  • No way to check quota

  • Moving data inside Allas is not possible (swift)

  • No way to freeze data

    • Use two projects if you need to prevent others from editing your data

  • Different interfaces may work in different ways

Questions that users should consider

  • Should I store each file as a separate object, or should I collect them into bigger chunks?

    • In general: consider how you use the data

  • Should I use compression?

  • Who can use the data: projects and access rights?

  • What will happen to my data later on?

  • How do I keep track of all the data I have in Allas?
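As one possible answer to the first and last questions, many small files can be collected into a single archive before upload, with a list of the contents kept alongside it. The following is a minimal sketch using only the Python standard library; the function `pack_directory`, the directory name `run1` and the sample files are hypothetical, and the upload step itself is left out.

```python
import tarfile
import tempfile
from pathlib import Path


def pack_directory(src_dir, archive_path, compress=False):
    """Pack src_dir into one tar archive and return the names of its files.

    The returned list can be stored as a manifest, so the contents of
    the object are known without downloading and unpacking it.
    """
    src = Path(src_dir)
    mode = "w:gz" if compress else "w"
    with tarfile.open(archive_path, mode) as tar:
        tar.add(src, arcname=src.name)
    # Re-open the archive to read back what was actually stored.
    with tarfile.open(archive_path, "r:*") as tar:
        return sorted(m.name for m in tar.getmembers() if m.isfile())


# Demo with throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "run1"
    data.mkdir()
    for i in range(3):
        (data / f"part{i}.txt").write_text(f"sample {i}\n")
    archive = Path(tmp) / "run1.tar"
    manifest = pack_directory(data, archive)
    print(manifest)
```

The trade-off is the one the questions hint at: a single archive keeps the object count low, but retrieving one file then means downloading the whole archive. Passing `compress=True` adds gzip compression, which mainly pays off for text-like data.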

Bonus: Sensitive data services

  • CSC Sensitive Data Services for processing sensitive data

  • SD Desktop is a secure virtual desktop

    • Controlled access

    • Data importing only through the SD Connect service

    • Isolation from the Internet

    • No direct data export

  • Allas can be used for sensitive data, but only if the data is properly encrypted!

Bonus: Fairdata services

  • https://www.fairdata.fi -- Services to manage scientific data according to FAIR principles

  • Suitable for all static digital research material and related metadata

  • Free of charge for users in Finnish higher education institutions and research institutes

  • IDA: storage for research data

  • Qvain: Describe your dataset and get a persistent identifier for it

  • Etsin: Discover datasets based on metadata