---
---
Allas object storage service {.title}
This topic is about using Allas and storing data.
The Allas object storage: what is it?
Allas is a Ceph-based object storage service for all CSC computing and cloud services
Possible to upload data from personal laptops or organizational storage systems into Allas
Meant for data storage during project lifetime
All project members have equal access to the data in Allas
Default quota is 10 TB per project
Clients available on Puhti and Mahti
See Docs CSC for instructions on accessing Allas from LUMI
The Allas object storage: what it is NOT
Allas is not a file system (even though many tools try to fool you into thinking so)
It is just a place for static data objects
Allas is not a data management environment
Tools for search, metadata, version control and access management are minimal
Allas is not a proper backup service
Project members can delete all the data with just one command
Storing files in Allas
An object is stored on multiple servers
A disk or server failure does not cause data loss
There is no backup, i.e. if a file is accidentally deleted, it cannot be recovered
Data cannot be modified in the object storage
For computation, the data typically has to be copied to a file system on some computer
Some data management features are built on top of Allas
Data can be shared publicly on the Internet, which is otherwise not easily possible at CSC
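The points about modification and missing backup can be illustrated with a toy model. The sketch below is an in-memory stand-in for a bucket, not the Allas API; the class `ToyObjectStore` and its methods are hypothetical names made up for illustration. It only shows that "editing" an object means downloading it, changing the local copy and uploading the whole object again, and that a delete leaves nothing to recover.

```python
class ToyObjectStore:
    """In-memory stand-in for a bucket. Hypothetical; NOT the Allas API."""

    def __init__(self):
        self._objects = {}

    def put(self, name: str, data: bytes) -> None:
        # Uploading under an existing name replaces the whole object.
        self._objects[name] = bytes(data)

    def get(self, name: str) -> bytes:
        return self._objects[name]

    def delete(self, name: str) -> None:
        # There is no trash can: the object is simply gone.
        del self._objects[name]

    def exists(self, name: str) -> bool:
        return name in self._objects


store = ToyObjectStore()
store.put("notes.txt", b"first line\n")

# "Editing" an object: download, change the local copy, upload again.
local_copy = store.get("notes.txt")
local_copy += b"second line\n"
store.put("notes.txt", local_copy)
updated = store.get("notes.txt")

# Accidental deletion cannot be undone in this model.
store.delete("notes.txt")
still_there = store.exists("notes.txt")
```

In this model there is no operation that changes part of an object in place, which is why computation usually starts by copying the data to a file system.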
Allas buckets
Storage space in Allas is provided per CSC project
The project space can have multiple buckets (up to 1000)
Some sources refer to buckets as containers
Must not be confused with Docker/Apptainer containers!
The name of the bucket must be unique within Allas
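Because bucket names must be unique within the whole of Allas, generic names such as "data" are likely to be taken already. One possible convention, shown here only as a sketch, is to prefix a descriptive label with the project number. The helper `make_bucket_name`, the project number `2001234` and the allowed character set are assumptions for illustration, not rules taken from the Allas documentation.

```python
import re


def make_bucket_name(project_number: str, label: str) -> str:
    """Build a bucket name of the form <project number>-<label>.

    Hypothetical convention: lowercase letters, digits and hyphens only.
    """
    cleaned = re.sub(r"[^a-z0-9-]+", "-", label.lower()).strip("-")
    if not cleaned:
        raise ValueError("label has no usable characters")
    return f"{project_number}-{cleaned}"


# Example with a made-up project number.
name = make_bucket_name("2001234", "Raw Data 2024")
print(name)
```

A name built this way is still not guaranteed to be free, so the final check is always whether bucket creation succeeds.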
Allas objects
Data is stored as objects within a bucket
Objects can contain any type of data (generally, object == file)
Objects have metadata that can be enriched
In Allas, you can have 500 000 objects per bucket
There is only one level of hierarchy of buckets (no buckets within buckets)
There is no hierarchical directory structure, although it sometimes looks like that
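The apparent directory structure comes from slashes in object names: a bucket is a flat list of names, and tools group them by prefix. The sketch below imitates that grouping on a plain Python list. The function `list_level` and the example object names are hypothetical and do not call any Allas client.

```python
def list_level(object_names, prefix="", delimiter="/"):
    """Return (objects, pseudo_folders) found directly under a name prefix."""
    objects, folders = [], set()
    for name in object_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if delimiter in rest:
            # Deeper names are collapsed into one pseudo-folder entry.
            folders.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(name)
    return sorted(objects), sorted(folders)


# A flat list of object names, as they could exist in one bucket.
names = [
    "results/run1/output.dat",
    "results/run1/log.txt",
    "results/run2/output.dat",
    "readme.txt",
]

top_objects, top_folders = list_level(names)
run1_objects, run1_folders = list_level(names, prefix="results/run1/")
```

Here `results/` is not an object or a directory; it exists only because some object names start with it.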
Allas supports two protocols
S3 (used by s3cmd, rclone, a-tools)
Swift (used by swift, rclone, a-tools, Cyberduck)
Authentication and file handling differ between the protocols
Avoid cross-using Swift and S3-based objects!
Allas clients
Puhti, Mahti, Linux servers, Mac: rclone, swift, s3cmd, a-tools
Laptops (Windows, Mac):
Virtual machines, small servers:
In addition to the tools above, you can use FUSE-based virtual mounts
Allas -- first steps
Use MyCSC to apply for Allas access for your project -- Allas is not automatically available
On Puhti/Mahti, set up the connection to Allas using the commands:
Study the manual and start using Allas with rclone or a-tools
This course also includes hands-on tutorials and a tutorial video about Allas
Allas -- rclone
Straightforward power-user tool with a wide range of features
Fast and efficient
Available for Linux, Mac and Windows
Overwrites and removes data without asking!
Use with care: rclone instructions at Docs CSC
Allas -- a-tools
a-tools provide an easy and safe way to use Allas for occasional users
Default bucket names are based on directories on Puhti/Mahti
Unlike rclone, a-tools do not overwrite or remove data without asking!
Developed for the CSC supercomputers, but you can install the tools on other Linux and Mac machines as well
Automatic packing (compression can be enabled as well if needed)
Issues with Allas
8-hour connection limit with swift
No way to check quota
Moving data inside Allas is not possible (swift)
No way to freeze data
Use two projects if you need to prevent others from editing your data
Different interfaces may work in different ways
Questions that users should consider
Should I store each file as a separate object, or should I collect them into bigger chunks?
In general: consider how you use the data
Should I use compression?
Who can use the data: projects and access rights?
What will happen to my data later on?
How to keep track of all the data I have in Allas?
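For the first two questions, one option is to pack a directory of many small files into a single archive before uploading it as one object. The sketch below uses Python's standard `tarfile` module on temporary test data; the helper `pack_directory` and the file names are hypothetical, and the upload step itself is left out.

```python
import tarfile
import tempfile
from pathlib import Path


def pack_directory(src_dir: Path, archive_path: Path, compress: bool = False) -> int:
    """Pack all files under src_dir into one tar archive; return the file count."""
    mode = "w:gz" if compress else "w"
    files = [p for p in sorted(src_dir.rglob("*")) if p.is_file()]
    with tarfile.open(archive_path, mode) as tar:
        for path in files:
            # Store paths relative to src_dir so the archive unpacks cleanly.
            tar.add(path, arcname=str(path.relative_to(src_dir)))
    return len(files)


with tempfile.TemporaryDirectory() as tmp:
    # Create 100 small example files (made-up test data).
    src = Path(tmp) / "many_small_files"
    src.mkdir()
    for i in range(100):
        (src / f"sample_{i:03d}.txt").write_text(f"measurement {i}\n" * 50)

    archive = Path(tmp) / "many_small_files.tar.gz"
    packed = pack_directory(src, archive, compress=True)

    raw_bytes = sum(p.stat().st_size for p in src.iterdir())
    archive_bytes = archive.stat().st_size
    with tarfile.open(archive) as tar:
        member_count = len(tar.getnames())

print(f"{packed} files, {raw_bytes} bytes raw, {archive_bytes} bytes packed")
```

The trade-off is access granularity: one archive counts as a single object against the per-bucket object limit, but retrieving any one file later means downloading and unpacking the whole archive.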
Bonus: Sensitive data services
CSC Sensitive Data Services for processing sensitive data
SD Desktop is a secure virtual desktop
Controlled access
Data importing only through the SD Connect service
Isolation from the Internet
No direct data export
Allas can be used for sensitive data, but only if the data is properly encrypted!
The SD Connect procedure does the encryption
Bonus: Fairdata services
https://www.fairdata.fi -- Services to manage scientific data according to FAIR principles
Suitable for all static digital research material and related metadata
Free of charge for users in Finnish higher education institutions and research institutes
IDA: storage for research data
Qvain: Describe your dataset and get a persistent identifier for it
Etsin: Discover datasets based on metadata