
File systems and file transfer

Introduction

Warning

There are no backups of our file systems at this point. You have to back up your data yourself.

Please store only the data you need for your current work on the SC file systems. The clusters and their file systems are resources shared by all users.

Please delete the temporary local files you created during a computation when your job finishes. You can use your job script or an epilog script to implement a clean-up procedure, as sketched below.
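
A minimal sketch of such a clean-up step in a Slurm batch script is shown below; the path /lscratch/$SLURM_JOB_ID is only an assumed location for your temporary files, adapt it to wherever your job actually writes them.

#!/bin/bash
#SBATCH --job-name=cleanup-example

# Assumed per-job temporary directory on the node-local scratch.
TMP_JOB_DIR=/lscratch/$SLURM_JOB_ID
mkdir -p "$TMP_JOB_DIR"

# Remove the temporary directory when the script exits, even if the job fails.
trap 'rm -rf "$TMP_JOB_DIR"' EXIT

# ... your computation, writing temporary files to $TMP_JOB_DIR ...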

Quotas

We are now enforcing quotas on the file systems. You have 100 GB in your /home directory and up to 1 TB in your /work directory when you apply for a workspace. Please clean up your files and directories regularly.
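
To see how much space you are currently using, you can check the size of your directories with du; on the Lustre-based /work file system (see the striping section below) lfs quota can report usage as well. Both commands are a sketch and assume the standard tools and user quotas are in place:

du -sh ~                          # total size of your home directory
lfs quota -h -u $USER /work       # quota report for /work (if user quotas are enabled)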

File systems

On our cluster we have three storage tiers with different levels of performance, capacity, and accessibility. These tiers are designed to balance speed and cost and to accommodate different types of data and workloads.

Home directory

The Home directory ~ or /home/sc.uni-leipzig.de/<user>

This is a user's personal storage space where they can store their files, scripts, and configuration files.
It is accessible from any node on the cluster.

Work directory

The work directory /work is intended to be used with workspaces.

This is a directory where you as a user can store the large data files needed for your calculations.
It is accessible from any node on the cluster, which makes it a good place for large libraries of files that have to be shared with a group of users.

Striping

Our work file system is striped over multiple storage targets, each holding only part of the total capacity. Every file is striped over a certain number of these targets. Once a file grows beyond roughly 200 MB, it makes sense to spread it over more targets to increase read and write performance. Since each target is much smaller than the overall capacity, it is essential to stripe your very large files so that their chunks are distributed and do not fill up a single storage target!

To activate striping over 4 storage targets for your workspace (e.g. test-my-ws), execute the following command:

lfs setstripe -c 4 /work/test-my-ws
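
To check the current stripe settings of your workspace (or of an individual file), you can use lfs getstripe, e.g.:

lfs getstripe -d /work/test-my-ws

The -d option prints only the default layout of the directory instead of listing every file inside it.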

Warning

Beware that this is a specialized parallel file system optimized for large file access! Reading and writing many small files may not yield the expected performance and may even slow down the system for other users. If you have many small files, please consider splitting them across multiple folders (rule of thumb: amount_of_folders = sqrt(amount_of_files)), as sketched below.
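
As an illustration of this rule of thumb, the following sketch distributes the files of a flat directory into roughly sqrt(N) subfolders; the directory /work/test-my-ws/many_files and the folder name pattern dir_000, dir_001, ... are only examples.

cd /work/test-my-ws/many_files                          # example directory containing many small files
n=$(ls -1 | wc -l)                                      # rough count of the files
buckets=$(awk -v n="$n" 'BEGIN { printf "%d", sqrt(n) + 1 }')
i=0
for f in *; do
    [ -f "$f" ] || continue                             # skip anything that is not a regular file
    d=$(printf 'dir_%03d' $(( i % buckets )))
    mkdir -p "$d"
    mv "$f" "$d/"
    i=$(( i + 1 ))
done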

Local scratch directory

The local scratch /lscratch

The local scratch directory on our clusters is a fast storage space located on each individual compute node. It should be used to store temporary data generated during computation and is optimized for fast read and write access. However, it is small in size, not backed up, and should only be used for temporary data. Users are advised to move important data to the shared work directory or home directory.
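
A typical batch job therefore stages its input to the local scratch, computes there, and copies the results back before it finishes. The sketch below assumes a Slurm job, an input file input.dat, and the workspace /work/myuser-exampleWorkspace used in the copy examples further down; adapt the names to your setup.

#!/bin/bash
#SBATCH --job-name=lscratch-example

SCRATCH=/lscratch/$SLURM_JOB_ID                          # assumed per-job directory on the local scratch
mkdir -p "$SCRATCH"

cp /work/myuser-exampleWorkspace/input.dat "$SCRATCH"/   # stage the input data onto the node
cd "$SCRATCH"

# ... run your computation here, writing temporary and output files to $SCRATCH ...

cp results.dat /work/myuser-exampleWorkspace/            # copy the results back to the shared work file system
rm -rf "$SCRATCH"                                        # clean up the local scratch before the job ends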

Accessing the local scratch directory outside of a job

The local scratch directory is located on each compute node and is not shared between nodes. Since SSH access is only possible to the login nodes, you need to use the job scheduler to access the local scratch directory on the compute nodes.

To access the local scratch directory of a specific node, you can use the following command:

salloc -p <partition name> -w <specific node>

e.g.

salloc -p paula -w paula11
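
Once the allocation has been granted, srun executes commands on the allocated node, so you can inspect or remove your data there, for example (the directory name old-temp-data is hypothetical):

srun ls -lh /lscratch
srun rm -rf /lscratch/old-temp-data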

URZ Home Directory (Samba Share)

To mount your URZ home directory, execute the following command:

gio mount "smb://dom;<university account name>@domhomefs01.dom.uni-leipzig.de/homes/<first letter of university account>/<university account name>"

You will be asked to enter your university account password. After that, the share is mounted at /run/user/<your id>/gvfs/smb-share:domain=dom,server=domhomefs01.dom.uni-leipzig.de,share=homes,user=<university account name>/<first letter of university account>/<university account name>.

You can obtain your id via the command id.

For easier access to your data, you can create a symbolic link (e.g. my_urz_home) in your home directory that points to the mount:

ln -s /run/user/<your id>/gvfs/smb-share:domain=dom,server=domhomefs01.dom.uni-leipzig.de,share=homes,user=<university account name>/<first letter of university account>/<university account name> ~/my_urz_home
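
When you no longer need the share, you can unmount it again with gio. The exact location string has to match the mounted location, which you can list first; the commands below are a sketch using the same server and share as above:

gio mount -l          # list currently mounted locations
gio mount -u "smb://dom;<university account name>@domhomefs01.dom.uni-leipzig.de/homes/<first letter of university account>/<university account name>"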

Warning

The mounted directory will only be accessible on the node where you mounted it. If you want to access it on another node you have to mount it again on that node. You can also incorporate the mounting command in your job scripts.

Copy data from and to the SC systems

The easiest way to copy data to these file systems is the secure copy command line tool scp.

Warning

Please use the dedicated file copy nodes export01 and export02 to copy large amounts of data to and from the SC file systems. File operations (cp, mv, rsync, rm, scp, wget, curl...) on the login nodes will be automatically terminated if they exceed 10 minutes and forcefully stopped after 20 minutes. We encourage you to use export01 and export02 for all file management tasks to avoid any disruptions on the login nodes.

The following commands assume you are copying from your local workstation to the SC clusters.

To copy a single file to your workspace called myuser-exampleWorkspace, use

scp $source_file user@export01.sc.uni-leipzig.de:/work/myuser-exampleWorkspace

Complete directories can be copied recursively by adding the -r switch, e.g.

scp -r $source_dir user@export01.sc.uni-leipzig.de:/work/myuser-exampleWorkspace/

If you want to copy files or directories to your home directory, use

scp -r $source_dir user@export01.sc.uni-leipzig.de:~
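
For large or repeated transfers, rsync over the export nodes is often more convenient than scp, because it can resume interrupted transfers and only copies files that have changed. A sketch using the same workspace as above:

rsync -avP $source_dir user@export01.sc.uni-leipzig.de:/work/myuser-exampleWorkspace/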

Tip

Notice here that your home directory can be abbreviated to just a single tilde ~ character.
Let us assume your SC username is ab123defg.

~ would automatically be interpreted as /home/sc.uni-leipzig.de/ab123defg.

If you have your data on a Windows machine, you can use WinSCP.