# GPFS File Systems in the Jülich Environment

All user-accessible file systems on the supercomputer systems (e.g. JUWELS, JURECA), the community cluster systems (e.g. JUAMS, JUZEA-1), and the Data Access System (JUDAC) are provided via Multi-Cluster GPFS from the HPC file server JUST.

The storage locations assigned to each user in the system environment are encapsulated in shell environment variables (see table). The user's directory in each file system is shared across all systems to which the user has been granted access. It is recommended to organize the data in architecture-specific subdirectories.

The following file systems are available (on login and/or compute nodes):

| File System | Usable Space | Description | Backup | HPC System Access |
|---|---|---|---|---|
| `$HOME` | 2.8 TB | Full path to the user's home directory inside GPFS; for personal data (SSH keys, ...) | TSM (to tape) | Login + Compute |
| `$SCRATCH` | 9.1 PB | Full path to the compute project's standard scratch directory inside GPFS; temporary storage location for applications with large size and I/O demands; data are deleted automatically (files 90 days after the last modification or access, empty directories after 3 days) | none | |
| `$PROJECT` | 2.3 PB | Full path to the compute project's standard directory inside GPFS; for source code, binaries, libraries, and applications | TSM (to tape) | Login + Compute |
| `$FASTDATA` | 9.1 PB | Full path to the limited-availability data project directory inside GPFS; storage for large projects in collaboration with JSC; sufficient reasoning required | | |
| `$DATA` | 14 PB | Full path to the data project's standard directory inside GPFS; large capacity for storing and sharing data | snapshots | Login |
| `$ARCHIVE` | 1.9 PB | Full path to the data project's archive directory inside GPFS; storage for all files not in use for a longer time; data are migrated to tape storage by TSM-HSM | | |

All variables are set during the login process by /etc/profile. It is highly recommended to always access files through these variables.
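As a minimal sketch of this recommendation: address storage through the environment variables rather than hard-coded mount paths, and create one subdirectory per system architecture. The fallback path and the subdirectory names `juwels` and `jureca` below are illustrative examples, not a mandated convention.

```shell
# $SCRATCH is set by /etc/profile at login on the Jülich systems;
# the fallback path here only lets the sketch run elsewhere.
SCRATCH="${SCRATCH:-/tmp/scratch-demo}"

# One subdirectory per system architecture (example names).
mkdir -p "$SCRATCH/juwels" "$SCRATCH/jureca"

# Stage a job's working data under the architecture-specific directory.
workdir="$SCRATCH/juwels/run-$$"
mkdir -p "$workdir"
echo "working directory: $workdir"
```

Using the variables keeps scripts portable across the systems, since the underlying mount points may differ or change over time.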

Details about the different file systems can be found in
What file system to use for different data?

Details on naming conventions and access right rules for FZJ file systems are given in
HPC Data Rules for GPFS.

File system resources will be controlled by quota policy for each group/project. For more information see
What data quotas do exist and how to list usage?

### Good practice notes (use tar for lots of small files, don't rename directories)

• Avoid large numbers of small files
Large collections of small files should be reorganized into tar archives to avoid long access times caused by inefficient handling of many small files in the underlying file system.
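A short sketch of this practice: pack a directory tree of small files into a single tar archive before placing it on GPFS, since one large file is far cheaper for the file system to handle than thousands of tiny ones. The directory and file names below are illustrative.

```shell
# Create a few small example files (illustrative names).
mkdir -p results
for i in 1 2 3; do echo "sample $i" > "results/part-$i.txt"; done

# Pack them into one compressed archive ...
tar -czf results.tar.gz results/

# ... and list the archive contents to verify the pack.
tar -tzf results.tar.gz
```

Unpack later with `tar -xzf results.tar.gz`; only the single archive file occupies an entry in the file system's metadata until then.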

• Avoid renaming directories
In all file systems offering a backup (i.e. excluding $SCRATCH), directories in the data path should be renamed only with care, because all data below a renamed directory must be backed up again. If a large amount of data is affected, this can delay the backup of genuinely new data in the entire file system and/or consume precious system resources such as CPU time and storage capacity.