Managing Huge Data Sets

With the constantly increasing performance of modern processors and network technologies, the size of processed data sets rapidly grows.

Fraunhofer Parallel File System - BeeGFS

BeeGFS (also known as Fraunhofer Parallel Filesystem) is a parallel cluster file system, developed with a strong focus on performance and designed for very easy installation and management.

To handle this huge amount of data and deliver it to the computing cores as fast as possible, the CC HPC has been working on the parallel file system BeeGFS for several years. The file system distributes individual files across multiple servers chunk by chunk, so that they can be read or written in parallel.
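The chunk-by-chunk distribution can be illustrated with a minimal sketch. It assumes a simple round-robin assignment of fixed-size chunks to servers; the function name, parameters, and the round-robin policy are hypothetical choices for illustration and are not taken from the BeeGFS implementation.

```python
def chunk_layout(file_size, chunk_size, num_servers):
    """Map a file onto servers chunk by chunk (illustrative only).

    Returns a list of (chunk_index, server_index, offset, length) tuples.
    Assumption: chunks are assigned round-robin; a real parallel file
    system may use a different placement policy.
    """
    if chunk_size <= 0 or num_servers <= 0:
        raise ValueError("chunk_size and num_servers must be positive")

    layout = []
    offset = 0
    index = 0
    while offset < file_size:
        # The last chunk may be shorter than chunk_size.
        length = min(chunk_size, file_size - offset)
        layout.append((index, index % num_servers, offset, length))
        offset += length
        index += 1
    return layout


if __name__ == "__main__":
    # A 10-byte file, 4-byte chunks, 3 servers.
    for chunk in chunk_layout(10, 4, 3):
        print(chunk)
```

Because consecutive chunks land on different servers in this sketch, a client reading a large contiguous range can request several chunks at the same time, which is the source of the parallelism described above.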

By increasing the number of servers and disks in the system, users can easily scale the performance and capacity of the file system to the level they need, seamlessly from small clusters up to enterprise-class systems with thousands of nodes.

As a result, BeeGFS is used on diverse computer clusters, ranging from installations with only a few machines to several systems on the Top500 list of the world's fastest supercomputers. The file system is also a fundamental component of many research projects led by various research organizations and governmental institutions.

BeeGFS is open source software and free of charge. Commercial support is optionally available.

European HPC Projects for Exascale Computing

BeeGFS is an important part of three European HPC projects that develop specialized computer architectures for the exascale regime: DEEP-ER, ExaNoDe and ExaNeSt.

For the DEEP-ER project, which proposes a cluster-booster architecture, we have extended our parallel file system BeeGFS to use the different hierarchy levels of the storage system efficiently. The ExaNoDe and ExaNeSt projects, which have just started, will use energy-efficient processors and nanotechnologies together with a system-wide, uniform memory concept. Many of the presented ideas will be used for the first time in high performance computing, and it is exciting to observe how these ideas will influence the direction of HPC research.

To incorporate our extensive knowledge into the European strategic research agenda, we collaborate with leading HPC experts within the EXDCI project and contribute significantly to the decisions on the scientific program of the EC. Exciting times lie ahead in the race to the first exascale computer cluster.

Example Projects

ExaNeSt

As our world grows more complex every day, the next generation of supercomputers must provide IT solutions that are capable of high-volume calculations. So-called Exascale computers have the capability to transform our understanding of the world through advanced simulation and problem solving. They can help to move, process and manage unprecedented volumes of data in many areas of our lives, including climate change, drug design, energy safety, national security, material science and medicine.

The European Consortium, funded by the Horizon 2020 initiative of the EU, plays a substantial role in the development process of the new supercomputer. It consists of twelve partners, each of which contributes knowledge in a technology needed for innovation to reach Exascale.

DEEP and DEEP-ER

In the predecessor project DEEP, an innovative architecture for heterogeneous HPC systems was developed, based on the combination of a standard HPC Cluster and a tightly connected HPC Booster built of many-core processors.

DEEP-ER now evolves this architecture to address two significant Exascale computing challenges: highly scalable and efficient parallel I/O, and system resiliency. Co-design is the key to tackling these challenges: new hardware and software components are developed in a thoroughly integrated way and fine-tuned with actual HPC applications in mind.

Based on BeeGFS, we adapt the parallel I/O to the DEEP-ER architecture.