Styx – GPU Cluster at Fraunhofer ITWM

The High Performance GPU cluster »Styx« is one of a kind at the institute. Instead of upgrading an existing CPU-based cluster with GPUs, we took a completely different approach with »Styx«: from the very beginning, our goal was to design a cluster focused on the effective use and seamless integration of novel accelerator cards into the user's workflow.

To bring this idea to life, we use »CARME«, an in-house development, to access the cluster.


All employees, PhD students, and students at the Fraunhofer ITWM, as well as all members of the High Performance Center Simulation and Software-Based Innovation (Leistungszentrum Simulations- und Software-basierte Innovation), are eligible for access. In addition, access can be granted to researchers outside this group in the course of projects or research activities involving one of the above-mentioned institutions. To apply for an account, please get in touch with the persons listed under »Contact«.

Access to the cluster is provided through the web-based interface of »CARME«.


All in all, our users have more than 100 GPUs and several hundred terabytes of storage at their fingertips. The GPUs are distributed across a total of 32 physical compute nodes, which are connected to the distributed file system (BeeGFS) via a fast InfiniBand interconnect. Additionally, each compute node has a local SSD that can be used during a job's runtime.
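A common pattern for such a node-local SSD is to stage input data onto it at job start, compute against the fast local copy, and copy results back to the distributed file system before the job ends. The sketch below illustrates the idea; the actual scratch path on Styx is not documented here, so temporary directories stand in for both the SSD and the BeeGFS home directory:

```shell
# Sketch of a staging pattern for the node-local SSD. The real scratch path
# on Styx is an assumption; mktemp -d merely stands in for it here.
LOCAL_SCRATCH=$(mktemp -d)   # stands in for the node-local SSD scratch directory
WORKDIR=$(mktemp -d)         # stands in for the user's directory on BeeGFS

echo "input data" > "$WORKDIR/input.txt"

# Stage input onto the fast local disk, compute there, then copy results
# back to the distributed file system before the job ends.
cp "$WORKDIR/input.txt" "$LOCAL_SCRATCH/"
tr 'a-z' 'A-Z' < "$LOCAL_SCRATCH/input.txt" > "$LOCAL_SCRATCH/result.txt"
cp "$LOCAL_SCRATCH/result.txt" "$WORKDIR/"

result=$(cat "$WORKDIR/result.txt")
echo "$result"               # prints: INPUT DATA

rm -rf "$LOCAL_SCRATCH"      # local scratch does not persist beyond the job
```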


Currently, the following compute node types are available to users:


Type       Amount   GPUs per Node           GPU Memory   CPU                      CPU Cores   Main Memory
–          16       2× NVIDIA GTX 1080 Ti   11 GB        AMD Threadripper 1920X   24          64 GB
cerberus   16       4× NVIDIA Titan V       12 GB        Intel Xeon 4108          32          192 GB
erebos     3        4× NVIDIA A100          40 GB        AMD EPYC 7402            96          512 GB


Both access to and use of these resources are managed by a batch system with different queues. The queues are project- or account-specific and can therefore differ between users.
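The batch system is not named above; assuming a Slurm-style scheduler, a job script could look like the following sketch. The partition name, GPU count, and script name are placeholders, since the actual queues on Styx are project- or account-specific:

```shell
#!/bin/bash
#SBATCH --job-name=demo-job        # placeholder job name
#SBATCH --partition=erebos         # assumed queue name; real queues differ per project/account
#SBATCH --gres=gpu:2               # request two of the node's GPUs
#SBATCH --time=04:00:00            # wall-clock time limit

srun python train.py               # train.py is a placeholder for the actual workload
```

Such a script would be submitted with `sbatch`; the scheduler then allocates a matching node and releases it when the job ends.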


As already mentioned, »CARME« forms the software basis of the cluster. The software required for projects is made available to users as a combination of Singularity containers and Anaconda environments. For further information and support on installing software and on existing restrictions, please get in touch with the persons listed under »Contact«.
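As a rough sketch of what working with this combination can look like on the command line (image and environment names are placeholders, and on Styx much of this is wrapped by »CARME« itself):

```shell
# Run a workload inside a Singularity container; --nv makes the host's
# NVIDIA GPUs and driver available inside the container.
# pytorch.sif is a placeholder image name, not an actual image on Styx.
singularity exec --nv pytorch.sif python train.py

# A project-specific Anaconda environment (my-env is a placeholder)
# can be activated as usual, inside or outside a container:
conda activate my-env
python train.py
```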