Scalable Parallel Programming

Starting at single-core performance, the group develops methods and tools for the efficient use of all available hardware resources, up to the level of the complete supercomputer.



We follow a holistic approach to software optimization. Our aim is to combine a comprehensive understanding of methods, algorithms and their implementation with deep knowledge of the underlying architecture and of the capabilities of the tools, in order to deliver the best possible performance.



GPI-2 is the communication library of first choice for highly scalable applications. It allows truly asynchronous, parallel communication from all threads and achieves optimal overlap of communication with computation. Fast and in part cost-free notifications of remote components, together with a well-defined system state in case of a failure, make GPI-2 the world-leading communication library.
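The idea of overlapping communication with computation through notifications can be illustrated with a small, self-contained sketch. It does not use the GPI-2 API; the `Segment` class, its `write_notify` and `wait_notification` methods, and `exchange_and_compute` are hypothetical stand-ins built on Python threads, assuming a one-sided write that sets a notification flag once the payload has arrived.

```python
import threading


class Segment:
    """Hypothetical stand-in for a remotely writable memory segment."""

    def __init__(self, size, num_notifications):
        self.data = [0.0] * size
        self._flags = [threading.Event() for _ in range(num_notifications)]

    def write_notify(self, offset, values, notification_id):
        # One-sided write: the payload lands first, then the flag is set,
        # so a receiver that sees the flag can safely read the data.
        self.data[offset:offset + len(values)] = values
        self._flags[notification_id].set()

    def wait_notification(self, notification_id, timeout=None):
        # Block until the notification arrives, then reset it for reuse.
        arrived = self._flags[notification_id].wait(timeout)
        if arrived:
            self._flags[notification_id].clear()
        return arrived


def exchange_and_compute(local, halo_from_neighbour):
    segment = Segment(size=len(halo_from_neighbour), num_notifications=1)

    # The "remote" side posts the halo asynchronously from another thread.
    sender = threading.Thread(
        target=segment.write_notify, args=(0, halo_from_neighbour, 0))
    sender.start()

    # Overlap: work that does not depend on the halo starts immediately.
    interior_sum = sum(x * x for x in local)

    # Only the halo-dependent part waits for the notification.
    if not segment.wait_notification(0, timeout=5.0):
        raise TimeoutError("halo notification did not arrive")
    boundary_sum = sum(segment.data)

    sender.join()
    return interior_sum + boundary_sum


result = exchange_and_compute([1.0, 2.0, 3.0], [10.0, 20.0])
print(result)
```

The receiver never waits on a global barrier; it waits only for the one notification that guards the data it is about to read, which is what leaves room for the interior computation to run in the meantime.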


Numerical Solver

GaspiLS is a numerical solver library built entirely on the principles of the GPI-2 programming model. As such, it is designed for optimal scalability.


GaspiCxx allows for a prompt and easy development of new applications and/or porting of existing applications to GPI-2. For example, using GaspiCxx, a shared memory parallel TD-DG solver for Maxwell’s equations could be extended to a scalable distributed memory implementation within an afternoon.



GPI-Space abstracts away the complexity of large machines without compromising efficiency. It is based on three components: a generic, fault-tolerant and scalable distributed runtime system that operates on a dynamic set of resources, a Petri-net-based workflow engine, and a scalable virtual memory layer. Together, these allow GPI-Space to serve as the foundation for domain-specific development and runtime systems.
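To make the notion of a Petri-net-based workflow concrete, here is a minimal sketch of the underlying token game. It is not GPI-Space's engine or its workflow description format; the `PetriNet` class and the place and transition names (`chunk`, `worker`, `partial`, `result`, `process`, `reduce`) are hypothetical and chosen only for illustration.

```python
from collections import Counter


class PetriNet:
    """Toy Petri net: places hold tokens, transitions consume and produce them."""

    def __init__(self, marking):
        self.marking = Counter(marking)
        self.transitions = {}

    def add_transition(self, name, consumes, produces):
        self.transitions[name] = (Counter(consumes), Counter(produces))

    def enabled(self):
        # A transition is enabled when every input place holds enough tokens.
        return sorted(
            name for name, (consumes, _) in self.transitions.items()
            if all(self.marking[place] >= n for place, n in consumes.items()))

    def fire(self, name):
        if name not in self.enabled():
            raise ValueError(f"transition {name!r} is not enabled")
        consumes, produces = self.transitions[name]
        self.marking.subtract(consumes)
        self.marking.update(produces)

    def run(self, max_steps=1000):
        # Fire enabled transitions until none is left (or the step limit hits).
        fired = []
        for _ in range(max_steps):
            ready = self.enabled()
            if not ready:
                break
            self.fire(ready[0])
            fired.append(ready[0])
        return fired


# Two data chunks, one worker: process each chunk, then reduce the partials.
net = PetriNet({"chunk": 2, "worker": 1})
net.add_transition("process",
                   consumes={"chunk": 1, "worker": 1},
                   produces={"partial": 1, "worker": 1})
net.add_transition("reduce", consumes={"partial": 2}, produces={"result": 1})

print(net.run())
print(net.marking["result"])
```

In this toy model the `worker` token acts as a resource: adding more worker tokens would enable several `process` firings at once, which is the property a distributed runtime can exploit for parallel scheduling.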

Example Projects



Part of SAP's software portfolio is the variant configurator, which is used in countless companies all over the world. In our project we developed essential parts of the underlying library.



Further information on the SafeClouds project is available on our project page »Distributed Infrastructure for Data Analysis in Aviation«.

BMBF Project


The numerical solution of partial differential equations is used in many areas of science to predict the behavior of complex systems. One example is the prediction of wear in the human knee joint, where bones, muscles and ligaments interact with each other.

In the HighPerMeshes project, led by the Paderborn Center for Parallel Computing at the University of Paderborn, we are jointly developing simulation methods and the corresponding software to investigate such processes. We contribute our expertise in the development and application of new software tools such as GPI-2.


Project EPEEC

Parallel Programming Environment

The European project EPEEC aims to develop and deploy a production-ready parallel programming environment. The goal is to turn future exascale supercomputers into manageable platforms for application developers in various fields.


Project EPiGRAM-HS

Heterogeneous Supercomputing

The EU project »Exascale Programming Models for Heterogeneous Systems«, funded for three years, extends the programmability of large-scale heterogeneous systems with GPUs, FPGAs, HBM and NVM. It develops new concepts and functions and integrates them into the HPC programming systems for scalable supercomputers.

Project EuroEXA

Template for Future Exascale System

In the EuroEXA project, we are working with 15 partners on a template for a future exascale system by co-developing and implementing a petascale-level prototype with groundbreaking features. To achieve this goal, the project follows an approach that is both cross-technology and innovative in the areas of application and system software.


Our experts port the parallel programming API GPI and the parallel file system BeeGFS to this novel computing architecture and bring the seismic imaging application RTM to FPGAs.




European Processor Initiative

The European Processor Initiative (EPI) brings together 23 partners from 10 European countries with the aim of bringing a low-power microprocessor to market.

Concluded Projects

ExaNoDe – Exascale Computing

Together with 13 European partners we are involved in the project ExaNoDe (European Exascale Processor Memory Node Design). This project will investigate, develop, integrate and pilot the building blocks for a highly efficient, highly integrated, multi-way, high-performance, heterogeneous compute element aimed towards Exascale computing. 


We develop Fraunhofer GASPI/GPI, an open-source communication library for communication between compute nodes. The API has been designed with a variety of possible memory spaces in mind. GASPI/GPI provides configurable memory segments that are intended to map the hardware configuration and make it available to the application.

-> More on ExaNoDe

INTERWinE – Exascale Modeling and Implementation

This project addresses the problem of programming model design and implementation for the Exascale. The first Exascale computers will be very highly parallel systems, consisting of a hierarchy of architectural levels. To program such systems effectively and portably, programming APIs with efficient and robust implementations must be ready in the appropriate timescale.

We introduce the programming model GASPI and its implementation GPI into the project and evaluate the interoperability requirements with a number of applications such as the Computational Fluid Dynamics Code TAU of DLR (German Aerospace) for aerodynamic simulation.

-> More on INTERWinE

EPiGRAM – Programming Models for Exascale Systems

The EPiGRAM project, which concluded in 2016, worked on programming models for exascale systems. Exascale computing power is likely to be reached in the next decade. While the precise system architectures are still evolving, one can safely assume that they will be largely based on deep hierarchies of multicore CPUs with similarly deep memory hierarchies, and likely also supported by accelerators.

Appropriate programming models are needed to allow applications to run efficiently at large scale on these platforms. Message passing (MP) with MPI has emerged as the de facto standard for parallel programming on current Petascale machines, but Partitioned Global Address Space (PGAS) languages and libraries are increasingly being considered as alternatives or complements to MPI. These models will likely also play an important role in Exascale systems. However, both approaches have problems that will prevent them from reaching Exascale performance. In the EPiGRAM project, we addressed some of the main limitations of the MP and PGAS programming models by investigating new disruptive concepts and algorithms in Exascale programming models, providing prototypical implementations, and validating our approach with three real-world applications that have the potential for reaching Exascale performance.

The GASPI programming model with its GPI implementation, which is developed and maintained by Fraunhofer ITWM, is a PGAS representative. EPiGRAM made GASPI and GPI-2 more complete and robust by closing gaps that prevented them from scaling to Exascale. Within EPiGRAM, strong scaling of the RTM application on up to 60k cores on SuperMUC was shown to be almost linear.

EXA2CT – Creation of Exascale Codes

The EXA2CT project, which concluded in 2016, brought together experts at the cutting edge of the development of solvers, related algorithmic techniques, and HPC software architects for programming models and communication.

Within the project, modular open-source proto-applications were developed that demonstrate the algorithms and programming techniques developed in the project and help bootstrap the creation of genuine Exascale codes. Technologies developed in EXA2CT range from advanced libraries such as ExaShark and GASPI/GPI, which help to program massively parallel machines, to solver algorithms improved by better overlapping of communication and computation and by increased arithmetic intensity. All of this is verified on industry-relevant prototype applications.

Tom van der Aa (ExaScience Lifelab at imec, Belgium), coordinator of the EXA2CT project, states that the project was able to show that Fraunhofer ITWM’s programming API GPI-2 can genuinely outperform MPI.