GaspiLS – Scalability for CFD and FEM Simulations

GaspiLS is a scalable, industry-proven linear solver library for the exascale age. Many engineering simulations are based on CFD and FEM methods, for example determining the aerodynamic properties of aircraft or analyzing the statics of buildings. A large part of the computation time is spent solving the underlying equations with iterative methods, so the performance of the iterative solvers has a significant impact on the total run time of these simulations. We have developed the linear solver library GaspiLS to gain insights from the simulations faster.


Industry Uses GaspiLS for Better Scalability

Scalability measures the parallel efficiency of an implementation. The optimum is so-called linear scalability: doubling the compute resources halves the run time. This corresponds to full utilization of the cores within a single CPU or of the CPUs within a cluster, which are interconnected by a network.
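To make the notion of parallel efficiency concrete, the following minimal sketch computes speedup and efficiency from run times. The function names and the timings are illustrative assumptions, not GaspiLS measurements.

```python
def speedup(t_serial, t_parallel):
    """Speedup S(p) = T(1) / T(p)."""
    return t_serial / t_parallel


def parallel_efficiency(t_serial, t_parallel, workers):
    """Efficiency E(p) = S(p) / p; a value of 1.0 means linear scalability."""
    return speedup(t_serial, t_parallel) / workers


# Hypothetical run times in seconds on 1 and on 8 cores.
print(parallel_efficiency(100.0, 12.5, 8))  # ideal case: run time shrinks by a factor of 8
print(parallel_efficiency(100.0, 20.0, 8))  # overhead leaves part of the cores idle
```

An efficiency below 1.0 indicates that some of the added cores spend time waiting, for example on communication or synchronization.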

GaspiLS's inherent scalability yields the following advantages for simulations:

  • more detailed models
  • more precise parameter studies
  • cost-efficient resource utilization

These aspects make GaspiLS particularly interesting for industry.

Optimal Use of Given Computing Resources

To achieve better scalability, GaspiLS uses parallel programming tools that we develop ourselves: the communication library GPI-2 and its underlying programming model. The algorithm is split into fine-grained sub-problems (so-called tasks) with mutual dependencies. Executable tasks can thus be assigned to free compute resources at any time, which guarantees a continuous stream of compute tasks for every CPU.

In this way, we avoid global synchronization points. Thanks to the large number of generated sub-problems, we can compensate for latencies resulting from the exchange of data and for imbalances in compute time. Every single core is fully utilized at all times.
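The idea of dependency-driven task execution can be sketched in a few lines. The example below is a serial stand-in under stated assumptions: it is not the GaspiLS or GPI-2 runtime, and the task names (pack_halo, local_spmv, and so on) are hypothetical labels for the steps of a distributed sparse matrix-vector product. It only shows that a task becomes executable as soon as its own dependencies are done, without any global barrier.

```python
from collections import deque


def run_tasks(deps, work):
    """Run every task once all tasks it depends on have finished.

    deps maps a task name to the set of task names it depends on.
    Returns the order in which the tasks were executed.
    """
    remaining = {task: len(d) for task, d in deps.items()}
    dependents = {task: [] for task in deps}
    for task, d in deps.items():
        for prerequisite in d:
            dependents[prerequisite].append(task)

    # Tasks without open dependencies are immediately executable.
    ready = deque(task for task, count in remaining.items() if count == 0)
    order = []
    while ready:
        task = ready.popleft()  # a free worker would pick this up
        work(task)
        order.append(task)
        for successor in dependents[task]:
            remaining[successor] -= 1
            if remaining[successor] == 0:
                ready.append(successor)
    return order


# Hypothetical task graph: local work does not wait for the halo exchange.
deps = {
    "pack_halo": set(),
    "local_spmv": set(),
    "send_halo": {"pack_halo"},
    "recv_halo": {"send_halo"},
    "halo_spmv": {"recv_halo", "local_spmv"},
}
order = run_tasks(deps, lambda task: None)
print(order)
```

In a parallel runtime, the ready queue would be served by several workers at once, so local computation can overlap with the data exchange.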

Pressure correction computation
© Fraunhofer ITWM
Pressure correction computation in the PISO method: GaspiLS (green and blue) has significantly improved performance and scalability in comparison to the MPI-based implementation (orange).


Preconditioners are designed to improve the convergence of iterative Krylov subspace solvers: they reduce the number of required iterations and, for some problems, are necessary for the solver to converge at all. Therefore, besides the absolute performance and scalability of the basic solvers provided by GaspiLS, preconditioners are a key component for solving sparse linear systems efficiently.
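The effect of a preconditioner on the iteration count can be illustrated with a small experiment. The following sketch is a textbook preconditioned conjugate gradient method in plain Python, not the GaspiLS implementation; the test matrix, the tolerance, and all names are assumptions chosen for illustration. It solves the same symmetric positive definite system once without preconditioning and once with a simple Jacobi (diagonal) preconditioner.

```python
import math


def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pcg(A, b, apply_minv, tol=1e-10, max_iter=1000):
    """Preconditioned conjugate gradients; returns (solution, iterations)."""
    x = [0.0] * len(b)
    r = list(b)                      # residual for the initial guess x = 0
    z = apply_minv(r)                # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b))
    for k in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, k
        z = apply_minv(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Illustrative SPD matrix: strongly varying diagonal, weak off-diagonal coupling.
n = 50
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[i][i] = 1.0 + i
    if i > 0:
        A[i][i - 1] = A[i - 1][i] = -0.1
b = [1.0] * n
diag = [A[i][i] for i in range(n)]

x_plain, it_plain = pcg(A, b, lambda r: list(r))                       # no preconditioner
x_jac, it_jac = pcg(A, b, lambda r: [ri / di for ri, di in zip(r, diag)])  # Jacobi
print("iterations without preconditioner:", it_plain)
print("iterations with Jacobi preconditioner:", it_jac)
```

For this matrix the diagonal scaling clusters the eigenvalues around 1, which is why the preconditioned run needs noticeably fewer iterations. Real applications typically need stronger preconditioners than Jacobi.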

GaspiLS now provides two classes of black-box preconditioners: Algebraic Multigrid (AMG) for symmetric positive definite problems, as they appear, for example, in CFD applications, and Multi-Level Incomplete LU (Multi-Level ILU) factorization as a generic preconditioner. Both implementations follow a scalable, efficient, hybrid-parallel Gaspi-based approach.

As a multigrid method, AMG has linear computational complexity, which is optimal. It can be applied to symmetric positive definite problems. In AMG, a hierarchy of operators is constructed directly from the linear system, without explicit knowledge of the underlying geometry or partial differential equation.

Multi-Level ILU, on the other hand, is widely used because of its robustness, accuracy, and usability as a black-box preconditioner for general sparse linear systems. ILU consists of LU-based Gaussian elimination combined with dropping, i.e. discarding selected entries of the factors. Depending on the chosen input parameters and the amount of available resources, Multi-Level ILU allows for seamless interpolation between full Gaussian elimination at one end (exact inverse, high resource consumption) and Jacobi preconditioning at the other (low resource consumption, low quality). Our implementation supports row- and column-based permutations in the factorization, which are often needed to solve linear systems with ill-conditioned matrices.
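The interpolation between full Gaussian elimination and Jacobi preconditioning can be illustrated with a simplified, single-level incomplete LU factorization with threshold dropping. This is a sketch under stated assumptions, not the Multi-Level ILU of GaspiLS: it uses dense storage, no permutations or pivoting, and a hypothetical parameter drop_tol that discards off-diagonal entries smaller than drop_tol times the largest entry of the original row.

```python
def ilu_threshold(A, drop_tol):
    """Row-wise Gaussian elimination that drops small off-diagonal entries.

    Returns (L, U) with unit-diagonal L. No pivoting is performed, so the
    sketch assumes that no zero pivot occurs (e.g. diagonally dominant A).
    """
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    U = [[0.0] * n for _ in range(n)]
    for i in range(n):
        row = [float(v) for v in A[i]]
        thresh = drop_tol * max(abs(v) for v in row)
        for k in range(i):
            if abs(row[k]) < thresh:      # drop a small entry of L
                row[k] = 0.0
                continue
            factor = row[k] / U[k][k]
            row[k] = factor
            for j in range(k + 1, n):     # eliminate with row k of U
                row[j] -= factor * U[k][j]
        for j in range(i + 1, n):
            if abs(row[j]) < thresh:      # drop a small entry of U
                row[j] = 0.0
        L[i][:i] = row[:i]
        L[i][i] = 1.0
        U[i][i:] = row[i:]
    return L, U


def nonzeros(M):
    return sum(1 for row in M for v in row if v != 0.0)


# Small diagonally dominant example matrix (illustrative only).
A = [[4.0, -1.0, 0.0, -1.0],
     [-1.0, 4.0, -1.0, 0.0],
     [0.0, -1.0, 4.0, -1.0],
     [-1.0, 0.0, -1.0, 4.0]]

for drop_tol in (0.0, 0.1, 2.0):
    L, U = ilu_threshold(A, drop_tol)
    print(drop_tol, "nonzeros in L and U:", nonzeros(L) + nonzeros(U))
```

With drop_tol = 0.0 nothing is dropped and L times U reproduces A exactly, which is the full Gaussian elimination end. With drop_tol = 2.0 every off-diagonal entry is dropped, so L is the identity and U is the diagonal of A, which corresponds to Jacobi preconditioning. Values in between trade memory and setup cost against preconditioner quality.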

GaspiLS graphic
© Fraunhofer ITWM
Strong scalability of the AMG setup phase for a 3D Poisson problem discretized on a regular 200³ grid using a second-order finite difference scheme. The GaspiLS runtime and its contributions are compared to the performance of the latest Hypre 2.2.21 implementation using the same parameters (HMIS, Extended+i).

GaspiCxx for Increased Productivity

Within GaspiLS, we have factored out the implementation of the explicit management of communication resources required by GPI-2 data transfers and provide it to other applications as GaspiCxx. GaspiCxx defines an easy-to-use C++ interface and delivers the full native GPI-2 performance. At the same time, the management of GPI-2 communication resources is fully transparent to the application. This eliminates a large part of the implementation work normally required to develop GPI-2 applications. Developing GPI-2 applications and exploiting their advantages, such as good scalability, has never been so easy.