GaspiLS – Software Solution for CFD and FEM Simulations

Optimal Solver for Systems of Linear Equations

GaspiLS is our scalable linear solver library, which has already proven itself in industrial settings in the era of exascale computing. Many engineering simulations are based on Computational Fluid Dynamics (CFD) and Finite Element Methods (FEM), for example the determination of the aerodynamic properties of an aircraft or the structural analysis of buildings. A large part of the computational time is spent solving the underlying systems of equations with iterative methods such as Krylov subspace solvers. The performance of these iterative solvers therefore has a large impact on the overall runtime of such simulations. To gain faster insights from these simulations, we have developed the linear solver library GaspiLS.
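The workhorse of such Krylov subspace solvers is the sparse matrix-vector product. As a minimal illustration of the method class (not GaspiLS's API), here is the classic Conjugate Gradient algorithm for symmetric positive definite systems in NumPy:

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A (plain CG sketch)."""
    x = np.zeros_like(b)
    r = b - A @ x              # initial residual
    p = r.copy()               # initial search direction
    rs_old = r @ r
    for _ in range(max_iter):
        Ap = A @ p             # the dominant cost: the (sparse) mat-vec
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

# small SPD test system
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = conjugate_gradient(A, b)
```

In each iteration, essentially one matrix-vector product and a few vector operations are performed, which is why the parallel efficiency of exactly these kernels dominates the solver runtime.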
 

Industry Relies on GaspiLS Due to Better Scalability

Scalability is a measure of the parallel efficiency of an implementation. The optimum is so-called linear scalability, which corresponds to full utilization of the computing resources, i.e. the cores within one or more CPUs connected via a network. Better scalability allows additional computing capacity to be used profitably.
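Scalability is commonly quantified via speedup and parallel efficiency; linear scalability means the efficiency stays at 1.0 as cores are added. A small sketch with made-up timings:

```python
def speedup(t_serial, t_parallel):
    """Ratio of serial runtime to parallel runtime."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, n_cores):
    """Speedup divided by core count; 1.0 corresponds to linear scalability."""
    return speedup(t_serial, t_parallel) / n_cores

# hypothetical timings: 100 s on 1 core, 12.5 s on 10 cores
s = speedup(100.0, 12.5)         # speedup of 8
e = efficiency(100.0, 12.5, 10)  # 80 % parallel efficiency
```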

In practice, this results in the following advantages:

  • more detailed models
  • more accurate parameter studies
  • cost-efficient utilization of hardware resources

These aspects make GaspiLS particularly interesting for industrial companies that use Computer Aided Engineering (CAE) methods in their product development in order to save costs. This includes the manufacturing industry in general, but especially, for example, the automotive industry, mechanical engineering and aerospace engineering.
 

Optimal Use of Given Computing Resources

To achieve good scalability, GaspiLS uses the methods and tools we have developed for parallel programming, including the GPI-2 communication library and its underlying programming model. The algorithm is decomposed at fine granularity into subproblems (so-called tasks) with mutual dependencies. This allows executable tasks to be assigned to free computing resources at any time, guaranteeing a continuous stream of work for each CPU.

Pressure correction computation
© Fraunhofer ITWM
Pressure correction computation in the PISO method: GaspiLS (green and blue) shows significantly better performance and scalability than the MPI-based implementation (orange).

All this happens without global synchronization points; in combination with the large number of generated subproblems, this compensates for imbalances in computation time and for latencies caused by data exchange. Each individual core is utilized as fully as possible at all times.
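The principle of dependency-driven task execution can be sketched with a ready-queue: a task becomes executable once all its prerequisites have finished, so an idle core can always pick up the next ready task. This is only an illustrative model; GaspiLS's actual runtime builds on GPI-2.

```python
from collections import deque

def run_task_graph(tasks, deps):
    """tasks: {name: callable}; deps: {name: set of prerequisite names}.
    Runs each task as soon as all of its dependencies have completed."""
    remaining = {t: set(deps.get(t, ())) for t in tasks}
    ready = deque(t for t, d in remaining.items() if not d)
    order = []
    while ready:
        t = ready.popleft()
        tasks[t]()             # on a real machine: dispatched to a free core
        order.append(t)
        for u, d in remaining.items():
            if t in d:         # t's completion may unblock u
                d.remove(t)
                if not d and u not in order and u not in ready:
                    ready.append(u)
    return order

log = []
tasks = {name: (lambda n=name: log.append(n)) for name in "ABCD"}
deps = {"C": {"A", "B"}, "D": {"C"}}   # C waits for A and B; D waits for C
order = run_task_graph(tasks, deps)
```

A fine-grained decomposition generates many such independent tasks, so there is almost always a ready task to hide communication latency behind.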
 

Optimal Convergence for a Wide Range of Problems

Preconditioners are designed to improve the convergence of iterative Krylov subspace methods. They reduce the number of required iterations or even enable the solution of a given problem in the first place. Preconditioners are therefore an equally central component for GaspiLS, along with the underlying high-performance matrix vector multiplication, to efficiently solve sparse linear systems of equations.

GaspiLS now provides two different classes of black-box preconditioners:

  1. An Algebraic Multigrid Method (AMG) for symmetric positive definite problems, such as those encountered in CFD applications.
  2. A multi-level incomplete LU decomposition (Multi-Level ILU) in combination with different domain decomposition methods for general problems.

Both implementations use the scalable, efficient, hybrid-parallel GASPI-based programming model.

AMG, as a multigrid method, has linear numerical complexity, i.e. the solution effort grows only proportionally to the number of unknowns, which is optimal. It is used for symmetric positive definite problems. A hierarchy of operators is constructed directly from the matrix, without requiring explicit knowledge of the underlying geometry or the partial differential equation.
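The core multigrid idea behind AMG can be shown with a geometric two-grid cycle on a 1D Poisson model problem: a few smoothing sweeps damp the high-frequency error, and a coarse-grid correction removes the smooth remainder. This is a minimal sketch with a hand-built interpolation operator, not GaspiLS's AMG (which constructs the hierarchy algebraically from the matrix):

```python
import numpy as np

def two_grid(A, b, x, P, n_smooth=3, omega=0.6):
    """One two-grid cycle: pre-smooth, coarse-grid correct, post-smooth.
    P interpolates coarse -> fine; restriction is its transpose P.T."""
    D = np.diag(A)
    for _ in range(n_smooth):                 # pre-smoothing (damped Jacobi)
        x = x + omega * (b - A @ x) / D
    A_c = P.T @ A @ P                         # Galerkin coarse-grid operator
    r_c = P.T @ (b - A @ x)                   # restricted residual
    x = x + P @ np.linalg.solve(A_c, r_c)     # coarse-grid correction
    for _ in range(n_smooth):                 # post-smoothing
        x = x + omega * (b - A @ x) / D
    return x

# 1D Poisson model problem on 7 interior points
n = 7
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)

# linear interpolation from 3 coarse points (at fine indices 1, 3, 5)
P = np.array([[0.5, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.5, 0.5, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.5, 0.5],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.5]])

x = np.zeros(n)
for _ in range(20):
    x = two_grid(A, b, x, P)
residual = np.linalg.norm(b - A @ x)
```

Recursing on the coarse problem instead of solving it exactly yields the full multigrid hierarchy whose cost grows only linearly with the number of unknowns.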

Multi-level ILU preconditioners are robust, accurate and, as black-box preconditioners, well suited for general sparse systems of equations. ILU is essentially an incomplete Gaussian elimination in which less important contributions are discarded. Depending on the chosen parameters and the available resources, our multi-level ILU implementation can interpolate between a full Gaussian elimination (exact inverse, high resource requirements) and a simple Jacobi preconditioner (low resource requirements, low approximation accuracy). In combination with the different overlapping and non-overlapping domain decomposition methods available for parallelization, our implementation provides a comprehensive arsenal of methods to construct preconditioners optimized for the problem at hand. Thus, even poorly conditioned problems can be solved efficiently.
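The cheap end of this spectrum, the Jacobi preconditioner, already illustrates what preconditioning achieves: applying M⁻¹ with M = diag(A) can sharply reduce the condition number of the system the Krylov method sees. A small sketch with a made-up, poorly scaled SPD matrix:

```python
import numpy as np

# an SPD matrix with widely varying diagonal entries (poorly scaled)
A = np.array([[100.0,  1.0, 0.0],
              [  1.0, 10.0, 1.0],
              [  0.0,  1.0, 1.0]])

# Jacobi preconditioner: M = diag(A), the cheap end of the ILU spectrum;
# the other extreme, M = A (full Gaussian elimination), would give M^-1 A = I
M_inv = np.diag(1.0 / np.diag(A))

cond_plain  = np.linalg.cond(A)          # condition number of A
cond_jacobi = np.linalg.cond(M_inv @ A)  # condition number after scaling
```

A lower condition number of the preconditioned operator generally translates into fewer Krylov iterations; in practice, symmetric preconditioning (M⁻¹ᐟ² A M⁻¹ᐟ²) is used to preserve symmetry for CG.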

The AMG Setup Phase
© Fraunhofer ITWM
Runtime of the AMG setup phase in GaspiLS as a function of the number of processes, compared with the latest Hypre implementation with the Hopscotch option enabled for optimal hybrid-parallel performance.
Multi-Level ILU Iterations
© Fraunhofer ITWM
The number of iterations required to solve a complex-valued 2D Helmholtz problem, as a function of the number of processes, for different domain decomposition methods combined with different ILU preconditioners.

GaspiCxx for Increased Productivity

Within GaspiLS, the management of communication resources for GPI-2 data exchange has been abstracted and made available to other applications as GaspiCxx. GaspiCxx defines an easy-to-use C++ interface; the management of GPI-2 communication resources is handled completely by GaspiCxx without limiting the underlying performance, so the application no longer has to deal with it. This eliminates much of the implementation work normally required when developing GPI-2 applications. Developing GPI-2 applications and reaping the associated benefits – such as good scalability – has never been easier.