Goals of the Project
- Introduce HPC to a large new user group from the outset, supported by innovative tools.
- Hide the complexity of the hardware from users and guide them toward a highly scalable and energy-efficient solution.
- Not only make existing HPC methods accessible to new users, but also gain insight into the system requirements of an HPC application that will be very important in the future.
To this end, a new software framework will be developed that automates the highly complex task of parallelizing the training of large neural networks on heterogeneous compute clusters.
The Software Framework Focuses on
- Scalability and energy efficiency
- High portability
- User transparency
The training of networks designed in existing frameworks should scale across hundreds of compute nodes without additional user effort.
GPI-Space as a Base
The basis is the generic parallelization framework GPI-Space, developed at our institute, which uses Petri nets to describe data and task parallelism effectively. Within the project, the programming model will be developed further and linked generically to the leading deep learning frameworks (e.g., Caffe and TensorFlow) via a domain-specific compiler.
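The core idea behind using Petri nets for parallelization can be illustrated with a minimal sketch. This is not GPI-Space's actual API; the class, transition names, and workflow below are invented for illustration. Places hold tokens, and a transition may fire once every input place holds a token, so independent transitions that are enabled at the same time expose the available parallelism directly in the net structure:

```python
from collections import Counter

class PetriNet:
    """Minimal Petri net: places hold tokens; a transition fires when
    every input place has a token, consuming one token per input place
    and producing one token on each output place."""

    def __init__(self):
        self.marking = Counter()   # tokens per place
        self.transitions = {}      # name -> (input places, output places)

    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (inputs, outputs)

    def enabled(self, name):
        inputs, _ = self.transitions[name]
        return all(self.marking[p] >= 1 for p in inputs)

    def fire(self, name):
        inputs, outputs = self.transitions[name]
        assert self.enabled(name), f"{name} is not enabled"
        for p in inputs:
            self.marking[p] -= 1
        for p in outputs:
            self.marking[p] += 1

# Hypothetical data-parallel training workflow: split the dataset,
# train on each shard independently, then reduce the gradients.
net = PetriNet()
net.add_transition("split",   ["dataset"],            ["shard_a", "shard_b"])
net.add_transition("train_a", ["shard_a"],            ["grad_a"])
net.add_transition("train_b", ["shard_b"],            ["grad_b"])
net.add_transition("reduce",  ["grad_a", "grad_b"],   ["model"])

net.marking["dataset"] = 1
net.fire("split")
# After splitting, both training transitions are enabled simultaneously:
# the net itself makes the data parallelism explicit for a scheduler.
assert net.enabled("train_a") and net.enabled("train_b")
net.fire("train_a")
net.fire("train_b")
net.fire("reduce")
print(net.marking["model"])   # → 1
```

A runtime system can use exactly this enabledness information to dispatch all concurrently enabled transitions to different compute nodes, which is what makes the formalism attractive for describing scalable workflows.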
The project work is scheduled for a period of three years (1 November 2017 to 31 October 2020).