Data Analysis and Machine Learning

In recent years, data analysis and machine learning (ML) have become one of the most dynamic research areas, with great impact on our current and future everyday life. Astonishing progress has been made in applying ML algorithms to areas such as speech recognition, automatic image analysis and scene understanding. Machine learning enables computers to drive cars autonomously or to learn how to play video games, pushing the frontier towards abilities that were previously exclusive to humans.

This rapid development is accompanied by a constantly increasing complexity of the underlying models. However, significant hurdles remain on the way to the everyday use of many existing machine learning approaches. One of them is the enormous computing power that machine learning requires: with current methods, deep learning in particular, it is not unusual for a single training run to take several days of computing time.

In the area of data analysis and machine learning, we work on new methods for the efficient distributed computation of learning algorithms and on their implementation on specialized hardware. The focus of our work is on developing scalable optimization algorithms for the distributed parallelization of large machine learning problems. This work builds on the HPC components developed at the CC HPC, such as the parallel file system BeeGFS and the programming framework GPI 2.0, which are the first to enable the efficient implementation of new algorithms such as ASGD (Asynchronous Stochastic Gradient Descent).
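To illustrate the general idea behind asynchronous stochastic gradient descent, here is a minimal sketch in plain Python. It is not the ASGD implementation built on GPI 2.0: threads stand in for distributed workers, a shared list stands in for remotely accessible parameters, and the toy problem (fitting y = 2x + 1) as well as the names `make_data` and `asgd_fit` are assumptions chosen purely for illustration. Each worker reads the shared parameters, computes a gradient on one sample and writes its update back without waiting for the other workers, so updates may be based on slightly stale values.

```python
import random
import threading


def make_data(n=200, seed=0):
    """Noise-free toy samples of y = 2x + 1 (illustrative assumption)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 2.0 * x + 1.0) for x in xs]


def asgd_fit(data, n_workers=4, steps_per_worker=5000, lr=0.05, seed=0):
    """Fit y = w*x + b with several workers updating shared state unsynchronized."""
    params = [0.0, 0.0]  # shared state: [weight, bias]

    def worker(worker_id):
        rng = random.Random(seed + worker_id)
        for _ in range(steps_per_worker):
            x, y = rng.choice(data)
            # Read the shared parameters; they may already be outdated
            # by the time the update below is written back.
            w, b = params[0], params[1]
            err = (w * x + b) - y
            # No lock and no barrier: workers never wait for each other.
            params[0] -= lr * err * x
            params[1] -= lr * err

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return params[0], params[1]


if __name__ == "__main__":
    w, b = asgd_fit(make_data())
    print(f"learned w={w:.3f}, b={b:.3f}")
```

The design point is the absence of synchronization: occasional lost or stale updates are tolerated in exchange for workers that never block. On a real cluster, the shared list would be replaced by communication between nodes, which this sketch does not attempt to model.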

 

 

Machine Learning and High Performance Computing

Example Projects and Services

 

Next Generation Computing

Digitization is bringing with it a flood of data that we will soon no longer be able to handle efficiently with today's computer systems. It is time for a new hybrid computing generation: Next Generation Computing (NGC). Fraunhofer brought the first quantum computer to Germany in November. We are in the process of researching which problems we will solve better with quantum computers in the future and which will be better solved with other architectures.

 

GAIA-X 4 KI

In the BMWK project »GAIA-X 4 KI«, we are working with 14 partners to develop an ecosystem of data and services that enables the training and validation of artificial intelligence (AI) applications.

 

Tarantella

In the BMBF project »High Performance Deep Learning Framework« we provide easy access to high-performance computing systems through our framework Tarantella.

 

Carme

With the open source multi-user software stack Carme, several users can manage the available resources of a computing cluster.

 

DeTol

In the BMBF project »Deep Topology Learning« (DeTol), data-driven design algorithms are used to accelerate and simplify the design process for deep learning solutions.

 

TensorQuant

With our software tool TensorQuant, developers can now simulate deep learning models and thus significantly accelerate development.

 

HALF

In the HALF project, we are developing energy-efficient hardware that enables artificial intelligence to evaluate patient data on mobile devices.

 

DLseis

The project »Deep Learning for Large Seismic Applications« (DLseis) covers the full range from basic research to ready-to-use deep learning tools for seismic applications.

 

Multi-Target Neural Architecture Optimization

AI-Services: NASE – Neural Architecture Search Engine

We support you in designing and integrating your optimal, individual neural network.

 

Fed-DART – Distributed Analytics Runtime for federated Learning

»Distributed Analytics Runtime for federated Learning« enables decentralized machine learning that ensures data privacy.
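The principle behind federated learning can be sketched with a simple federated averaging loop. This is a generic illustration, not the Fed-DART API: the function names (`local_train`, `federated_average`, `run_rounds`, `make_client_data`), the linear toy model and the synthetic client data are all assumptions made for this example. Each client trains on its own data, and only model parameters are sent to the server, which combines them weighted by the number of local samples. The raw data never leaves the client.

```python
import random


def make_client_data(n, lo, hi, seed):
    """Private toy data of one client: samples of y = 3x - 0.5 (assumption)."""
    rng = random.Random(seed)
    return [(x, 3.0 * x - 0.5) for x in (rng.uniform(lo, hi) for _ in range(n))]


def local_train(weights, data, lr=0.1, epochs=5):
    """Client side: a few epochs of SGD on local data, starting from the global model."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Server side: average client models, weighted by local sample count."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def run_rounds(client_datasets, rounds=50):
    """Alternate local training and server-side averaging for several rounds."""
    global_weights = (0.0, 0.0)
    sizes = [len(d) for d in client_datasets]
    for _ in range(rounds):
        # Only the trained parameters are returned, never the data itself.
        updates = [local_train(global_weights, d) for d in client_datasets]
        global_weights = federated_average(updates, sizes)
    return global_weights


if __name__ == "__main__":
    clients = [
        make_client_data(20, -1.0, 0.0, seed=1),
        make_client_data(30, 0.0, 1.0, seed=2),
        make_client_data(50, -0.5, 0.5, seed=3),
    ]
    w, b = run_rounds(clients)
    print(f"global model: w={w:.3f}, b={b:.3f}")
```

Note that sharing parameters instead of data is only the starting point for privacy; a production system would typically add further safeguards, which are outside the scope of this sketch.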

 

Microparticles With a Big Impact: Aerosols in Climate Models

In this project, machine learning assists in making global long-term predictions of the climate system.

Data Analysis

Example Projects

 

SafeClouds

Further information about the project »SafeClouds« is available on our project page »Distributed Infrastructure for Data Analysis in Aviation«.

 

Fraunhofer Cluster of Excellence CIT

Cognitive Internet Technologies

The cluster focuses on the three fields »IOT-COMMs«, »Fraunhofer Data Spaces« and »Machine Learning«.

Selected Publications

  • Y. Yang, Y. Yuan, A. Chatzimichailidis, R. JG van Sloun, L. Lei, S. Chatzinotas. »ProxSGD: Training Structured Neural Networks under Regularization and Constraints.« International Conference on Learning Representations (ICLR), Apr. 2020.
  • Raju Ram, Sabine Müller, Franz-Josef Pfreundt, Nicolas R. Gauger, Janis Keuper. »Scalable Hyperparameter Optimization with Lazy Gaussian Processes.« Super Computing 2019, International Conference for High Performance Computing, Networking, Storage and Analysis, Workshop on Machine Learning in HPC Environments.
  • Valentin Tschannen, Norman Ettrich, Matthias Delescluse and Janis Keuper. »Detection of point scatterers using diffraction imaging and deep learning.« Geophysical Prospecting (2019).
  • Durall, Ricard, Margret Keuper, Franz-Josef Pfreundt, and Janis Keuper. »Unmasking DeepFakes with simple Features.« arXiv preprint arXiv:1911.00686 (2019).
  • Durall, Ricard, Franz-Josef Pfreundt, and Janis Keuper. »Semi Few-Shot Attribute Translation.« IVCNZ 2019 and arXiv preprint arXiv:1910.03240 (2019).
  • Chatzimichailidis, Avraam, et al. »GradVis: Visualization and Second Order Analysis of Optimization Surfaces during the Training of Deep Neural Networks.« Super Computing 2019, International Conference for High Performance Computing, Networking, Storage and Analysis, Workshop on Machine Learning in HPC Environments.
  • Durall, Ricard, et al. »Object Segmentation Using Pixel-Wise Adversarial Loss.« German Conference on Pattern Recognition. Springer, Cham, 2019.