Talks by Our Experts
Abstracts [available in English only]
Generative Models for Transfer Learning on Seismic Data
(Speakers: Ricard Durall, Valentin Tschannen, Norman Ettrich, Janis Keuper)
Over the past few years, Deep Learning models have become increasingly successful at automating various pre-processing steps and interpretation tasks in the seismic processing pipeline. Recently, there have been very impressive showcases for tasks like fault and salt detection, horizon tracking, or pre-stack filtering. However, most approaches introduced so far rely on supervised learning algorithms, which require large amounts of mostly manually annotated training data. Generating these data sets is not only very expensive, but also does not scale. In addition, many models trained on a specific dataset (e.g. from a certain location) will not directly generalize to other data sets. In this talk, we will summarize our latest work on transfer learning in the seismic domain and show how generative neural networks can be used to combine physical simulations and real data, in order to train models with very few annotations (a minimal sketch of the underlying pretrain-then-fine-tune pattern follows below).
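The talk covers the generative, simulation-to-real part of this work; the sketch below only illustrates the generic transfer-learning pattern it builds on: pretrain on abundant, automatically labelled simulated data, then fine-tune on a handful of annotated real sections. All data, shapes, and network choices here are illustrative assumptions (random arrays stand in for seismic sections), not the authors' actual setup.

```python
import numpy as np
from tensorflow import keras

# Placeholder data: random arrays stand in for simulated sections with
# automatically generated labels, plus a small annotated real data set.
synthetic_x = np.random.rand(64, 128, 128, 1).astype("float32")
synthetic_y = (np.random.rand(64, 128, 128, 1) > 0.5).astype("float32")
real_x = np.random.rand(8, 128, 128, 1).astype("float32")
real_y = (np.random.rand(8, 128, 128, 1) > 0.5).astype("float32")

def build_net():
    # Tiny fully convolutional net for per-pixel labels (e.g. fault masks).
    inputs = keras.Input(shape=(128, 128, 1))
    x = keras.layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
    x = keras.layers.Conv2D(32, 3, padding="same", activation="relu")(x)
    outputs = keras.layers.Conv2D(1, 1, activation="sigmoid")(x)
    return keras.Model(inputs, outputs)

model = build_net()

# Stage 1: pretrain on abundant simulated data.
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(synthetic_x, synthetic_y, batch_size=8, epochs=2)

# Stage 2: fine-tune on the few annotated real sections with a lower
# learning rate, so the features learned from simulations are preserved.
model.compile(optimizer=keras.optimizers.Adam(1e-4),
              loss="binary_crossentropy")
model.fit(real_x, real_y, batch_size=4, epochs=2)
```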
Efficient Scheduling of Interactive Python Jobs on HPC Systems
(Speakers: Philipp Reusch, Dominik Strassel, Janis Keuper)
Research and development of deep learning methods, especially for large and high-dimensional seismic data, requires vast compute resources. At the same time, machine learning developers have a strong preference for interactive Python workflows. This is one reason why it is quite challenging to integrate Deep Learning jobs into classical HPC environments. In this talk, we present a solution for highly efficient scheduling of such jobs. Our approach is based on an available open-source stack and allows very high GPU utilization while scheduling interactive Python jobs alongside classical HPC jobs on a single system.
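As a rough frame of reference only (this is not the open-source stack from the talk), the toy sketch below shows the basic mechanism behind keeping GPU utilization high: a FIFO scheduler that packs submitted jobs onto free GPUs and immediately backfills whenever a job finishes.

```python
from collections import deque

class ToyGpuScheduler:
    """Illustrative FIFO scheduler: jobs wait until a GPU is free."""

    def __init__(self, num_gpus):
        self.free_gpus = set(range(num_gpus))
        self.queue = deque()   # pending job ids
        self.running = {}      # job_id -> gpu index

    def submit(self, job_id):
        self.queue.append(job_id)
        self._dispatch()

    def finish(self, job_id):
        # Return the GPU to the pool and start the next waiting job.
        self.free_gpus.add(self.running.pop(job_id))
        self._dispatch()

    def _dispatch(self):
        while self.queue and self.free_gpus:
            job, gpu = self.queue.popleft(), self.free_gpus.pop()
            self.running[job] = gpu
            print(f"job {job} -> GPU {gpu}")

sched = ToyGpuScheduler(num_gpus=2)
for job in ("notebook-a", "notebook-b", "batch-c"):
    sched.submit(job)          # batch-c waits until a GPU frees up
sched.finish("notebook-a")     # batch-c is dispatched here
```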
Tarantella – A framework for distributed Deep Learning
(Speakers: Peter Labus, Alexandra Carpen-Amarie, Martin Kühn)
Much of the recent progress in Deep Learning has been achieved through datasets and deep neural networks (DNNs) growing in size. High-dimensional seismic data in particular demands large amounts of resources, in terms of both training time and memory, to benefit from Deep Learning methods. To achieve strong and weak scaling for DNN training, both data and task parallelism have to be employed. In this talk, we will present our distributed Deep Learning framework Tarantella, which provides a high-level, TensorFlow-based interface and aims at enabling strongly and weakly scalable DNN training on HPC systems. Tarantella is designed to seamlessly execute existing Keras models on HPC systems and pays special attention to reproducibility.
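A minimal sketch of this drop-in usage, following the Keras-style wrapper pattern from the Tarantella documentation. The wrapper name tnt.Model and the launch procedure are taken from the docs as best we recall them; check the current release for the exact API. The dummy dataset is an illustrative placeholder.

```python
import tensorflow as tf
from tensorflow import keras
import tarantella as tnt

# An ordinary Keras model; nothing Tarantella-specific so far.
keras_model = keras.Sequential([
    keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    keras.layers.Dense(10, activation="softmax"),
])

# Wrapping the model is the main code change; compile/fit keep the
# familiar Keras semantics while data parallelism is handled internally.
model = tnt.Model(keras_model)
model.compile(optimizer="sgd",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Dummy data in place of a real (e.g. seismic) training set.
x = tf.random.normal((256, 784))
y = tf.random.uniform((256,), maxval=10, dtype=tf.int32)
dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(64)

model.fit(dataset, epochs=1)
# The script is then launched across nodes/GPUs via Tarantella's
# command-line runner rather than with plain `python`.
```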