Deep Learning Seminar  /  December 5, 2019

Deep Learning on HPC Clusters: Preliminary Experiments and What They Teach Us

Abstract:

[available in English only]

Distributed learning has become essential for training large neural networks, particularly in computer vision. Such tasks often rely on huge datasets, which lead to infeasibly long training times. Additionally, large sample sizes or complex models may not fit on a single machine, so parallelism is needed to speed up training and achieve accurate results.

In this talk, we present benchmarking results for single-node and distributed training of typical image classification tasks on HPC clusters, with a focus on local performance tuning and scalability. We will discuss the challenges of configuring the software stack so that existing deep learning frameworks and libraries are used efficiently. Finally, we will look into data-parallel approaches and show results obtained on our clusters with widely used frameworks such as PyTorch, TensorFlow, and Horovod.
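
As a rough illustration of the data-parallel idea mentioned above, the following minimal sketch simulates one synchronous training step in plain Python. It is not taken from the talk and does not use PyTorch, TensorFlow, or Horovod; the toy model (y = w * x), the function names, and the data are all assumptions chosen for illustration. Each simulated worker computes a gradient on its own shard of the batch, and the gradients are then averaged, which is the role an all-reduce would play on a real cluster.

```python
# Pure-Python sketch of synchronous data-parallel SGD (no framework required).
# All names and values are hypothetical; they are not taken from the talk.

def mse_gradient(w, samples):
    """Gradient of the mean squared error of the model y = w * x over samples."""
    return sum(2.0 * (w * x - y) * x for x, y in samples) / len(samples)

def shard(samples, num_workers):
    """Split the batch into equally sized shards, one per simulated worker."""
    assert len(samples) % num_workers == 0, "sketch assumes equal shard sizes"
    size = len(samples) // num_workers
    return [samples[i * size:(i + 1) * size] for i in range(num_workers)]

def data_parallel_step(w, samples, num_workers, lr):
    """One synchronous step: local gradients, averaged, then one shared update."""
    local_grads = [mse_gradient(w, s) for s in shard(samples, num_workers)]
    avg_grad = sum(local_grads) / num_workers  # stands in for an all-reduce
    return w - lr * avg_grad

batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # toy data, y = 2x
w_single = data_parallel_step(0.0, batch, num_workers=1, lr=0.01)
w_parallel = data_parallel_step(0.0, batch, num_workers=2, lr=0.01)
```

With equally sized shards, averaging the per-worker gradients reproduces the full-batch gradient, so `w_single` and `w_parallel` agree. In this simplified picture the parallel version changes only where the work is done, not the update itself.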

The seminar takes place at the ITWM on the first Tuesday of each month (except during holidays and the summer break). It aims to broaden experience and foster the exchange of scientific views, including beyond the organizing groups.

Typical subjects of talks are:

  • ongoing or recently completed graduation and doctoral theses
  • current research and projects

The topics range from mathematical methods to technical implementations. Most talks present research results, but some raise open issues for brainstorming and input from the audience.

The seminar »KL-Regelungstechnik« (Kaiserslautern – Control Theory and Control Engineering) is organized by our department together with several research groups at TU Kaiserslautern:

  • Technomathematics (Dep. of Mathematics)
  • Mechatronics in Mechanical and Automotive Engineering (Dep. of Mechanical and Process Engineering)
  • Automation Control (Dep. of Electrical and Computer Engineering)
  • Electromobility (Dep. of Electrical and Computer Engineering)
