Fed-DART – Distributed Analytics Runtime for Federated Learning

Decentralized Machine Learning That Ensures Data Protection

Our »Distributed Analytics Runtime for Federated Learning« (Fed-DART) enables the easy implementation of federated machine learning (ML) methods to leverage local data in distributed environments. This makes it possible to train an AI model without having to centralize and merge the data.

Increasing amounts of data are needed to train AI. The more data available for training, the better the results. In the practical implementation of many AI projects, obtaining or providing this data is a significant hurdle. The reasons for this can be manifold:

  1. Individual companies or departments can only acquire a small amount of data or cannot bring it together centrally.
  2. Regulatory restrictions apply due to data sovereignty and data protection.
  3. Data is generated on mobile devices and cannot be combined into a large database due to low communication bandwidth. Training on all data at the same time is thus not possible.

Federated Learning: An Answer for Companies in Terms of AI

The points above pose major challenges for companies that want to use AI productively. One possible solution is federated learning.

In federated learning, the data remains where it is generated or collected in the first instance. An AI model is trained on this local data. To increase the accuracy of the local models, the learned models are shared globally and aggregated in a suitable way into a global model. State-of-the-art algorithms help to solve aggregation problems caused by the diversity of the local data. The global model is then shared with all users. Thus, the accuracy of the local models is improved without sharing the data itself.
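The aggregation step described above can be sketched in a few lines. The following is a minimal illustration of federated averaging (in the spirit of FedAvg, the canonical aggregation algorithm); model parameters are represented as plain lists of floats, and the client values are made up for demonstration:

```python
# Minimal sketch of federated averaging: each client trains locally,
# only the learned parameters are shared, and the server averages them
# into a global model. Parameter values here are illustrative.

def aggregate(local_models):
    """Average the parameters of all locally trained models element-wise."""
    n_clients = len(local_models)
    n_params = len(local_models[0])
    return [sum(m[i] for m in local_models) / n_clients for i in range(n_params)]

# Three clients, each having trained on its own private data:
client_a = [0.2, 1.0, -0.5]
client_b = [0.4, 0.8, -0.3]
client_c = [0.6, 1.2, -0.4]

global_model = aggregate([client_a, client_b, client_c])
# global_model is approximately [0.4, 1.0, -0.4]
```

In practice the averaging is applied per layer to neural-network weights, and more sophisticated aggregation schemes compensate for heterogeneous (non-IID) client data.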

Fed-DART Diagram
© Fraunhofer ITWM
Federated learning ensures private and secure AI as a decentralized solution: the data remains local to where it was generated; only the AI model is shared and improved globally.

Federated Learning with Fed-DART

With our framework Fed-DART, we want to enable users to implement federated learning easily. Users can focus fully on developing appropriate AI methods without having to deal with the distributed-computing aspects. They benefit from our many years of experience and expertise in the field of distributed computing through three key points:

  1. Flexibility: Fed-DART is device-independent and therefore supports a wide range of applications. This enables federated learning on a few compute-rich data centers as well as on many edge platforms. Furthermore, the integration of Fed-DART is independent of the machine learning framework used.
  2. Scalability and Reliability: Fed-DART is based on the industry-proven GPI-SPACE. It enables high scalability on a wide range of participating devices and provides a fault-tolerant and dynamic runtime environment.
  3. Ease of use: End users can easily and conveniently integrate Fed-DART into their Python code. This is achieved by separating the algorithms from the technical infrastructure. End users can therefore fully concentrate on their AI methods.
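To illustrate the separation of AI code from infrastructure, the following is a hypothetical sketch of a task-based client API. The names (`FederatedClient`, `learning_task`, `run`) are illustrative assumptions and do not reflect Fed-DART's actual interface; a real runtime would dispatch the registered task to remote workers instead of executing it in-process:

```python
# Hypothetical sketch: a client object registers plain Python functions as
# federated tasks, hiding all distributed-computing details from the ML code.
# These names are invented for illustration, not Fed-DART's real API.

class FederatedClient:
    """Toy stand-in for a runtime client."""

    def __init__(self):
        self._tasks = {}

    def learning_task(self, func):
        # Decorator: register a function as a task the runtime can schedule.
        self._tasks[func.__name__] = func
        return func

    def run(self, name, workers, **kwargs):
        # A real runtime would ship the task to each remote worker and
        # collect results asynchronously; here we just call it locally.
        return {w: self._tasks[name](**kwargs) for w in workers}

client = FederatedClient()

@client.learning_task
def train(epochs):
    # Any ML framework could be used inside the task body.
    return {"epochs_done": epochs}

results = client.run("train", workers=["device_a", "device_b"], epochs=5)
```

The key design point shown here is that `train` contains only ML logic; scheduling, fault tolerance, and device management stay inside the runtime.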

Fed-DART Diagram
© Fraunhofer ITWM
Fed-DART enables the flexible integration of different machine learning frameworks for the end user. The management of participating parties is fully automated with our industry-proven runtime environment.

Example project »Bauhaus Mobility Lab«

In the context of the Bauhaus Mobility Lab (BML) as a SmartCity cloud platform, Fed-DART is used to realize federated learning in an urban environment. For this purpose, several particulate matter measuring stations are connected in a network to improve the prediction of air pollution. Each measuring station trains on its local air pollution data and then shares its knowledge with a central entity. This collected knowledge is aggregated by suitable algorithms and in turn improves the predictions of each individual measuring station. This information gain can be used to improve air quality in cities.
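Since the measuring stations typically hold different amounts of data, the central aggregation can weight each station by its local sample count. The following is a small sketch under that assumption; station names, parameter values, and counts are made up for illustration:

```python
# Sketch of sample-count-weighted aggregation for the measuring-station
# scenario: stations with more local measurements contribute proportionally
# more to the global model. All values below are illustrative.

def weighted_aggregate(models, sample_counts):
    """Combine per-station parameters, weighted by local sample counts."""
    total = sum(sample_counts.values())
    n_params = len(next(iter(models.values())))
    return [
        sum(models[s][i] * sample_counts[s] for s in models) / total
        for i in range(n_params)
    ]

stations = {
    "station_north": [1.0, 0.5],
    "station_south": [2.0, 1.5],
}
counts = {"station_north": 100, "station_south": 300}

global_params = weighted_aggregate(stations, counts)
# station_south holds 3x the data, so it dominates: [1.75, 1.25]
```

Each station would then continue training from `global_params`, so that even a station with few measurements benefits from the knowledge of the whole network.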