Reveal Networks

In the BMBF project »Criminal Networks«, we are researching the traceable identification of anomalies in fraud networks using AI. The Graph Neural Networks (GNN) method is intended to uncover connections and networks more quickly.

Criminal Networks: Combating Billing Fraud and Corruption in the Healthcare Sector

Project Funded by the Federal Ministry of Education and Research (German BMBF)

Corruption and billing fraud in the German healthcare system cause around 14 billion euros in damage each year, borne by the solidarity community of insured persons. To uncover fraudulent networks, investigators must examine mass data such as e-mail or telephone traffic whenever a suspicion arises. This makes investigations resource-intensive and slow.

Our project »Criminal Networks: Combating Billing Fraud and Corruption in Healthcare« is researching the traceable identification of anomalies in fraud networks using artificial intelligence (AI).

The project is part of the initiative »Artificial Intelligence in Civil Security Research« of the German Federal Ministry of Education and Research (BMBF). In this project, we are developing a so-called »weak artificial intelligence« for investigative authorities and health insurance companies that combines algorithms for detecting anomalies in fraud networks with the domain knowledge of the users.

The AI is called »weak« because the goal is not to replace human intelligence but to support it with learning algorithms. A weak AI can only perform concrete tasks that it has previously learned to solve. Coupled with the expertise of the investigators, this hybrid approach makes investigative research easier.

Machine Learning for Graph Theory and Time Series Analysis

The available data from fraud networks is well suited to machine learning (ML). Using graph theory and time series analysis, the AI learns to detect anomalies in a communication network (e.g., a company's e-mail traffic). In this way, we supplement classic investigative procedures with modern mathematical methods.

Anomalies in billing include, for example, patient visits or services that are billed but never took place.

What Does That Mean Exactly?

The algorithms in the software tools operate on networks of nodes. In the project, for example, the persons in an investigation case are represented as nodes, and the e-mail traffic between individual persons is represented as edges. Graph Neural Networks (GNN) are a relatively new approach that combines graph theory and machine learning. They are already used in search engines and social media because they are particularly well suited to finding relevant connections or relationships. GNN approaches are based on artificial neural networks, which are modeled on the human brain. In a GNN, each node collects information from its neighboring nodes, for example, how frequently two people exchange e-mails about a certain keyword. This is how the GNN learns.
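The representation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's implementation: the names `emails`, `build_graph`, and `aggregate` are hypothetical, the records are invented, and the fixed 50/50 mixing stands in for the weights a real GNN would learn during training.

```python
from collections import defaultdict

# Hypothetical records: (sender, recipient, number of e-mails mentioning a keyword)
emails = [
    ("alice", "bob", 4),
    ("bob", "carol", 1),
    ("alice", "carol", 2),
    ("dave", "alice", 8),
]

def build_graph(records):
    """Build an undirected weighted adjacency map: person -> {neighbor: weight}."""
    graph = defaultdict(dict)
    for sender, recipient, count in records:
        graph[sender][recipient] = graph[sender].get(recipient, 0) + count
        graph[recipient][sender] = graph[recipient].get(sender, 0) + count
    return dict(graph)

def aggregate(graph, features):
    """One neighbor-aggregation step with fixed (not learned) mixing weights."""
    updated = {}
    for node, neighbors in graph.items():
        total_weight = sum(neighbors.values())
        # Weighted mean of the neighbors' features, weighted by e-mail volume
        neighbor_mean = sum(features[n] * w for n, w in neighbors.items()) / total_weight
        # Assumption: equal mix of own feature and neighborhood summary
        updated[node] = 0.5 * features[node] + 0.5 * neighbor_mean
    return updated

graph = build_graph(emails)
# Initial node feature: total keyword e-mails each person is involved in
features = {node: float(sum(nbrs.values())) for node, nbrs in graph.items()}
updated = aggregate(graph, features)
```

After one step, each person's value reflects both their own activity and that of their direct contacts. Repeating the step would spread information across longer paths in the network.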

Time series analysis methods, on the other hand, look at changes over time. For example, they can detect seasonality, so that an unexpected increase in e-mails between two people stands out against the usual pattern.
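As a minimal sketch of this idea, the following Python snippet flags a week in which the e-mail count between two people rises far above its recent level. It is an assumption-laden illustration rather than the project's method: the function `flag_spikes`, its parameters, and the weekly counts are hypothetical, and a simple trailing-window z-score is used in place of a full seasonal model.

```python
from statistics import mean, stdev

def flag_spikes(counts, window=4, threshold=3.0, min_sigma=1.0):
    """Return indices whose value exceeds the trailing-window mean
    by more than `threshold` standard deviations."""
    flagged = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        # Floor the deviation so a flat history does not cause division by zero
        sigma = max(stdev(history), min_sigma)
        if (counts[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged

# Hypothetical weekly e-mail counts between two people
weekly = [5, 6, 4, 5, 6, 5, 21, 6]
print(flag_spikes(weekly))  # → [6]
```

Only upward deviations are flagged here, since the text is concerned with unexpected increases. Handling seasonality would require comparing each week with the same period in earlier cycles instead of with the immediately preceding weeks.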

Our Expertise and Experience

Our »Financial Mathematics« department has gained experience in anomaly detection across a wide variety of projects. The spectrum ranges from software tools for administration, to algorithms for verifying trade invoices for a large German chemical company, to a collaboration on invoice verification with a German automobile manufacturer that has been running for two years. All projects involve interdisciplinary cooperation between mathematicians and computer scientists at Fraunhofer ITWM and the partners' employees, who contribute the corresponding domain knowledge. The methods applied include, among others:

  • rule-based approaches
  • deterministic-statistical algorithms
  • machine learning methods

In addition, our department has already conducted research in the areas of time series analysis, cluster analysis, and imbalanced learning, among others.

Duration of the Project

The project started on 1 June 2021 and is scheduled to run for three years.