Fraud Detection

According to estimates, tax fraud in Germany causes damage running into the billions every year.

Mathematical and statistical techniques are important tools for detecting fraudulent activity, corroborating suspected cases, and quantifying the risk of losses due to fraud. They also make it possible to determine, at a given statistical confidence level, the minimum amount of loss that can be asserted in forensic proceedings. “Dark field” research, the study of undetected cases, makes it possible to estimate the true extent of fraud. On this basis, appropriate strategies for preventing fraud can be developed.
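As a rough illustration of a minimum loss at a given confidence level, the following Python sketch computes a one-sided lower confidence bound for the total loss from a random audit sample. The function name, the sample figures, and the normal approximation are assumptions made here for illustration; they are not taken from the ITWM tools, and strongly skewed loss data would call for a more careful method.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def lower_bound_total_loss(sample_losses, population_size, confidence=0.95):
    """One-sided lower confidence bound for the total loss in a population.

    Assumes a simple random sample and a normal approximation for the
    sample mean; no finite population correction is applied.
    """
    n = len(sample_losses)
    z = NormalDist().inv_cdf(confidence)          # one-sided quantile
    standard_error = stdev(sample_losses) / sqrt(n)
    per_case = max(0.0, mean(sample_losses) - z * standard_error)
    return population_size * per_case


# Hypothetical audit sample: loss found per reviewed case (0 = no fraud found).
sample = [0, 0, 120, 0, 0, 450, 0, 0, 0, 80, 0, 0, 300, 0, 0, 0, 0, 60, 0, 0]
population = 1000  # hypothetical number of cases in total

point_estimate = population * mean(sample)
bound_95 = lower_bound_total_loss(sample, population, confidence=0.95)
bound_80 = lower_bound_total_loss(sample, population, confidence=0.80)
```

A higher confidence level gives a smaller, more defensible minimum loss, so `bound_95` lies below `bound_80`, and both lie below the point estimate.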

Examples

In Figure 1, a number of cases (clients, accounts, economic entities, ...) are plotted in a two-dimensional coordinate system. The smaller, separate point cloud, which represents the fraud cases, is not visible when each variable is examined on its own. The projections onto the two axes typically contribute nothing to the identification of suspicious cases.
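The effect can be reproduced with a small, made-up data set. The sketch below uses the Mahalanobis distance as one possible joint measure; the data, the function name, and the choice of measure are assumptions for illustration and not necessarily what was used to produce Figure 1.

```python
import math
from statistics import mean, stdev


def mahalanobis_2d(points):
    """Mahalanobis distance of each 2-D point from the sample mean."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    n = len(points)
    mx, my = mean(xs), mean(ys)
    # Sample variances and covariance.
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Quadratic form with the inverse of the 2x2 covariance matrix.
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(math.sqrt(d2))
    return distances


# Hypothetical regular cases lie close to the diagonal y = x.
regular = [(i, i + (0.3 if i % 2 else -0.3)) for i in range(1, 11)]
suspicious = (3, 8)  # each coordinate on its own is well within the usual range
cases = regular + [suspicious]

xs = [x for x, _ in cases]
ys = [y for _, y in cases]
z_x = (suspicious[0] - mean(xs)) / stdev(xs)  # marginal view of variable 1
z_y = (suspicious[1] - mean(ys)) / stdev(ys)  # marginal view of variable 2
distances = mahalanobis_2d(cases)             # joint view of both variables
```

In this example both marginal z-scores of the suspicious case are small, while its joint distance is the largest in the data set.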

It is well known that classical, non-robust cluster algorithms have problems with outliers. In the figure on the right, the brown outlier “M” in the left corner makes the red cases appear unremarkable: the cluster drawn in red does not separate the red cases from the green ones. A robust cluster algorithm, by contrast, yields the green ellipse as the cluster, and this ellipse does separate the red cases from the green ones.
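The masking effect of a single extreme outlier can be shown in one dimension. The following sketch is a simplified analogue, not a clustering algorithm: it compares a classical z-score (mean and standard deviation) with a robust one (median and median absolute deviation). The data values and the threshold of 3 are assumptions chosen for illustration.

```python
from statistics import mean, median, stdev


def classical_z(values):
    """Z-scores based on mean and standard deviation (not robust)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def robust_z(values):
    """Z-scores based on median and MAD (robust against outliers)."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    return [(v - med) / scale for v in values]


# Hypothetical data: regular cases near 10, two moderately unusual cases,
# and one extreme outlier that plays the role of "M".
values = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 13.0, 13.2, 100.0]
THRESHOLD = 3.0

classical_flags = [v for v, z in zip(values, classical_z(values)) if abs(z) > THRESHOLD]
robust_flags = [v for v, z in zip(values, robust_z(values)) if abs(z) > THRESHOLD]
```

The extreme value inflates the mean and standard deviation so much that 13.0 and 13.2 look ordinary under the classical score, whereas the robust score flags them together with the extreme value.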

Figure 1
© Photo ITWM

Figure 2
© Photo ITWM

Multivariate Analysis Method

Using adaptations and further developments of multivariate methods carried out at ITWM, together with techniques from data mining and machine learning, conspicuous cases can be identified, as illustrated in the three images.

A common scheme in billing fraud is to bill a service several times although it was performed only once. It is therefore reasonable to flag bills that agree closely on given criteria. For this purpose, we develop suitable metrics.
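One simple way to express such agreement is a weighted similarity over a few bill attributes. The sketch below is only an illustration: the `Bill` fields, the weights, the tolerances, and the threshold are hypothetical and would have to be chosen for the concrete billing data.

```python
from dataclasses import dataclass
from datetime import date
from itertools import combinations


@dataclass(frozen=True)
class Bill:
    bill_id: str
    client: str
    service_code: str
    service_date: date
    amount: float


# Hypothetical weights for each criterion; they sum to 1.
WEIGHTS = {"client": 0.3, "service_code": 0.3, "date": 0.2, "amount": 0.2}


def similarity(a, b, max_days=3, amount_tol=0.02):
    """Weighted share of criteria on which two bills agree (0 to 1)."""
    score = 0.0
    if a.client == b.client:
        score += WEIGHTS["client"]
    if a.service_code == b.service_code:
        score += WEIGHTS["service_code"]
    if abs((a.service_date - b.service_date).days) <= max_days:
        score += WEIGHTS["date"]
    if abs(a.amount - b.amount) <= amount_tol * max(a.amount, b.amount):
        score += WEIGHTS["amount"]
    return score


def near_duplicates(bills, threshold=0.9):
    """Return the id pairs of bills whose similarity reaches the threshold."""
    return [
        (a.bill_id, b.bill_id)
        for a, b in combinations(bills, 2)
        if similarity(a, b) >= threshold
    ]


bills = [
    Bill("B1", "client-A", "X1", date(2023, 3, 1), 120.00),
    Bill("B2", "client-A", "X1", date(2023, 3, 2), 120.00),  # likely repeat of B1
    Bill("B3", "client-A", "X2", date(2023, 3, 1), 80.00),
    Bill("B4", "client-B", "X1", date(2023, 3, 1), 120.00),
]
flagged_pairs = near_duplicates(bills)
```

Comparing all pairs is quadratic in the number of bills; for large data sets one would typically compare only bills that share a key such as the client.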

If the aim is to select cases according to their fraud risk, modern scoring methods provide an efficient way to determine the order in which cases are reviewed. Any number of quantitative and qualitative criteria can be taken into account.

Multivariate analysis methods (three images)
© Photo ITWM

Competences

Several of the tools we have developed are based on the statistical software R and are designed to handle large databases. They also deal appropriately with the heterogeneous data that is common in practice.

Besides a risk-based approach, in which we identify suspicious cases by detecting conspicuous patterns, we set aside, together with the client, review resources that evaluate randomly selected cases. In this way, we can uncover the “unknown unknowns” and develop the detection methods dynamically. The split between the two approaches can be adapted to the data. As a result, the forecast of the total amount of damage becomes slightly less important relative to the ranking of the cases by fraud risk.
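The split between risk-based and random review can be sketched as follows. The function name, the 20 percent random share, and the fixed seed are assumptions for illustration; how the share is adapted to the data is not specified here.

```python
import random


def select_for_review(scores, capacity, random_share=0.2, seed=0):
    """Split review capacity between top-scored and randomly drawn cases.

    scores: dict mapping case id to risk score (higher = more suspicious).
    Returns (risk_based, random_part) as two lists of case ids.
    """
    n_random = round(capacity * random_share)
    n_risk = capacity - n_random
    ranked = sorted(scores, key=scores.get, reverse=True)
    risk_based = ranked[:n_risk]
    # Draw the random part only from cases not already selected by score.
    remaining = ranked[n_risk:]
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    random_part = rng.sample(remaining, min(n_random, len(remaining)))
    return risk_based, random_part


# Hypothetical scores for 50 cases.
scores = {f"case{i}": i for i in range(50)}
risk_based, random_part = select_for_review(scores, capacity=10)
```

The randomly drawn cases give an unbiased view of the cases the scoring would not have selected, which is what allows previously unknown patterns to surface.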

In every case, the domain-specific knowledge of our clients is integrated into our analysis and, together with our background in financial mathematics, forms the basis of our approach.

The results are always presented in context and in a form that allows the client to interpret them as well as possible. Important elements of the presentation are an appropriate, meaningful visualization and an interactive display of similar cases. Finally, we validate newly developed tools and measure their performance in collaboration with our clients.