Deep Learning Seminar  /  March 04, 2021, 10:00 – 11:00

A Deep-Learning Approach for Occlusion Detection

Speaker: Peter Lorenz (Fraunhofer ITWM, Department High Performance Computing)


We investigate the problem of occlusion detection in stereo images by evaluating three different supervised, end-to-end Convolutional Neural Network (CNN) architectures. Occlusions arise when a scene is captured by two cameras from different positions. Occluded pixels have no corresponding pixel in the other image and are therefore treated as erroneous pixels. Detecting occlusions would improve other approaches that rely on the correspondence problem, for example when computing disparity maps or when excluding occluded pixels from the loss of a neural network. Each architecture takes rectified stereo image pairs as input. Unary CNNs for the left and right input images compute the pixel-wise similarity of the image pairs. The three models differ in how they combine the outputs of the unary CNNs to detect occlusions. One model learns occlusions with dilated convolutional unary CNNs. The other two learn occlusions from the stereo correlation, which is refined with either 2D dilation layers or 3D convolutional layers. We experimented with pre-training the models on the synthetic Sceneflow (Monkaa) dataset. We then fine-tune the models on the smaller Middlebury dataset, using the Adam optimizer for training. Our proposed models achieve accuracy scores between 77.98 percent and 89.02 percent on the Middlebury validation set.
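
As a rough illustration of what a stereo correlation over unary features can look like, here is a minimal sketch in plain Python. It assumes dot-product similarity along a single rectified scanline; the function name `stereo_correlation`, the toy one-hot feature vectors, and the zero padding at the image border are illustrative assumptions and not the speaker's implementation.

```python
def stereo_correlation(left, right, max_disp):
    """Build a correlation volume for one rectified scanline.

    left, right: lists of per-pixel feature vectors (lists of floats),
                 both of the same length (the image width).
    Returns volume[d][x] = dot(left[x], right[x - d]); positions where
    x - d falls outside the image are filled with 0.0 (an assumption).
    """
    width = len(left)
    volume = []
    for d in range(max_disp + 1):
        row = []
        for x in range(width):
            if x - d >= 0:
                row.append(sum(a * b for a, b in zip(left[x], right[x - d])))
            else:
                row.append(0.0)
        volume.append(row)
    return volume


# Toy features: four distinct one-hot vectors and one "no match" vector.
A = [1.0, 0.0, 0.0, 0.0]
B = [0.0, 1.0, 0.0, 0.0]
C = [0.0, 0.0, 1.0, 0.0]
D = [0.0, 0.0, 0.0, 1.0]
Z = [0.0, 0.0, 0.0, 0.0]

right = [A, B, C, D]
left = [Z, A, B, C]  # right scanline shifted by one pixel (disparity 1)

volume = stereo_correlation(left, right, max_disp=2)
for d, row in enumerate(volume):
    print(d, row)
```

In this toy case the pixels at x = 1, 2, 3 respond only at disparity 1, while the leftmost pixel has no response at any disparity. A pixel without a strong match is the kind of signal an occlusion detector could pick up on; the models in the talk learn this from data rather than applying a fixed rule.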