Visual Intelligence Seminar: Towards Robust and Reliable Computer Vision Models

Abstract

Over the past years, machine learning techniques have progressed tremendously, driven in part by an enormous growth in data resources, increased compute power and advances in methodology. In computer vision in particular, Deep Learning has shown great success on tasks such as object detection, semantic segmentation and optical flow estimation. As a result, highly precise analyses of image data are now available in practice, yielding relevant results even for complex tasks such as multiple person tracking, motion estimation or action recognition.

Despite this strong progress, current approaches have several issues. Deep neural network based models not only rely on large amounts of annotated training data, they also require the task to be explicitly defined at a fine-grained level and the learning architecture to be optimized specifically for this task. The resulting model usually offers very limited explainability, generalizes poorly, and lacks robustness against domain shifts and adversarial examples. Yet for many use cases, these properties are crucial for practical applicability. For example, vision systems for autonomous driving need to work reliably even under adverse weather conditions, and the uncertainty of predictions should be reflected in the network’s confidence.

In this presentation, I will show examples from our recent work towards understanding the lack of robustness in current computer vision models, and derive architecture modifications and training procedures that can improve a model’s generalization ability and robustness under adversarial attacks and domain shifts. To this end, we leverage the signal processing properties of modern convolutional network architectures and show that reducing systematic construction flaws in networks can improve their practical behavior. We further discuss novel training methods that explicitly train for more reliable behavior in practice.