Learning geometric image matching for visual 3D modelling

HKUST Electronic Theses

by Zixin Luo

THESIS 2020

Ph.D. Computer Science and Engineering

xvii, 117 pages : illustrations ; 30 cm

Abstract

Geometric image matching requires establishing sparse correspondences between 2D points, from which the camera geometry is recovered and the static scene structure is reconstructed. In essence, the performance of a broad range of computer vision applications depends heavily on finding reliable correspondences, including panorama stitching, visual localization, Structure-from-Motion (SfM), Simultaneous Localization and Mapping (SLAM), Augmented Reality (AR) and 3D reconstruction. During the past decade, hand-crafted keypoint features and engineered feature matchers have been the de facto standard in general-purpose image matching pipelines. Despite their apparent success, these traditional methods are still known to have difficulty identifying reliable correspondences under large illumination or perspective changes, which has consequently become the bottleneck for acquiring better spatial understanding in 3D.

With the emergence of deep learning, a great amount of effort has been spent on reformulating each component of image matching through modern neural network architectures, which can be efficiently optimized in a data-driven and differentiable manner. In this thesis, we first review recent achievements in learning-based image matching techniques, then reveal the substantial challenges that arise in practical use, and finally elaborate on the methods we have proposed, which give rise to state-of-the-art results on several important benchmarking datasets.

More specifically, we decompose the learning-based image matching pipeline into four learnable sub-modules. First, a local feature extractor comprising 1) a keypoint detector and 2) a keypoint descriptor, where we address the accuracy of keypoint localization, the efficiency of training data sampling, the aggregation of contextual information, and the advantage of jointly learning the detection and description tasks. Next, 3) a specialized image retrieval system for SfM tasks, which shortlists the matching candidates from a large image collection and identifies geometric image overlaps even without clearly defined semantics. Lastly, 4) a feature matcher that rejects outlier correspondences when solving two-view geometric models, and leverages spatial context such as motion consistency from the correspondence input.

To facilitate the above research, we also present a large-scale dataset that employs an automatic pipeline to generate rich and accurate geometric training labels from well-reconstructed 3D models. The proposed methods have been integrated into several important applications, and in particular evaluated in the context of visual 3D modelling, where drastic improvements and strong generalization ability have been demonstrated.

Copyrighted to the author. Reproduction is prohibited without the author’s prior written consent.

Details

Collection: HKUST Electronic Theses
Degree: Ph.D.
Department: Computer Science and Engineering
Supervisors: Quan, Long
Authors: Luo, Zixin
Subjects: Computer graphics; Mathematical models; Image processing; Three-dimensional display systems
Language: English
Call number: Thesis CSED 2020 Luo
DOI: 10.14711/thesis-991012818569103412
