THESIS
2017
xvii, 111 pages : illustrations ; 30 cm
Abstract
Structure from Motion is a photogrammetric technique for estimating 3D structure from 2D images. Large-scale Structure from Motion remains challenging in three respects, namely accuracy, scalability, and efficiency. The goal of this work is to solve highly accurate and consistent large-scale Structure from Motion problems in a parallel and scalable manner.
First, we propose a scalable distributed formulation to handle, in parallel, Structure from Motion problems far exceeding the memory of a single computer. Unlike previous methods, which drastically simplify the parameters of Structure from Motion, we propose a camera clustering algorithm that divides a large Structure from Motion problem into smaller sub-problems defined by overlapping camera clusters, while preserving as much connectivity among cameras and tracks as possible. We then exploit a hybrid formulation that feeds the relative motions from local incremental Structure from Motion into a global motion averaging framework, producing highly accurate and consistent initial camera poses. Our scalable formulation in terms of camera clusters applies to the whole Structure from Motion pipeline, including track generation, local Structure from Motion, 3D point triangulation, and bundle adjustment.
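As a rough illustration of the camera clustering step, the Python sketch below partitions a camera graph into size-bounded clusters and then expands each cluster across its strongest cut edges so that neighbouring sub-problems share cameras. The graph representation, the naive id-ordered bisection, and the overlap_ratio parameter are simplifying assumptions made for brevity, not the algorithm described in the thesis, which additionally aims to preserve connectivity among cameras and tracks.

# A split-then-expand camera clustering sketch. The camera graph is given as
# {(cam_i, cam_j): shared_track_count}; clusters are split until they fit a
# size budget, then grown across their strongest cut edges so that adjacent
# sub-problems overlap. The id-ordered bisection and overlap_ratio are
# illustrative simplifications.
def cluster_cameras(edges, max_size, overlap_ratio=0.3):
    cameras = sorted({c for e in edges for c in e})

    # Split phase: halve the (id-ordered) camera list until every cluster
    # respects the size budget. A real implementation would cut the graph so
    # that as few strong camera-camera edges as possible are broken.
    def bisect(cams):
        if len(cams) <= max_size:
            return [list(cams)]
        mid = len(cams) // 2
        return bisect(cams[:mid]) + bisect(cams[mid:])

    clusters = [set(c) for c in bisect(cameras)]

    # Expand phase: for each cluster, pull in the cameras on the other side of
    # its strongest cut edges, so neighbouring clusters share cameras.
    expanded = []
    for cluster in clusters:
        cut = []
        for (i, j), weight in edges.items():
            if (i in cluster) != (j in cluster):          # edge crosses the cut
                cut.append((weight, j if i in cluster else i))
        cut.sort(reverse=True)
        budget = max(1, int(overlap_ratio * len(cluster)))
        expanded.append(cluster | {cam for _, cam in cut[:budget]})
    return expanded

# Toy usage: six cameras in a chain, clusters of at most three cameras each;
# the two resulting clusters share the cameras along the broken edge.
toy_edges = {(0, 1): 120, (1, 2): 80, (2, 3): 60, (3, 4): 90, (4, 5): 150}
print(cluster_cameras(toy_edges, max_size=3))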
To achieve scalable distributed motion averaging, we build on this camera-cluster formulation to decouple the full motion averaging problem into several sub-problems, each expressed in its own local coordinate frame encoded by a similarity transformation and optimized independently in parallel. The sub-problems are then merged globally without caching the whole reconstruction in memory at once. The proposed distributed and robust framework complements the majority of state-of-the-art motion averaging approaches, improving their efficiency and robustness.
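The merging step can be pictured with a small sketch: once two sub-problems have been solved in their own coordinate frames, the similarity transformation (scale, rotation, translation) between those frames can be estimated from the camera centres shared through the cluster overlap and applied to bring one sub-reconstruction into the other's frame. The closed-form Umeyama-style alignment and the names used below are illustrative assumptions, not the exact estimator of the thesis.

# Merging two sub-reconstructions via a similarity transformation estimated
# from their shared cameras (Umeyama-style closed-form alignment).
import numpy as np

def estimate_similarity(src, dst):
    """Return s, R, t such that dst_i ~ s * R @ src_i + t for paired Nx3 points."""
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    x, y = src - mu_src, dst - mu_dst
    cov = y.T @ x / len(src)                      # 3x3 cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:  # enforce a proper rotation
        S[2, 2] = -1.0
    R = U @ S @ Vt
    var_src = (x ** 2).sum() / len(src)
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t

# Toy check: recover a known similarity from five shared camera centres.
rng = np.random.default_rng(0)
shared_in_a = rng.normal(size=(5, 3))             # centres in cluster A's frame
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
Q *= np.sign(np.linalg.det(Q))                    # make it a proper rotation
shared_in_b = 2.0 * shared_in_a @ Q.T + np.array([1.0, -2.0, 0.5])
s, R, t = estimate_similarity(shared_in_a, shared_in_b)
# Any point of cluster A can now be mapped into cluster B's frame: s * R @ p + t.
print(round(s, 3))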
As for large-scale bundle adjustment, eliminating statistical redundancy in multi-view geometry is of great importance to efficient 3D reconstruction. Our approach takes the full set of images with initial calibration and the recovered sparse 3D points as input, and selects a subset of views that preserves the accuracy and completeness of the final reconstruction.
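One way to picture such view selection, under assumptions that are ours rather than the thesis', is a greedy coverage scheme over the sparse points: keep adding the view that best covers points still observed by too few selected views, until every point reaches a minimum number of observations.

# Greedy view-selection sketch (illustrative, not the thesis algorithm): given
# which sparse 3D points each calibrated view observes, keep the view that
# covers the most under-observed points until every point is observed by at
# least min_obs selected views.
def select_views(visibility, min_obs=2):
    """visibility: dict {view_id: set(point_ids)} -> list of kept view ids."""
    need = {p: min_obs for pts in visibility.values() for p in pts}
    selected, remaining = [], set(visibility)

    def gain(view):
        return sum(need[p] > 0 for p in visibility[view])

    while remaining and any(n > 0 for n in need.values()):
        best = max(remaining, key=gain)
        if gain(best) == 0:
            break                      # no remaining view improves coverage
        for p in visibility[best]:
            if need[p] > 0:
                need[p] -= 1
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy usage: view 3 alone sees every point, so few extra views are kept.
toy = {0: {1, 2, 3}, 1: {2, 3, 4}, 2: {1, 4, 5}, 3: {1, 2, 3, 4, 5}}
print(select_views(toy, min_obs=2))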
Moreover, global bundle adjustment usually converges to a non-zero residual and produces sub-optimal camera poses in local areas, which leads to a loss of detail in high-resolution reconstruction. Instead of trying harder to optimize everything globally, we argue that we should accept the non-zero residual and adapt the camera poses to local areas. To this end, we propose a segment-based approach that readjusts the camera poses locally and improves the reconstruction of fine geometric details, significantly reducing the propagated errors and estimation biases introduced by the initial global adjustment.
Together, these techniques enable the first pipeline able to reconstruct highly accurate and consistent camera poses from more than one million high-resolution images in parallel, with state-of-the-art accuracy and robustness demonstrated on benchmark, Internet, and challenging city-scale datasets.