THESIS
2019
xiii, 109 pages : illustrations ; 30 cm
Abstract
Capturing 3D models of real-world environments from 2D images is a long-standing goal in computer
vision. The relevant 3D reconstruction techniques have become increasingly practical and
popular in recent years, thanks to significant improvements in computational power and
the rapid development of capture devices such as consumer cameras, mobile phones, and
drones. Surface reconstruction, one of the core steps in 3D reconstruction, recovers
the underlying geometry and is crucial to the fidelity of the result. Once the
reconstructed 3D model is represented as a mesh surface, semantic understanding of
meshes becomes desirable for many applications. In this thesis, we study and contribute to these two
problems.
First, we present methods for high-fidelity surface reconstruction. The term “high-fidelity”
has two senses here: topological accuracy and geometric accuracy.
Topological accuracy concerns the structural correctness and completeness
of the reconstructed surface. For example, thin structures are often lost in the reconstruction
because the point clouds are incomplete and noisy. To address this problem, we use a spatial
curve representation for thin and elongated structures, and present a novel surface reconstruction
method that uses both curves and point clouds. Geometric accuracy measures the overall
similarity between the reconstructed model and the ground-truth model, and can be optimized
by minimizing the reprojection error during surface refinement. This optimization is iterative and requires
repeated computation of gradients over all surface regions, which is the bottleneck that limits
the computational efficiency of the refinement. We therefore present a flexible and efficient
framework for mesh surface refinement in multi-view stereo, dubbed Adaptive Resolution
Control (ARC). ARC evaluates an optimal trade-off between geometric accuracy and
performance via curve analysis, and accelerates stereo refinement severalfold by culling
most insignificant regions, while maintaining a level of geometric detail similar to that
achieved by state-of-the-art methods.
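The abstract does not say how the curve analysis in ARC is carried out. As a rough illustration of the general idea only (culling regions that contribute little, with a cutoff read off a curve), the following sketch assumes each surface region already has a non-negative significance score, for instance a gradient magnitude. The function name `select_significant_regions`, the scores, and the knee criterion are all assumptions made for this example and are not taken from the thesis.

```python
import numpy as np

def select_significant_regions(scores):
    """Return indices of regions to keep refining.

    Hypothetical criterion (not from the thesis): sort regions by score,
    build the cumulative-score curve, and cut at its knee, i.e. the point
    farthest above the diagonal of the normalized curve.
    """
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)[::-1]            # most significant first
    total = scores.sum()
    if scores.size == 0 or total <= 0:
        return order[:0]                        # nothing worth refining
    # y: fraction of total score covered; x: fraction of regions processed.
    y = np.cumsum(scores[order]) / total
    x = np.arange(1, scores.size + 1) / scores.size
    knee = int(np.argmax(y - x))                # largest gain over uniform cost
    return order[: knee + 1]

# Two regions dominate the total score, so only they are kept.
kept = select_significant_regions([10, 0.1, 0.1, 9, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1])
print(sorted(kept.tolist()))
```

Under this assumed criterion, the x-axis acts as a proxy for computational cost and the y-axis as a proxy for retained accuracy, so the knee marks where processing further regions yields diminishing returns.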
Second, we present methods for semantic understanding of the reconstructed surface.
A mesh surface texture-mapped with images is a photo-realistic, standalone representation that
captures the real appearance of objects or scenes. We present a convolutional network architecture for direct
feature learning on mesh surfaces through their atlases of texture maps. Since the parameterization
of the texture map is unpredictable and depends on the surface topology, we introduce a
novel cross-atlas convolution that recovers the original geodesic neighborhood on the mesh and thereby achieves
invariance to arbitrary parameterization. The proposed module is integrated into classification
and segmentation architectures, which take the texture map of a mesh as input and infer
the output predictions.
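The abstract does not describe how the cross-atlas convolution is constructed. One generic way to make a convolution independent of the atlas layout is to gather each texel's neighbors through a lookup table rather than from fixed image offsets. The sketch below illustrates that idea under the assumption that such a table, here called `neighbor_index`, has already been computed from the mesh. The function name, array shapes, and the table itself are hypothetical and not taken from the thesis.

```python
import numpy as np

def gather_conv(features, neighbor_index, weights):
    """Convolution over neighborhoods given by an index table.

    features:       (N, C_in) per-texel features, flattened over the atlas.
    neighbor_index: (N, K) integers; row p lists the K texels that form the
                    neighborhood of texel p on the mesh, possibly located in
                    other charts (assumed precomputed).
    weights:        (K, C_in, C_out) kernel, one matrix per neighbor slot.
    Returns an (N, C_out) array.
    """
    gathered = features[neighbor_index]                 # (N, K, C_in)
    return np.einsum("nkc,kco->no", gathered, weights)  # sum over K and C_in

# Three texels, each with two neighbors; the second slot is weighted by 10.
features = np.array([[1.0], [2.0], [3.0]])
neighbor_index = np.array([[0, 1], [1, 2], [2, 0]])
weights = np.array([[[1.0]], [[10.0]]])
out = gather_conv(features, neighbor_index, weights)
print(out.ravel())
```

Because the output depends only on which texels are listed as neighbors, rearranging the charts in the atlas leaves the result unchanged as long as the table is updated consistently.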
In summary, this thesis provides methods for high-fidelity and efficient surface reconstruction, as
well as for semantic parsing of the reconstructed mesh surface. We successfully combine
these components into a pipeline that performs surface reconstruction and the subsequent semantic
parsing fully automatically.