3D imaging based on multiview video plus depth (MVD) : improved representation and seamless rendering

by Wenxiu Sun

THESIS 2014

Ph.D. Electronic and Computer Engineering

xviii, 131 pages : illustrations ; 30 cm

Abstract

Multiview video plus depth (MVD) is currently the most popular and widely accepted photo-realistic representation of a 3D scene. With the advent of consumer-level depth capturing sensors, 3D information such as dense depth maps can now be acquired cost-effectively from multiple viewpoints. A depth map constitutes a projection of the 3D geometry in the scene to a 2D image/video of fixed resolution. In this thesis, we explore advanced algorithms for improved representation and seamless rendering based on MVD.

Firstly, depth maps need to be denoised and compressed at the encoder, both for improved representation and for efficient network transmission to the decoder. We consider the denoising and compression problems jointly, arguing that doing so yields better overall performance than solving the two problems separately in two stages. Specifically, we formulate a rate-constrained estimation problem: given a set of observed noise-corrupted depth maps, we seek the most probable 3D surface, in the maximum a posteriori (MAP) sense, within a search space of surfaces whose representation size is no larger than a pre-specified rate constraint.
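As an illustration only, the rate-constrained estimation problem described above could be written in the following form. The symbols are assumptions introduced here and are not taken from the thesis: $S$ is a candidate 3D surface, $\mathcal{S}$ the search space of surfaces, $D_1, \dots, D_K$ the observed noisy depth maps, $R(S)$ the representation size of $S$, and $R_{\max}$ the pre-specified rate constraint.

```latex
\hat{S} \;=\; \arg\max_{S \in \mathcal{S}} \; \Pr\left(S \mid D_1, \dots, D_K\right)
\quad \text{subject to} \quad R(S) \le R_{\max}

% By Bayes' rule the posterior is proportional to likelihood times prior,
% so the same maximizer is obtained from:
\hat{S} \;=\; \arg\max_{S \in \mathcal{S},\; R(S) \le R_{\max}} \;
\Pr\left(D_1, \dots, D_K \mid S\right)\, \Pr(S)
```

In this reading, the likelihood term plays the role of denoising (it ties the surface to the noisy observations), while the constraint on $R(S)$ plays the role of compression, which is why the two tasks can be treated as a single problem.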

Secondly, we work on view synthesis, which is one of the typical rendering tasks. In view synthesis, new views of a scene are rendered from a number of images taken from given viewpoints. This is often called Depth-Image-Based Rendering (DIBR) when the 3D geometry is explicitly known in the form of a depth map. For the problem of synthesizing from stereo images, we apply geometry compensation and reliability-based blending to integrate the stereo views, thereby reducing visual artifacts. For the more challenging problem of synthesizing from a single image, we present a novel optimization approach named Visto, which uses one image plus one depth map to synthesize seamless (natural and visually pleasing) virtual views at nearby viewpoints. Visto addresses common challenges in DIBR, including inaccurate depth maps, occlusions, disocclusions (or holes), ringing artifacts, and unnaturally sharp edges, in an integrated manner by formulating the view synthesis problem as a joint optimization of inter-view texture and depth map similarity.
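To make the DIBR terminology above concrete, the following is a minimal sketch of plain forward warping from one image plus one depth map. It is a textbook baseline, not Visto and not the thesis's method. The function name `forward_warp`, the parameters `focal_px` and `baseline`, the rectified horizontal camera shift, and the sign convention `x' = x - focal_px * baseline / depth` are all assumptions made for illustration.

```python
def forward_warp(image, depth, focal_px, baseline):
    """Warp a grayscale image (list of rows) to a horizontally shifted camera.

    Hypothetical helper for illustration. Returns (warped, holes), where
    holes[y][x] is True for target pixels that no source pixel reached
    (disocclusions).
    """
    h, w = len(image), len(image[0])
    warped = [[0] * w for _ in range(h)]
    zbuf = [[float("inf")] * w for _ in range(h)]  # nearest depth seen so far

    for y in range(h):
        for x in range(w):
            z = depth[y][x]
            disparity = focal_px * baseline / z  # closer pixels shift more
            xt = int(round(x - disparity))       # assumed sign convention
            # Occlusion handling: keep only the closest surface per target pixel.
            if 0 <= xt < w and z < zbuf[y][xt]:
                zbuf[y][xt] = z
                warped[y][xt] = image[y][x]

    holes = [[z == float("inf") for z in row] for row in zbuf]
    return warped, holes


# One scanline: background at depth 4, a foreground object at depth 2.
image = [[10, 20, 30, 40, 50, 60]]
depth = [[4.0, 4.0, 4.0, 2.0, 2.0, 4.0]]
warped, holes = forward_warp(image, depth, focal_px=4.0, baseline=1.0)
print(warped)  # [[20, 40, 50, 0, 60, 0]]
print(holes)   # [[False, False, False, True, False, True]]
```

In the example, the foreground pixel with value 40 overwrites the background pixel with value 30 at the same target location (an occlusion), and two target pixels receive no source pixel at all (holes). These are exactly the kinds of artifacts that a DIBR method has to resolve.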

Finally, we address the temporal consistency problem that arises when synthesizing videos. While virtual images at nearby viewpoints can be synthesized using DIBR algorithms, directly extending such algorithms from images to videos by synthesizing each frame independently does not, in general, produce a visually pleasing result. A typical problem is the lack of content consistency across frames. To address this problem, we extend our DIBR algorithm, Visto, by regularizing the similarity across consecutive video frames based on a global motion assumption. In addition, this regularization alleviates the ill-posedness of the view synthesis problem, since neighboring frames can provide useful information for one another.
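One generic way to express such a temporal regularizer is sketched below. This is an assumed form for illustration, not the formulation used in the thesis: $V_t$ denotes the synthesized virtual frame at time $t$, $E_{\mathrm{view}}$ a per-frame view synthesis cost, $\mathcal{W}_t$ a warp induced by the assumed global motion between frames $t$ and $t+1$, and $\lambda \ge 0$ a weight. All of these symbols are hypothetical.

```latex
\min_{V_1, \dots, V_T} \;
\sum_{t=1}^{T} E_{\mathrm{view}}(V_t)
\;+\; \lambda \sum_{t=1}^{T-1}
\left\| V_{t+1} - \mathcal{W}_t(V_t) \right\|_2^2
```

Setting $\lambda = 0$ recovers independent per-frame synthesis. A positive $\lambda$ couples consecutive frames, so content that is missing or unreliable in one frame can be constrained by its neighbors.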


Copyrighted to the author. Reproduction is prohibited without the author’s prior written consent.

Details

Collection: HKUST Electronic Theses
Degree: Ph.D.
Department: Electronic and Computer Engineering
Supervisors: Au, Oscar C.
Authors: Sun, Wenxiu
Subjects: Three-dimensional imaging -- Data processing; Image processing -- Digital techniques
Language: English
Call number: Thesis ECED 2014 SunW
DOI: 10.14711/thesis-b1288923
