THESIS
2015
iv leaves, v-xii, 126 pages : illustrations ; 30 cm
Abstract
Digital images are discrete 2D signals that represent a captured 3D scene. Besides traditional RGB images, which record the photometric information of the scene, depth maps are a special type of grayscale image that records the geometric information of the scene, namely the distance from the captured 3D surface to the camera plane. For binary representation in computers, a continuous-amplitude pixel value is scalar-quantized into a limited number of amplitude levels, so that a high bit-depth (HBD) image becomes a low bit-depth (LBD) image. During image/video compression, a residual image block after prediction is linearly transformed into a matrix of transform coefficients. To achieve a high compression ratio, the transform coefficients are in turn scalar-quantized to reduce the signal entropy. Pixel-domain quantization and transform-domain quantization are major sources of distortion that degrade image fidelity.
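As a minimal illustration of pixel-domain quantization, the sketch below reduces a 16-bit image to 8 bits by uniform scalar quantization and measures the resulting distortion. The function names, the random test image, and the 16-to-8-bit setting are assumptions chosen for illustration; they are not taken from the thesis.

```python
import numpy as np

def quantize_bit_depth(image, high_bits, low_bits):
    """Uniform scalar quantization: keep only the top `low_bits` bits."""
    shift = high_bits - low_bits
    return image >> shift

def reconstruct_midpoint(lbd, high_bits, low_bits):
    """Naive de-quantization: map each LBD level to the middle of its bin."""
    shift = high_bits - low_bits  # assumed to be at least 1
    return (lbd << shift) + (1 << (shift - 1))

# Hypothetical 16-bit test image (random values, for illustration only).
rng = np.random.default_rng(0)
hbd = rng.integers(0, 2**16, size=(64, 64), dtype=np.int64)

lbd = quantize_bit_depth(hbd, 16, 8)      # 8-bit image
rec = reconstruct_midpoint(lbd, 16, 8)    # back on the 16-bit scale
mse = np.mean((hbd - rec) ** 2)           # pixel-domain quantization distortion
print("MSE of midpoint reconstruction:", mse)
```

Every reconstructed pixel lies within half a quantization step of its original value, and that bounded but nonzero error is the distortion that de-quantization methods try to reduce.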
The purpose of this research is to investigate effective and efficient algorithms for reducing quantization distortions (a problem also known as de-quantization) in modern image representations such as color+depth (RGB+D) images and multiview images. Besides the de-quantization problems, we also study the problem of saliency modeling in videos, whose purpose is to improve the accuracy of predicting human visual attention while a video is being watched.
In Chapter 2, we study the image bit-depth enhancement problem, where an HBD image is reconstructed from its quantized LBD version. Unlike previous work that aims to improve the visual quality of images, we explicitly formulate the bit-depth enhancement problem from a minimum mean-square-error (MMSE) perspective. By decomposing the image into AC and DC components, we propose a new bit-depth enhancement algorithm that achieves a good trade-off between reconstruction quality and computational complexity.
In Chapter 3, we study the HBD image acquisition problem, where an HBD image is quantized to LBD and then reconstructed. The task is to design both the quantization and the reconstruction schemes so as to reduce the pixel-domain quantization distortions of the acquired signal. To this end, we propose a new image acquisition framework. The key intuition is that natural images have strong inter-pixel correlations in local regions, so it is beneficial to reconstruct a pixel value using the quantized values of its neighboring pixels.
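The intuition about neighboring pixels can be illustrated with a toy 1D example. The sketch below is not the framework proposed in the thesis. It assumes a smooth ramp signal, a plain moving average over the quantized neighbors, and a final clip that keeps each sample inside its own quantization bin. All names and parameter values are hypothetical.

```python
import numpy as np

def quantize(x, step):
    """Uniform scalar quantizer returning integer bin indices."""
    return np.floor(x / step).astype(np.int64)

def midpoint_reconstruct(q, step):
    """Per-pixel reconstruction that ignores neighbors."""
    return q * step + step / 2.0

def neighbor_reconstruct(q, step, window):
    """Average the midpoints of neighboring samples (odd `window`),
    then clip so each sample stays inside its own quantization bin."""
    mid = midpoint_reconstruct(q, step)
    half = window // 2
    padded = np.pad(mid, half, mode="edge")
    kernel = np.ones(window) / window
    smooth = np.convolve(padded, kernel, mode="valid")
    return np.clip(smooth, q * step, (q + 1) * step)

# Assumed smooth test signal: a linear ramp standing in for a locally
# correlated image row.
x = np.linspace(0.0, 255.0, 1024)
step = 32.0
q = quantize(x, step)

midpoint = midpoint_reconstruct(q, step)
neighbor = neighbor_reconstruct(q, step, window=129)

mse_midpoint = np.mean((x - midpoint) ** 2)
mse_neighbor = np.mean((x - neighbor) ** 2)
print("midpoint MSE:", mse_midpoint, "neighbor-aware MSE:", mse_neighbor)
```

On this smooth signal the neighbor-aware reconstruction has a lower error than the per-pixel midpoint, because averaging across bin boundaries recovers the slope that quantization flattened. Real images contain edges and texture, where a plain average would blur detail, so this should be read only as a demonstration of the intuition.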
In Chapter 4, we study the problem of saliency modeling and present the first work in the literature to introduce 3D motion into bottom-up saliency modeling of RGB+D videos. We first propose an efficient 3D motion estimation algorithm that computes a 3D motion vector (3DMV) for each sub-block in the frame. Using the computed 3DMVs, we then derive several saliency-indicating channels, which are incorporated into a widely accepted bottom-up saliency model.
In Chapter 5, we investigate the problem of enhancing the precision of a 3D surface represented by multiview depth maps. The multiview depth maps are distorted by the quantization of transform coefficients, so the problem is to reconstruct the multiview depth maps from their compressed versions. Instead of solving the transform-domain de-quantization problem separately for each view, we propose to reconstruct the multiview depth maps jointly. The multiview depth maps output by our proposed algorithm are valid 3D surface representations that satisfy the quantization constraints, and their distortions are effectively reduced.
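To make the notion of quantization constraints concrete, the sketch below projects a candidate block onto the set of blocks whose transform coefficients fall in the same quantization bins as the compressed data. It assumes an orthonormal 8x8 DCT and a uniform rounding quantizer with a single step size. The thesis may use a different transform and quantizer, and the joint multiview reconstruction itself is not shown. All names are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def project_onto_quantization_constraints(block, indices, step):
    """Clip each DCT coefficient of `block` into the bin given by `indices`
    (assumes a uniform rounding quantizer), then transform back."""
    c = dct_matrix(block.shape[0])
    coeffs = c @ block @ c.T
    lower = (indices - 0.5) * step
    upper = (indices + 0.5) * step
    return c.T @ np.clip(coeffs, lower, upper) @ c

# Hypothetical data: a random block stands in for a depth-map block.
rng = np.random.default_rng(0)
original = rng.uniform(0.0, 255.0, size=(8, 8))
c = dct_matrix(8)
step = 16.0
indices = np.round(c @ original @ c.T / step)   # what the decoder receives
decoded = c.T @ (indices * step) @ c            # standard decoding

# A candidate estimate, e.g. the output of some enhancement step (simulated
# here by adding noise), may violate the constraints; the projection fixes it.
candidate = decoded + rng.normal(0.0, 4.0, size=(8, 8))
projected = project_onto_quantization_constraints(candidate, indices, step)
```

After the projection, re-quantizing the block reproduces the received indices (up to ties at bin boundaries), which is what it means for a reconstruction to satisfy the quantization constraints.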