THESIS
2016
xxi, 1, 123 pages : illustrations ; 30 cm
Abstract
The application and use of multimedia signals such as images, video, and sound
have increased immensely in daily life. These visual signals are contaminated by
several varieties of distortion during the acquisition, compression, transmission,
and/or display of the signal on screens. Human vision is the ultimate receiver
of these multimedia signals. Consequently, visual-perception-based Image Quality
Assessment (IQA) and Just Noticeable Difference (JND) estimation have become
important, as they can predict signal quality and highlight regions of importance
in a way that is compatible with human vision. With this view, in this thesis we
investigate how human vision perceives information and how it can be modeled.
The resulting human vision models are used for image quality assessment and just
noticeable difference estimation. The main contributions of this thesis are three
visual-perception-based IQA and JND algorithms, together with an application of
perceptual IQA metrics to generic image reconstruction. The four works are
summarized below.
The first work focuses on the estimation of Just Noticeable Difference for natural
images. Contrast Sensitivity (CS), Luminance Adaptation (LA), and Contrast
Masking (CM) are important contributing factors for JND in images. Most
existing pixel-domain JND algorithms are based only on LA and CM and cannot
incorporate CS during JND estimation. Research shows that human vision depends
significantly on CS, yet an underlying assumption in the existing algorithms is
that CS cannot be estimated in the pixel domain. For natural images, however,
this assumption does not hold: recent studies on human vision suggest that CS
can be estimated via the Root Mean Square (RMS) contrast in the pixel domain.
With this perspective, we propose the first pixel-based JND algorithm that
includes this very important component of human vision, namely CS, by measuring
RMS contrast. We also propose a feedback mechanism, based on the relationship
between CS and RMS contrast, to alleviate the under- and over-estimation of
contrast masking. Experiments validate that the proposed JND algorithm matches
human perception well and produces significantly better results than existing
pixel-domain JND algorithms.
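The RMS contrast measurement mentioned above can be sketched as follows. This is a minimal illustration, not the thesis's exact formulation: the function names, the luminance normalization, and the 8x8 block size are assumptions made for the example.

```python
import numpy as np

def rms_contrast(patch: np.ndarray) -> float:
    """RMS contrast of a grayscale patch: std. dev. of normalized luminance."""
    lum = patch.astype(np.float64) / 255.0
    return float(lum.std())

def local_rms_contrast(img: np.ndarray, win: int = 8) -> np.ndarray:
    """Per-block RMS contrast map over non-overlapping win x win blocks.

    In a CS-aware JND model, a map like this could modulate the JND
    threshold at each block (higher contrast -> different sensitivity).
    """
    h, w = img.shape
    h_b, w_b = h // win, w // win
    out = np.empty((h_b, w_b))
    for i in range(h_b):
        for j in range(w_b):
            block = img[i * win:(i + 1) * win, j * win:(j + 1) * win]
            out[i, j] = rms_contrast(block)
    return out
```

A uniform region yields zero RMS contrast, while a high-frequency pattern such as alternating black/white rows yields the maximum of 0.5 for equal-area halves.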
In our second work, we propose to use visual-perception-based IQA metrics
for the purpose of reconstruction. Most image reconstruction algorithms in the
previous literature are application-specific and generalize poorly because they
require parameter tuning and assume an unknown level of distortion in the signal.
To address this problem, we propose an efficient, perceptually motivated,
Maximum a Posteriori (MAP) based generic framework for image reconstruction.
The proposed algorithm can be applied in a wide variety of applications where
the goal is to improve edge accuracy or suppress visible artifacts. Recent
research in the IQA area suggests that gradient magnitudes are generally
insensitive to moderate levels of noise, and we propose to utilize this property
to find pixels with similar edge semantics in the neighborhood. With this view,
we incorporate gradient-magnitude-based IQA metrics into the MAP formulation to
enhance reconstruction accuracy. The proposed generic algorithm (without the
need to manually tune any parameters) is shown to produce better (and in a few
cases, competitive) reconstruction quality compared to state-of-the-art
application-specific algorithms for most image processing applications.
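The gradient-magnitude similarity idea behind such IQA metrics can be sketched as below. This is an illustrative version only: simple central differences (via `np.gradient`) stand in for the Prewitt/Scharr filters typically used, and the stability constant `c` is a common choice, not necessarily the thesis's.

```python
import numpy as np

def gradient_magnitude(img: np.ndarray) -> np.ndarray:
    """Per-pixel gradient magnitude via central finite differences."""
    gy, gx = np.gradient(img.astype(np.float64))
    return np.hypot(gx, gy)

def gms_map(ref: np.ndarray, dist: np.ndarray, c: float = 170.0) -> np.ndarray:
    """Gradient-magnitude similarity map in [0, 1]; 1 means identical edges.

    The constant c stabilizes the ratio where both gradients are small.
    """
    g_r = gradient_magnitude(ref)
    g_d = gradient_magnitude(dist)
    return (2.0 * g_r * g_d + c) / (g_r ** 2 + g_d ** 2 + c)
```

Pixels where the similarity stays near 1 under distortion have stable edge semantics, which is the property the MAP framework exploits when selecting neighboring pixels.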
The third work discusses the quality assessment of screen content images.
In this work, we address issues associated with free-energy-principle-based IQA
algorithms for objectively assessing the quality of Screen Content (SC) images.
The existing IQA algorithms do not give sufficient emphasis to the textual
regions in SC images and assume that these regions do not contribute to the
quality of an SC image. This, however, is at odds with how human vision works:
because our eyes are well trained to discern text in daily life, human vision
has prior information about text regions and can sense small distortions in
them. With this view, we propose a new reduced-reference IQA algorithm for SC
images based on a more perceptually relevant prediction model, which overcomes
the above problem by giving more emphasis to the textual regions. Experiments
validate that the proposed algorithm estimates the quality of SC images more
accurately than recently developed reduced-reference and even full-reference
IQA algorithms.
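The region-emphasis idea can be illustrated with a simple weighted pooling step. This is only a sketch of the general principle, not the thesis's prediction model: the binary text mask, the weight value, and the function name are all assumptions for the example.

```python
import numpy as np

def weighted_quality(dist_map: np.ndarray, text_mask: np.ndarray,
                     text_weight: float = 4.0) -> float:
    """Pool a per-pixel distortion map, counting text pixels more heavily.

    dist_map:  per-pixel distortion (e.g., a prediction-error map)
    text_mask: boolean map marking textual regions (assumed given here)
    """
    w = np.where(text_mask, text_weight, 1.0)
    return float((dist_map * w).sum() / w.sum())
```

With this pooling, the same amount of distortion lowers the predicted quality more when it falls on text, matching the observation that viewers notice small errors in textual regions first.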
The fourth research work is related to quality assessment of depth-image-based
rendering (DIBR)-synthesized views. Free-viewpoint video (FVV) is synthesized
via the DIBR procedure in a `blind' environment (i.e., without reference
images), so a blind quality evaluation and monitoring system is urgently
required. FVV images are used in several technologies, such as virtual reality,
augmented reality, and mixed reality. The existing assessment metrics do not
reflect human judgments faithfully, mainly because of the geometric distortions
generated by DIBR. To this end, this work proposes a novel referenceless
quality metric for DIBR-synthesized images using an auto-regression (AR) based
local image descriptor. We found that, after AR prediction, the reconstruction
error between a DIBR-synthesized image and its AR-predicted image can
accurately capture the geometric distortion. Experiments validated the
superiority of our no-reference quality method over prevailing full-reference
approaches.
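A minimal sketch of an AR local predictor is given below, assuming each pixel is predicted as a least-squares linear combination of its 8 neighbors with one global coefficient vector; the thesis's descriptor may fit coefficients locally and use different neighborhoods, so treat every choice here as an illustrative assumption.

```python
import numpy as np

def ar_residual(img: np.ndarray) -> np.ndarray:
    """Absolute AR prediction error per interior pixel.

    Smooth natural structure is predicted well (small residual), while
    DIBR-style geometric breaks violate the AR model (large residual).
    """
    x = img.astype(np.float64)
    h, w = x.shape
    # Stack the 8-neighborhoods of all interior pixels into a design matrix.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    cols = [x[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx].ravel()
            for dy, dx in offsets]
    A = np.stack(cols, axis=1)              # (N, 8) neighbor values
    y = x[1:-1, 1:-1].ravel()               # (N,) center pixels
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # global AR coefficients
    res = np.zeros_like(x)
    res[1:-1, 1:-1] = np.abs(y - A @ coef).reshape(h - 2, w - 2)
    return res
```

On a flat or linearly varying image the AR model predicts perfectly and the residual is zero; a geometric discontinuity leaves a visible ridge in the residual map, which is what the no-reference metric pools into a quality score.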