THESIS
2018
xiv, 104 pages : color illustrations ; 30 cm
Abstract
The availability of large-scale visual data is increasingly inspiring sophisticated algorithms to
process, understand and augment these resources. Particularly, with the rapid advancement
of the latest data-driven techniques, researchers have demonstrated exciting progress in a wide
range of visual synthesis applications, drawing closer to the day that high-quality visual creation
techniques are accessible to non-expert users. However, due to the lack of specific domain
knowledge, the variety of target subjects, and the complexity of human perception, many
visual synthesis problems remain challenging. In this dissertation, we focus on algorithms
for synthesizing both image color effects and video motion behaviors, helping to create
context-consistent and photorealistic visual content by leveraging large-scale visual data.
First, we propose an image algorithm to transfer photo color style from one image to another
based on semantically meaningful dense correspondence. To achieve accurate color transfer
results that respect the semantic relationship between image content, our algorithm leverages
the features learned by a deep neural network to build the dense correspondence. Meanwhile,
it optimizes local linear color models to enforce both local and global consistency. Semantic
matching and color models are jointly optimized in a coarse-to-fine manner. This approach
is further extended from "one-to-one" to "one-to-many" color transfer, introducing more
reference candidates to boost matching reliability.
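The local linear color models mentioned above can be illustrated with a minimal sketch: for each pair of matched patches, fit a per-channel linear map from source colors to reference colors by least squares. The function name, patch representation, and regularization below are illustrative assumptions, not the thesis's implementation; in the dissertation these models are optimized jointly with the semantic matching, coarse to fine.

```python
import numpy as np

def fit_local_linear_model(src_patch, ref_patch, eps=1e-4):
    """Fit a per-channel linear map ref ~ a * src + b over a matched patch.

    src_patch, ref_patch: (N, 3) arrays of corresponding pixel colors.
    Returns (a, b), each of shape (3,). eps regularizes nearly flat patches.
    """
    a = np.empty(3)
    b = np.empty(3)
    for c in range(3):
        x, y = src_patch[:, c], ref_patch[:, c]
        # Least-squares slope: cov(x, y) / var(x), with regularization.
        a[c] = ((x - x.mean()) * (y - y.mean())).mean() / (x.var() + eps)
        b[c] = y.mean() - a[c] * x.mean()
    return a, b
```

Fitting such a model per local patch, and penalizing disagreement between neighboring models, is one way to obtain the local and global consistency the abstract refers to.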
However, exemplar-based color synthesis applications, including color transfer and colorization,
still struggle with image pairs that share little related content, and the "one-to-many"
method above is not a practical solution in that setting. We therefore take advantage of deep
neural networks to predict consistent chrominance across the whole image, including mismatched
elements, and achieve robust single-reference image colorization. Specifically, rather
than using handcrafted rules as in traditional exemplar-based methods, we design an end-to-end
colorization network that learns how to select, propagate, and predict colors from a
large-scale dataset. This network generalizes well even when the reference image is
unrelated to the input grayscale image.
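The "select" step can be illustrated with a handcrafted baseline: for each target pixel, copy the chrominance of the most feature-similar reference pixel. This sketch is an assumption for illustration only; the thesis replaces such fixed rules with a network trained end to end.

```python
import numpy as np

def select_reference_colors(tgt_feats, ref_feats, ref_ab):
    """Illustrative 'select' step: for each target pixel feature, copy the
    chrominance (ab channels) of the most similar reference pixel.

    tgt_feats: (Nt, D) target features, ref_feats: (Nr, D) reference
    features, ref_ab: (Nr, 2) reference chrominance. Returns (Nt, 2).
    """
    # Cosine similarity between target and reference features.
    t = tgt_feats / (np.linalg.norm(tgt_feats, axis=1, keepdims=True) + 1e-8)
    r = ref_feats / (np.linalg.norm(ref_feats, axis=1, keepdims=True) + 1e-8)
    idx = (t @ r.T).argmax(axis=1)
    return ref_ab[idx]
```

A hard nearest-neighbor lookup like this fails exactly where the abstract notes: on mismatched elements with no good reference counterpart, which is what motivates learning to propagate and predict colors instead.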
Finally, besides synthesizing static images, we also explore video synthesis techniques by
processing large-scale captures and manipulating their dynamism. We present an approach to
create wide-angle, high-resolution looping panoramic videos. Starting with hundreds of registered
videos acquired on a robotic mount, we formulate a combinatorial optimization to determine
for each output pixel the source video and looping parameters that jointly maximize
spatiotemporal consistency. Optimizing over such large video data is challenging. We accelerate
the optimization by reducing the set of source labels with a graph-coloring scheme,
parallelizing the computation, and implementing it out-of-core. These techniques are combined
to create gigapixel-sized looping panoramas.
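The looping parameters can be illustrated with a toy search over a single pixel's time series: choose a start frame and period that minimize the temporal seam cost between the loop's endpoints. This is a deliberately simplified sketch under assumed names; the thesis solves a joint combinatorial optimization over all output pixels, with spatial consistency terms this per-pixel version ignores.

```python
def best_loop_params(pixel_series, min_period=8):
    """For one pixel's intensity time series, pick (start, period) that
    minimizes the seam cost |V[s] - V[s + p]| between loop endpoints.

    Returns (start, period, cost). min_period avoids degenerate short loops.
    """
    T = len(pixel_series)
    best = (0, min_period, float("inf"))
    for p in range(min_period, T):
        for s in range(0, T - p):
            cost = abs(pixel_series[s] - pixel_series[s + p])
            if cost < best[2]:
                best = (s, p, cost)
    return best
```

On a truly periodic signal this recovers a zero-cost loop; on real footage the per-pixel costs feed a global labeling problem, which is where the graph-coloring label reduction and out-of-core implementation become necessary at gigapixel scale.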