THESIS
2018
xiv, 104 pages : color illustrations ; 30 cm
Abstract
The availability of large-scale visual data is increasingly inspiring sophisticated algorithms to
process, understand and augment these resources. Particularly, with the rapid advancement
of the latest data-driven techniques, researchers have demonstrated exciting progress in a wide
range of visual synthesis applications, drawing closer to the day that high-quality visual creation
techniques are accessible to non-expert users. However, due to the lack of specific domain
knowledge, the variety of target subjects, and the complexity of human perception, many
visual synthesis problems remain challenging. In this dissertation, we focus on algorithms
for synthesizing both image color effects and video motion behaviors, helping to create
context-consistent and photorealistic visual content by leveraging large-scale visual data.
First, we propose an image algorithm to transfer photo color style from one image to another
based on semantically meaningful dense correspondence. To achieve accurate color transfer
results that respect the semantic relationship between image content, our algorithm leverages
the features learned by a deep neural network to build the dense correspondence. Meanwhile,
it optimizes local linear color models to enforce both local and global consistency. Semantic
matching and color models are jointly optimized in a coarse-to-fine manner. This approach
is further extended from "one-to-one" to "one-to-many" color transfer, introducing more
reference candidates to boost matching reliability.
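The local linear color models mentioned above can be illustrated with a minimal sketch: for each pair of matched patches, fit a per-channel linear map from source colors to reference colors by least squares. The function name, patch representation, and regularization below are illustrative assumptions, not the thesis's implementation; in the dissertation these models are optimized jointly with the semantic matching, coarse to fine.

```python
import numpy as np

def fit_local_linear_model(src_patch, ref_patch, eps=1e-4):
    """Fit a per-channel linear map ref ~ a * src + b over a matched patch.

    src_patch, ref_patch: (N, 3) arrays of corresponding pixel colors.
    Returns (a, b), each of shape (3,). eps regularizes nearly flat patches.
    """
    a = np.empty(3)
    b = np.empty(3)
    for c in range(3):
        x, y = src_patch[:, c], ref_patch[:, c]
        # Least-squares slope: cov(x, y) / var(x), with regularization.
        a[c] = ((x - x.mean()) * (y - y.mean())).mean() / (x.var() + eps)
        b[c] = y.mean() - a[c] * x.mean()
    return a, b
```

Fitting such a model per local patch, and penalizing disagreement between neighboring models, is one way to obtain the local and global consistency the abstract refers to.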
However, exemplar-based color synthesis applications, including color transfer and colorization,
still struggle with image pairs that share little related content, and the "one-to-many"
method above is not a practical solution in that setting. We therefore take advantage of deep
neural networks to predict consistent chrominance across the whole image, including mismatched
elements, and achieve robust single-reference image colorization. Specifically, rather
than using handcrafted rules as in traditional exemplar-based methods, we design an end-to-end
colorization network that learns how to select, propagate, and predict colors from a
large-scale dataset. This network generalizes well even when the reference image is
unrelated to the input grayscale image.
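The "select" step can be illustrated with a handcrafted baseline: for each target pixel, copy the chrominance of the most feature-similar reference pixel. This sketch is an assumption for illustration only; the thesis replaces such fixed rules with a network trained end to end.

```python
import numpy as np

def select_reference_colors(tgt_feats, ref_feats, ref_ab):
    """Illustrative 'select' step: for each target pixel feature, copy the
    chrominance (ab channels) of the most similar reference pixel.

    tgt_feats: (Nt, D) target features, ref_feats: (Nr, D) reference
    features, ref_ab: (Nr, 2) reference chrominance. Returns (Nt, 2).
    """
    # Cosine similarity between target and reference features.
    t = tgt_feats / (np.linalg.norm(tgt_feats, axis=1, keepdims=True) + 1e-8)
    r = ref_feats / (np.linalg.norm(ref_feats, axis=1, keepdims=True) + 1e-8)
    idx = (t @ r.T).argmax(axis=1)
    return ref_ab[idx]
```

A hard nearest-neighbor lookup like this fails exactly where the abstract notes: on mismatched elements with no good reference counterpart, which is what motivates learning to propagate and predict colors instead.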
Finally, besides synthesizing static images, we also explore video synthesis techniques by
processing large-scale captures and manipulating their dynamism. We present an approach to
create wide-angle, high-resolution looping panoramic videos. Starting with hundreds of registered
videos acquired on a robotic mount, we formulate a combinatorial optimization to determine
for each output pixel the source video and looping parameters that jointly maximize
spatiotemporal consistency. Optimizing over such large video data is challenging. We accelerate
the optimization by reducing the set of source labels with a graph-coloring scheme,
parallelizing the computation, and implementing it out-of-core. These techniques are combined
to create gigapixel-sized looping panoramas.
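The looping parameters can be illustrated with a toy search over a single pixel's time series: choose a start frame and period that minimize the temporal seam cost between the loop's endpoints. This is a deliberately simplified sketch under assumed names; the thesis solves a joint combinatorial optimization over all output pixels, with spatial consistency terms this per-pixel version ignores.

```python
def best_loop_params(pixel_series, min_period=8):
    """For one pixel's intensity time series, pick (start, period) that
    minimizes the seam cost |V[s] - V[s + p]| between loop endpoints.

    Returns (start, period, cost). min_period avoids degenerate short loops.
    """
    T = len(pixel_series)
    best = (0, min_period, float("inf"))
    for p in range(min_period, T):
        for s in range(0, T - p):
            cost = abs(pixel_series[s] - pixel_series[s + p])
            if cost < best[2]:
                best = (s, p, cost)
    return best
```

On a truly periodic signal this recovers a zero-cost loop; on real footage the per-pixel costs feed a global labeling problem, which is where the graph-coloring label reduction and out-of-core implementation become necessary at gigapixel scale.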