THESIS
2021
1 online resource (x, 44 pages) : color illustrations
Abstract
Internal methods, which rely on the visual information contained in a single image, have
been widely studied in the computer vision community. Deep internal learning has recently come
into the limelight, especially because it removes the need to collect large-scale datasets, which
usually requires intensive human labor for labelling. However, many obstacles to the practical
use of deep internal learning remain. For example, most existing deep internal methods either
(1) are image-specific or task-specific, or (2) require long training time.
In this thesis, we push the limits of deep internal learning by proposing SinIR, a reconstruction-based
framework trained on a single image for general image manipulation. SinIR is trained with
cascaded multi-scale learning, where the network at each scale is responsible for image
reconstruction. Because its training objective is reconstruction, SinIR trains much faster and
more robustly. However, naively using reconstruction leads to unsatisfactory visual quality
due to its innate characteristics. To mitigate this problem, we apply random pixel shuffling,
a simple technique inspired by the Denoising Autoencoder that effectively enriches the learning
process. SinIR solves various computer vision problems, including super-resolution,
editing, harmonization, paint-to-image, photo-realistic style transfer, and artistic style transfer.
Quantitative evaluation shows that SinIR performs competitively with dedicated external
methods. SinIR is also found to train 33.5 times faster than SinGAN, which solves similar
tasks (for 500 x 500 images).
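
The abstract names random pixel shuffling but does not say how it is carried out. The following is a minimal sketch of one possible reading, not the thesis's implementation: a random subset of pixel positions is chosen and their values are permuted among themselves, giving a corrupted input in the spirit of the Denoising Autoencoder. The function name random_pixel_shuffle, the fraction parameter, and the nested-list image representation are all assumptions made for illustration.

```python
import random


def random_pixel_shuffle(pixels, fraction=0.1, seed=None):
    """Return a copy of `pixels` with a random subset of pixels shuffled.

    `pixels` is a list of rows; each row is a list of pixel values
    (for example RGB tuples). `fraction` is the share of pixel positions
    whose values are permuted among themselves. Both the name and the
    parameter are hypothetical, chosen for this sketch only.
    """
    rng = random.Random(seed)
    height, width = len(pixels), len(pixels[0])

    # Flatten to a new list so the caller's image is left untouched.
    flat = [value for row in pixels for value in row]

    # Pick the positions that take part in the shuffle.
    count = round(fraction * len(flat))
    chosen = rng.sample(range(len(flat)), count)

    # Permute the values found at the chosen positions.
    sources = chosen[:]
    rng.shuffle(sources)
    values = [flat[i] for i in sources]
    for target, value in zip(chosen, values):
        flat[target] = value

    # Restore the original row layout.
    return [flat[r * width:(r + 1) * width] for r in range(height)]
```

In a reconstruction setup of the kind the abstract describes, the shuffled image would presumably serve as the network input and the unshuffled image as the target; that pairing is also an assumption here.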