Deep pixel correspondence learning in images and videos

HKUST Electronic Theses

by Xiaoyu Li

THESIS 2021

Ph.D. Electronic and Computer Engineering

1 online resource (xvii, 129 pages) : illustrations (some color)

Abstract

Finding pixel correspondences means matching the pixels of one image to those of a counterpart image. It is a fundamental problem in both the computer vision and graphics communities, with a variety of applications in images and videos, such as image morphing, image stitching, and frame interpolation. Classical approaches to finding correspondences rely on hand-crafted feature descriptors and matching strategies. Although they have achieved impressive results, estimating correspondences in complicated applications remains challenging. Recently, deep learning methods have taken center stage as data scale and computing power have increased dramatically with advances in computer hardware and software infrastructure. These methods offer a fresh approach to existing problems. However, they usually assume that the input images to be matched are regular, readily available images, which is not true for many applications. Therefore, in this dissertation, we explore how pixel correspondences can be established under more challenging scenarios to solve practical problems in images and videos.

We first propose a method for image distortion correction that employs convolutional neural networks to predict the correspondences between distorted input images and corrected output images. Such a framework can potentially provide solutions to different types of geometric distortions. Our first work focuses on distortions governed by a global transformation model, which regularizes the pixel correspondences. To solve more complex, spatially varying distortions such as document image deformation, we propose a patch-based approach followed by stitching and illumination correction, which significantly improves overall accuracy on both synthetic and real datasets.

In addition to solving problems based on pixel correspondences in images, we also explore inter-frame correspondences in videos. We use cross-domain correspondences to solve the cartoon video inbetweening problem by fetching color information from two input keyframes while following the animated motion guided by a user sketch. Another direction we pursue in videos is finding pixel correspondences in degraded frames for video restoration. We propose a flow completion method that restores the correspondences before they are used. Benefiting from this module, our approach is able to restore dirty-lens videos with various contaminants.

Copyrighted to the author. Reproduction is prohibited without the author’s prior written consent.

Details

Collection: HKUST Electronic Theses
Degree: Ph.D.
Department: Electronic and Computer Engineering
Supervisors: Sander, Pedro V.
Authors: Li, Xiaoyu
Subjects: Image processing -- Digital techniques; Video compression; Digital video; Computer vision
Language: English
Call number: Thesis ECE 2021 Li
DOI: 10.14711/thesis-991012986098203412
