THESIS
2022
1 online resource (xv, 114 pages) : illustrations (some color)
Abstract
Cameras are of vital importance for autonomous robots thanks to their light weight,
rich information, and low power consumption. Over the past decades, many researchers
have focused on using camera systems to perceive the geometric structure of environments.
Binocular stereo vision is the most popular solution, featuring a simple structure
and an instantaneous perception capability. Although binocular stereo systems
have been widely used in robotic systems to generate depth maps and build
3D maps, several specific problems remain to be explored. In this thesis, we
present a group of methods to improve the performance of stereo-based depth estimation
and map fusion. We first propose a method to estimate depth under motion from a single
pair of rolling-shutter stereo images. Our method applies a novel cost-volume construction
for rolling-shutter image pairs, which adapts the depth candidates of every pixel to the
change in baseline length. Building on deep learning, we design a convolutional
neural network that decomposes scenes into depth and reflectance from a nighttime stereo
pair. Benefiting from the representational power of 12-bit raw images, the learned network
produces smooth depth maps at night. After building the experimental platform, we
propose a novel method to calibrate cameras with inconsistent imaging capabilities.
Since map fusion takes camera poses as input, we also explore
how to use advanced deep learning techniques to enhance traditional direct
alignment. Furthermore, we propose a learning-based method to estimate scene flow
from two occluded point clouds, which retains the motion information of all spatial
points in dynamic scenes. Lastly, most traditional local-map fusion algorithms delete
inconsistent points (dynamic objects) from the fused map, resulting in the loss of much
useful data. We discuss how to construct a 4D map in the "Conclusion
and Future Work" chapter; the constructed map should contain both the fused local map
and the estimated dynamic scene flow information.
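The abstract's central idea for rolling-shutter stereo, adapting depth candidates to a baseline that changes per pixel, can be illustrated with a minimal plane-sweep sketch in which the baseline varies with the image row under an assumed constant lateral velocity during readout. All names, the linear motion model, and the SAD matching cost below are illustrative assumptions, not the thesis's actual implementation.

```python
import numpy as np

def rolling_shutter_cost_volume(left, right, focal, base0, vx, line_time, depths):
    """Plane-sweep cost volume with a row-dependent stereo baseline.

    Assumed (hypothetical) motion model: the camera translates laterally at
    velocity vx while rows are read out, so row y sees an effective baseline
    base0 + vx * line_time * y.

    left, right : (H, W) grayscale images
    focal       : focal length in pixels
    base0       : baseline at the first row (metres)
    vx          : lateral camera velocity (m/s)
    line_time   : readout time per row (s)
    depths      : iterable of candidate depths (metres)
    """
    H, W = left.shape
    depths = list(depths)
    cost = np.full((len(depths), H, W), np.inf)
    rows = np.arange(H)
    # Effective baseline for each row under the assumed linear motion.
    baselines = base0 + vx * line_time * rows            # (H,)
    for k, d in enumerate(depths):
        # Per-row disparity for this depth candidate: disp = f * B(row) / d.
        disp = np.rint(focal * baselines / d).astype(int)  # (H,)
        for y in range(H):
            s = disp[y]
            if 0 <= s < W:
                # Sum-of-absolute-differences cost for pixels with a valid match.
                cost[k, y, s:] = np.abs(left[y, s:] - right[y, :W - s])
    return cost
```

The per-pixel depth estimate would then be the `argmin` over the depth axis of this volume; a global-shutter sweep is recovered by setting `vx = 0`, in which case every row uses the same baseline.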