THESIS
2024
1 online resource (1 unnumbered page, xiii, 94 pages) : illustrations (chiefly color)
Abstract
3D perception serves as a cornerstone in the realm of autonomous driving. Vision-based
3D perception methods, which rely solely on camera inputs to reconstruct a 3D
environment, have seen significant advancements due to the proliferation of deep learning
techniques. Despite these strides, existing frameworks still encounter performance bottlenecks
and often necessitate substantial amounts of LiDAR-annotated data, limiting their practical deployment at scale across diverse autonomous driving platforms.
This dissertation is a multifaceted contribution to the advancement of vision-based
3D perception technologies. In the first segment, the thesis introduces structural enhancements
to both monocular and stereo 3D object detection algorithms. By integrating
ground-referenced geometric priors into monocular detection models, this research achieves state-of-the-art accuracy on monocular 3D detection benchmarks. Concurrently, the work refines stereo 3D detection by incorporating insights and inference structures drawn from monocular networks, thereby improving the efficiency of stereo detection systems.
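As a rough illustration of the geometry that ground-referenced priors typically exploit (not the thesis's exact formulation), the sketch below recovers the depth of an object's ground-contact point from its image row under a flat-ground, zero-pitch pinhole assumption; the intrinsics and camera height are assumed placeholder values.

import numpy as np

def ground_depth_from_row(v_bottom, fy, cy, cam_height):
    """Depth of a ground-contact pixel at image row v_bottom.

    Assumes a pinhole camera mounted cam_height metres above a flat, level
    ground plane with zero pitch/roll, giving z = fy * cam_height / (v - cy).
    A generic ground-plane prior, not the dissertation's exact model.
    """
    offset = np.asarray(v_bottom, dtype=np.float64) - cy
    if np.any(offset <= 0):
        raise ValueError("rows at or above the horizon never meet the ground plane")
    return fy * cam_height / offset

# Assumed KITTI-like intrinsics and a typical mounting height (placeholders).
fy, cy, cam_height = 721.5, 172.8, 1.65
print(ground_depth_from_row([300.0, 250.0, 200.0], fy, cy, cam_height))
# roughly [9.4, 15.4, 43.8] metres: lower image rows map to nearer ground points

Coupling such a geometric estimate with learned appearance cues is one common way to anchor monocular depth predictions.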
The second segment is devoted to data-driven strategies and their real-world application in vision-based 3D detection. A novel training regimen is introduced that combines datasets annotated with either 2D or 3D labels. This approach not only strengthens detection models by drawing on a substantially larger pool of training data, but also enables economical deployment in real-world scenarios where only 2D annotations are readily available.
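A minimal sketch of how such mixed supervision can be expressed as a per-sample loss, assuming a hypothetical batch layout in which every sample has a 2D box and only some have LiDAR-based 3D boxes (function and tensor names are illustrative, not the thesis's implementation):

import torch
import torch.nn.functional as F

def mixed_supervision_loss(pred_boxes_3d, pred_boxes_2d_proj,
                           gt_boxes_3d, gt_boxes_2d, has_3d_label):
    """Combine 3D-labelled and 2D-only samples in one batch.

    pred_boxes_3d:      (N, 7) predicted boxes (x, y, z, w, h, l, yaw)
    pred_boxes_2d_proj: (N, 4) predicted 3D boxes projected to the image (x1, y1, x2, y2)
    gt_boxes_3d:        (N, 7) 3D labels, ignored where has_3d_label is False
    gt_boxes_2d:        (N, 4) 2D labels, available for every sample
    has_3d_label:       (N,)   bool mask marking samples with 3D annotations
    """
    loss_3d = F.smooth_l1_loss(pred_boxes_3d, gt_boxes_3d, reduction="none").sum(dim=1)
    loss_2d = F.smooth_l1_loss(pred_boxes_2d_proj, gt_boxes_2d, reduction="none").sum(dim=1)
    # Full 3D regression where 3D labels exist; elsewhere, supervise the
    # projected box against the 2D label.
    per_sample = torch.where(has_3d_label, loss_3d, loss_2d)
    return per_sample.mean()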
Lastly, the dissertation presents an innovative pipeline tailored for unsupervised depth
estimation in autonomous driving contexts. Extensive empirical analyses affirm the robustness
and efficacy of the proposed pipeline. Collectively, these contributions lay a solid foundation for the widespread adoption of vision-based 3D perception technologies
in autonomous driving applications.
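Self-supervised depth pipelines of this kind are usually trained by synthesizing the current frame from a neighbouring frame (using the predicted depth and relative camera pose) and penalizing the photometric error. Below is a generic Monodepth-style photometric term for illustration; it may differ from the dissertation's exact objective and masking strategy.

import torch
import torch.nn.functional as F

def photometric_loss(target, reconstructed, alpha=0.85):
    """Photometric error between a frame and its reconstruction warped from
    an adjacent view: alpha * (1 - SSIM) / 2 + (1 - alpha) * L1.
    target, reconstructed: (B, 3, H, W) images with values in [0, 1].
    """
    l1 = (target - reconstructed).abs().mean(dim=1, keepdim=True)

    # Simplified local SSIM using 3x3 average pooling.
    mu_x = F.avg_pool2d(target, 3, 1, 1)
    mu_y = F.avg_pool2d(reconstructed, 3, 1, 1)
    sigma_x = F.avg_pool2d(target ** 2, 3, 1, 1) - mu_x ** 2
    sigma_y = F.avg_pool2d(reconstructed ** 2, 3, 1, 1) - mu_y ** 2
    sigma_xy = F.avg_pool2d(target * reconstructed, 3, 1, 1) - mu_x * mu_y
    c1, c2 = 0.01 ** 2, 0.03 ** 2
    ssim = ((2 * mu_x * mu_y + c1) * (2 * sigma_xy + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (sigma_x + sigma_y + c2))
    dssim = ((1 - ssim) / 2).clamp(0, 1).mean(dim=1, keepdim=True)

    return (alpha * dssim + (1 - alpha) * l1).mean()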