THESIS
2012
xii, 100 p. : ill. ; 30 cm
Abstract
Urban scene parsing, segmenting objects of interest in urban scenes and identifying their categories, is a fundamental problem in urban scene understanding. As a representative constrained scene parsing task, it is closely related to many important applications that have received great attention recently, such as 3D city modeling and autonomous vehicle navigation. In this thesis, we investigate methodology for the urban scene parsing task with images and scan data, as well as parameter learning for the random field models that are widely used to formulate various scene parsing tasks.
For urban image parsing, we propose a nonparametric scene parsing method that exploits partial similarity between images, and a parametric scene parsing method, the supervised label transfer method. The partial-similarity-based nonparametric method involves no training process and reduces the inference problem in scene parsing to a matching problem. By contrast, the supervised label transfer method transforms the inference problem into a supervised matching problem while inheriting the advantages of nonparametric scene parsing methods.
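For intuition only, nonparametric label transfer can be sketched as matching a query feature against features from a retrieval set and voting over their labels; the feature dimensions, label names, and majority-vote rule below are illustrative assumptions, not the thesis's partial-similarity formulation.

    import numpy as np

    def transfer_labels(query_feat, retrieval_feats, retrieval_labels, k=3):
        """Nonparametric label transfer sketch: match a query feature
        against retrieval-set features and vote over their labels.
        (Illustrative only; the thesis matches on partial similarity
        between image regions rather than single feature vectors.)"""
        # Euclidean distances to every retrieval feature
        dists = np.linalg.norm(retrieval_feats - query_feat, axis=1)
        nearest = np.argsort(dists)[:k]        # k best matches
        votes = retrieval_labels[nearest]      # their category labels
        # Majority vote decides the transferred label
        return np.bincount(votes).argmax()

    # Toy usage: 5 retrieval features in R^4 with labels {0: road, 1: building}
    feats = np.random.rand(5, 4)
    labels = np.array([0, 1, 0, 0, 1])
    print(transfer_labels(np.random.rand(4), feats, labels))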
With both images and scan data, we propose a novel joint image and scan data scene parsing system applicable to large-scale urban scenes. The proposed system automatically obtains the necessary training data from the input data, which in previous work was usually obtained through manual labeling. An associative hierarchical CRF trained with the automatically obtained training data is then adopted to jointly segment images and scan point clouds by integrating both 3D geometric information and 2D image appearance information. The proposed methods are evaluated and compared with state-of-the-art methods on several public datasets and on real Google Street View data, achieving encouraging performance.
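As a rough illustration of the kind of model involved, a CRF for such parsing assigns each node (pixel, segment, or 3D point) a label and scores a labeling by unary terms plus associative (Potts-style) smoothness over edges. The toy energy below is a minimal sketch under that assumption and is not the thesis's actual hierarchical model.

    import numpy as np

    def crf_energy(labels, unary, edges, w_pair=1.0):
        """Energy of a pairwise CRF with Potts (associative) smoothness:
        E(x) = sum_i unary[i, x_i] + w_pair * sum_(i,j) [x_i != x_j].
        Illustrative only; the thesis uses an associative hierarchical
        CRF over image pixels, segments, and 3D scan points."""
        e = sum(unary[i, labels[i]] for i in range(len(labels)))
        e += w_pair * sum(labels[i] != labels[j] for i, j in edges)
        return e

    # Toy graph: 3 nodes, 2 labels, a chain of edges
    unary = np.array([[0.2, 1.0], [0.9, 0.1], [0.8, 0.3]])
    print(crf_energy([0, 1, 1], unary, edges=[(0, 1), (1, 2)]))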
Last but not least, we propose an adaptive discriminative learning algorithm to learn, from given training data, the parameters of the random field models that are widely used to formulate various scene parsing tasks. The parameters are iteratively updated with an adaptive step size by solving a structured prediction problem, sharing a similar update form with the projected subgradient method. The proposed adaptive discriminative learning algorithm achieves performance comparable to the classical StructSVM method and the projected subgradient method, with significantly improved learning efficiency.
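The update form alluded to here resembles a projected subgradient step for the structured hinge loss: solve a loss-augmented prediction, take a subgradient step on the margin violation, and project the parameters back onto the feasible set. The sketch below shows this generic form with a fixed step size supplied by the caller; the adaptive step-size rule is the thesis's contribution and is not reproduced here, and the feature vectors are hypothetical.

    import numpy as np

    def projected_subgradient_step(w, feat_true, feat_pred, step, radius=10.0):
        """One structured-SVM-style update: the subgradient of the hinge
        loss is (features of the loss-augmented prediction) minus
        (features of the ground truth); afterwards, project w onto an
        L2 ball (the feasible set). Generic sketch; the thesis replaces
        the fixed step schedule with an adaptively chosen step size."""
        g = feat_pred - feat_true            # subgradient of the violated margin
        w = w - step * g                     # descent step
        norm = np.linalg.norm(w)
        if norm > radius:                    # projection onto ||w|| <= radius
            w *= radius / norm
        return w

    # Toy usage: feature vectors from a (hypothetical) loss-augmented decoder
    w = np.zeros(4)
    w = projected_subgradient_step(w, feat_true=np.array([1., 0, 0, 1]),
                                   feat_pred=np.array([0., 1, 1, 0]), step=0.5)
    print(w)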