THESIS
2012
xv, 92 p. : ill. ; 30 cm
Abstract
Image segmentation refers to the process of grouping pixels into spatially continuous regions based on certain similarity measure. One could find its considerable applications in computer vision since it plays a key role in bridging low level information to high level semantic information. Segmentation remains one of the fundamental computer vision problems. Depending on whether there is human interaction, image segmentation can be divided into interactive and non-interactive ones. Based on the learning mode, image segmentation can be classified into supervised and unsupervised ones....[
Read more ]
Image segmentation refers to the process of grouping pixels into spatially continuous regions based on certain similarity measure. One could find its considerable applications in computer vision since it plays a key role in bridging low level information to high level semantic information. Segmentation remains one of the fundamental computer vision problems. Depending on whether there is human interaction, image segmentation can be divided into interactive and non-interactive ones. Based on the learning mode, image segmentation can be classified into supervised and unsupervised ones.
In this thesis, we address the problem of non-interactive, unsupervised image segmentation, aiming at grouping perceptually similar pixels or superpixels into regions. It is expected that segmentations can serve as intermediate input for many high level scene understanding tasks, and that better segmentation results potentially lead to more accurate interpretation of objects and scenes. However, the problem remains challenging in the sense that it is very difficult to model and design methods with segmentation performance that matches human level accuracy without any human interaction or strong prior - constraints that are often rare or even unavailable under real circumstances. In addition, finding a good image partitioning often results in the searching in a very complex or high dimensional solution space. So computational complexity becomes another key issue considering the real implementation of image segmentation algorithms.
Regarding these challenges, this research proposes novel models that improve the state of the art unsupervised segmentation performance within the scope of acceptable computational cost, where segmentation performance is subjectively evaluated. The purpose is to generate image partitioning results that subjectively suffer less from both oversegmentation and overmerging, both of which can seriously degrade the performance of subsequent applications. Our research effort mainly focuses on graph theory and mode seeking, on top of which better boundary estimation and clustering models are proposed for image segmentation. We believe graphs often carry important structural information. And we want to utilize this information to guide data clustering and image segmentation.
The segmentation accuracy depends largely on both the clustering performance and the designed feature. For clustering, we propose a graph-based contour finding method and minimum spanning tree (MST) embedded mode seeking. An MST is a connected graph that preserves intrinsic compact structures of a data set. It is able to find manifold-like structures or elongated clusters since it shares common features with k-nearest neighbor (k-NN) graph - a graph that has been widely used in finding manifold structures. Thus the MST embedded mode seeking shows larger flexibility and tolerance on cluster shapes. On the image domain, MST serves as a good spatial smoothness constraint in image domain. The introduction of MST significantly helps to boost the performance of image segmentation and can be extended to 3D point cloud object segmentation. For segmentation feature, we propose ”Bag of Textons” - a histogram feature descriptor that conveys textual information. We further propose a novel mode seeking framework - called convex shift - to perform constrained mode seeking in histogram space. We show that each kernel shift step can be formulated as a constrained convex optimization problem. The proposed framework produces results that match or outperform the state of the art methods and is compatible with graph-embedding. It is expected that in combination with the embedding of an MST, a further boost in the segmentation performance can be achieved.
Post a Comment