THESIS
1995
xix, 117 leaves : ill. ; 30 cm
Abstract
Object recognition is one of the main goals in computer vision. It is useful in tasks such as autonomous navigation, remote sensing and robotics. In order to recognize objects in a scene, a vision system needs to model the objects which may appear in the image. In other words, recognizing objects and modelling them are two inseparable issues.
A novel object recognition approach for complex scenes is proposed and has been tested experimentally. A new object representation which is junction-oriented and based on multiple views is also proposed. The internal object view representation is composed of junction type, region type and view type, while the external object view is represented by a bounding box.
The importance of the adequacy of a 3-D representation model is addressed. This is rarely studied, but it is essential if the model is to be used for object recognition. Adequacy is necessary because it ensures that a representation has a single interpretation, namely the correct object view. If the interpretation is not unique, the same representation may be interpreted as two or more different object views, and this may lead to incorrect recognition results.
Locating isolated objects in a scene can improve the efficiency of recognizing them. Therefore, in the recognition process, a bounding box of a model is used initially to detect and locate the view of an object model in the scene. A bounding box is a sequence of junction types on the convex hull of an object and is represented as a string in the representation. Approximate string matching is performed between the bounding boxes of models and those in a scene. Four metrics are proposed to measure the similarity between bounding boxes: the length of the longest common substring (LLCG), a dissimilarity metric of LCG (D_LCG), the edit distance (δ) and the length of the longest common subsequence (LLCS).
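As an illustration only, three of the string metrics named above can be sketched with textbook dynamic programming. This is not the thesis's implementation: the function names, the single-letter junction alphabet in the usage example and the unit edit costs are all assumptions made here, the strings are treated as linear (any cyclic ordering along the convex hull is ignored), and D_LCG is left out because the abstract does not define it.

```python
# Hypothetical sketch of three bounding-box similarity measures.
# A bounding box is assumed to be a plain string, one character per junction type.


def longest_common_substring_length(a, b):
    """Length of the longest contiguous run of symbols shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at the previous symbols.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def longest_common_subsequence_length(a, b):
    """Length of the longest (not necessarily contiguous) common subsequence."""
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
            else:
                curr[j] = max(prev[j], curr[j - 1])
        prev = curr
    return prev[len(b)]


def edit_distance(a, b):
    """Levenshtein distance with unit insert, delete and substitute costs (assumed)."""
    prev = list(range(len(b) + 1))
    for i in range(1, len(a) + 1):
        curr = [i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # delete a[i-1]
                          curr[j - 1] + 1,     # insert b[j-1]
                          prev[j - 1] + cost)  # match or substitute
        prev = curr
    return prev[len(b)]


# Usage with made-up junction strings (L, Y, W, T are illustrative labels only).
model_box = "LYWLYL"
scene_box = "LYTLYL"
print(longest_common_substring_length(model_box, scene_box))
print(longest_common_subsequence_length(model_box, scene_box))
print(edit_distance(model_box, scene_box))
```

In this made-up example a single mislabelled junction (W read as T) shortens the longest common substring far more than it changes the subsequence length or the edit distance, which suggests one reason several metrics might be used side by side.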
After locating isolated objects, the recognition process matches the internal structure of a model around that location. The matching is done by reconstructing selected object models from information in the scene in a region-wise fashion. If a model is successfully reconstructed from the scene, the matching is considered a success and the object in the scene is recognized. Incorrect initial correspondences obtained from approximate string matching can be corrected using information stored in the model. Different junction types are used to correct the incorrect correspondences in different situations.
Our proposed recognition algorithm can recognize and locate objects in a scene even in the presence of occlusion, noise, shadows and imperfect feature detection. It will, however, fail if the initial model-scene correspondence process produces too many mistakes and these mistakes cannot be rectified through the reconstruction process.