THESIS
2006
xi, 66 leaves : ill. ; 30 cm
Abstract
Content-Based Image Retrieval (CBIR) is an important research area with significant implications for image retrieval. The basic idea of CBIR is to search for and retrieve images by presenting one or more query images as examples, instead of performing a text-based search over image metadata. One important implication is that CBIR does not require the tedious, manual process of annotating the images in the database with metadata. It also allows the user to pose queries without knowing the textual descriptions of the images. Moreover, many useful image features, such as the texture of wood, can hardly be described satisfactorily by textual metadata. CBIR instead typically represents the contents of images as feature vectors. Unfortunately, such feature vectors sometimes cannot correctly and uniquely capture all the possible interpretations of an image from a human's perspective. This shows that low-level feature vector representations and the high-level semantics of images are only weakly linked, which is often referred to as the semantic gap problem.
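As a rough illustration of query-by-example retrieval over feature vectors, the following minimal Python sketch ranks database images by Euclidean distance to a query vector. The image names, the three-dimensional feature values, and the helper names `euclidean` and `retrieve` are hypothetical and chosen only for illustration; the thesis abstract does not specify which features or distance function are used.

```python
import math

def euclidean(a, b):
    """Plain Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve(query, database, k=3):
    """Return the names of the k database images closest to the query vector."""
    ranked = sorted(database.items(), key=lambda item: euclidean(query, item[1]))
    return [name for name, _ in ranked[:k]]

# Hypothetical feature vectors (e.g. coarse color/texture statistics).
database = {
    "beach_1": [0.9, 0.8, 0.1],
    "beach_2": [0.8, 0.9, 0.2],
    "forest_1": [0.1, 0.7, 0.2],
    "wood_texture": [0.5, 0.3, 0.9],
}

# The query image is presented as an example and reduced to the same features.
query = [0.9, 0.8, 0.15]
top = retrieve(query, database, k=2)
print(top)
```

The ranking depends entirely on the low-level features, so two images with different meanings but similar vectors would be retrieved together, which is one way the semantic gap shows up in practice.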
Recent attempts to address the semantic gap problem can roughly be categorized into two main approaches: metric learning and support vector machine (SVM) classification. Metric learning methods aim to learn a better metric so that the corresponding dissimilarity measure reflects the underlying semantics more accurately. The SVM classification approach, on the other hand, is primarily based on a relevance feedback (RF) process within each query session. We observe that these two approaches are complementary, and hence we propose in this thesis a CBIR framework that integrates them to get the best of both worlds. In many real-world applications, such as photo-taking, we notice that images that are close together in the image space are likely to have been taken at the same location and to contain similar objects. Based on this observation, we relax the restrictions of metrics and develop an algorithm called Distance Function Learning with Neighborhood Generalization (DFLNG). By integrating DFLNG with SVM-based active learning, we observe that image retrieval effectiveness can be significantly increased.
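To make the idea of adapting a distance function from relevance feedback more concrete, here is a small Python sketch. It is not DFLNG and it does not use an SVM, since the abstract gives no details of either; as a stand-in it uses a simple variance-based feature re-weighting heuristic, in which features that vary little across the images a user marks as relevant receive larger weights. All names and data values are hypothetical.

```python
import math

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance; larger weights make a feature count more."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

def learn_weights(relevant, eps=1e-6):
    """Weight each feature by the inverse of its variance over relevant images.

    Stand-in heuristic only: features on which the relevant images agree are
    assumed to carry the semantics the user cares about.
    """
    dims = len(relevant[0])
    weights = []
    for d in range(dims):
        values = [v[d] for v in relevant]
        mean = sum(values) / len(values)
        var = sum((x - mean) ** 2 for x in values) / len(values)
        weights.append(1.0 / (var + eps))
    total = sum(weights)
    return [w / total for w in weights]  # normalize so the weights sum to 1

def rank(query, database, weights):
    """Return image names ordered from nearest to farthest under the weights."""
    return sorted(database,
                  key=lambda name: weighted_distance(query, database[name], weights))

# Hypothetical data: feature 0 matches what the user wants, feature 1 is noise.
database = {"a": [0.1, 0.9], "b": [0.12, 0.1], "c": [0.4, 0.5]}
query = [0.1, 0.5]

before = rank(query, database, [0.5, 0.5])           # uniform weights
weights = learn_weights([database["a"], database["b"]])  # user marks a, b relevant
after = rank(query, database, weights)
print(before, after)
```

In this toy setting the irrelevant image `c` is ranked first under uniform weights and last after one round of feedback, which illustrates how a learned dissimilarity measure can track user semantics more closely than a fixed one.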