THESIS
2013
xiii, 83 pages : illustrations ; 30 cm
Abstract
With the pervasiveness of multimedia content on the web, content-based image
retrieval (CBIR) has attracted growing interest in recent years because of the unsatisfactory
performance of conventional concept-based image retrieval techniques that rely
on text-based annotations. Two promising CBIR approaches are based on bag-of-words
(BoW) models and topic models (such as the latent Dirichlet allocation, or LDA, model),
inspired by their success in text-based information retrieval applications. However,
BoW models do not consider the spatial relationships and latent semantic relationships
between words. Even though topic models take into consideration the semantic information
in documents, they still ignore the spatial information. Recent years have seen
the emergence of new methods which attempt to remedy the shortcomings of these two
approaches. This thesis starts with a review of recent CBIR approaches that incorporate
spatial and semantic information. Against this background, we propose two methods to
combine the LDA model with spatial information. The first method, referred to as
BoP+LDA, combines an extension of the BoW representation called bag-of-phrases (BoP)
with the LDA model. The second method, called ssLDA, incorporates image segmentation
into the LDA model and re-ranks the retrieved images by exploiting topic spatial
consistency. We empirically compare our proposed methods with several baseline methods
on real-world image data. From the experimental results, we conclude that incorporating
both spatial and semantic information is effective in improving image retrieval
performance.