THESIS
2006
xi, 77 leaves : ill. ; 30 cm
Abstract
The World Wide Web (the Web) is serving an increasingly large and diversified user community. The diversity of user interests makes it difficult for a general Web search engine to meet the needs of an individual user. This thesis addresses the problem of Web search engine personalization. The main objectives of studying personalization are to understand a user's preferences and to return search results that satisfy those preferences. We present a new approach that mines users' preferences on the search results from clickthrough data and adapts the search engine's ranking function to improve search quality.
Existing preference mining algorithms are typically based on strong assumptions about how users scan the search results. Thus, the preferences derived are often incorrect. In this thesis, we develop a new preference mining technique called SpyNB, which is based on a more reasonable assumption: the search results clicked by a user reflect the user's preference, but no conclusion is drawn about the results the user did not click. As such, SpyNB remains valid even if the user does not follow any order in reading the search results or has not clicked on all relevant results.
To infer negative examples, we develop a spying process that first treats the result items clicked by the user as sure positive examples and those not clicked as unlabelled data. We then plant the sure positive examples (the spies) into the unlabelled set of result items and apply Naïve Bayes classification to generate reliable negative examples (hence the name "SpyNB"). These positive and negative examples allow us to discover highly accurate user preferences. Finally, we employ a ranking SVM to build a metasearch engine optimizer. The optimizer gradually adapts our metasearch engine according to the user's preferences.
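The spying step described above can be illustrated with a minimal Python sketch. It is a simplified illustration under stated assumptions, not the thesis's implementation: the abstract does not specify how spies are chosen, which features are used, or how the threshold is set. Here each result is assumed to be a list of word tokens, the first clicked result is assumed to serve as the spy, and an unclicked result is assumed to count as a reliable negative when its posterior falls below the lowest spy posterior. The names `train_nb`, `posterior_pos`, and `spy_negatives` are hypothetical.

```python
import math
from collections import Counter


def train_nb(pos_docs, neg_docs):
    """Fit a two-class multinomial Naive Bayes model with Laplace smoothing."""
    vocab = {w for d in pos_docs + neg_docs for w in d}
    total = len(pos_docs) + len(neg_docs)
    model = {}
    for label, docs in (("pos", pos_docs), ("neg", neg_docs)):
        counts = Counter(w for d in docs for w in d)
        denom = sum(counts.values()) + len(vocab)
        model[label] = {
            "log_prior": math.log(len(docs) / total),
            "log_lik": {w: math.log((counts[w] + 1) / denom) for w in vocab},
        }
    return model


def posterior_pos(model, doc):
    """Return P(pos | doc); words unseen during training are ignored."""
    scores = {}
    for label, params in model.items():
        scores[label] = params["log_prior"] + sum(
            params["log_lik"][w] for w in doc if w in params["log_lik"]
        )
    top = max(scores.values())
    weights = {k: math.exp(v - top) for k, v in scores.items()}
    return weights["pos"] / (weights["pos"] + weights["neg"])


def spy_negatives(clicked, unclicked, n_spies=1):
    """Return unclicked results that score below every spy.

    Assumption: the first n_spies clicked results act as spies; the
    remaining clicked results are the positive training examples.
    """
    spies, kept = clicked[:n_spies], clicked[n_spies:]
    if not spies or not kept:
        raise ValueError("need at least one spy and one remaining positive")
    # Spies are mixed into the unlabelled set, which is trained as "negative".
    model = train_nb(kept, unclicked + spies)
    # Assumed threshold rule: the lowest posterior among the spies.
    threshold = min(posterior_pos(model, s) for s in spies)
    return [d for d in unclicked if posterior_pos(model, d) < threshold]


# Toy data (invented for illustration): a user interested in Apple products.
clicked = [
    ["apple", "iphone", "review"],
    ["apple", "ipad", "price"],
    ["apple", "macbook", "review"],
]
unclicked = [
    ["apple", "fruit", "recipe"],
    ["apple", "pie", "recipe"],
    ["apple", "macbook", "price"],
]
negatives = spy_negatives(clicked, unclicked)
```

In this toy run the two recipe results fall below the spy's posterior and are returned as reliable negatives, while the unclicked MacBook result is left unlabelled, which matches the idea that an unclicked result is not automatically a disliked one.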
To verify the effectiveness of SpyNB for preference mining, we conducted both offline and online experiments. Our extensive offline experiments demonstrated that SpyNB discovers much more accurate preferences than the existing algorithms. Moreover, the adaptive ranking function derived from SpyNB improved retrieval quality by 20% compared to the case without learning. The interactive online experiments further confirmed that SpyNB and our personalization approach are effective in practice. We also showed that the efficiency of SpyNB is comparable to that of the existing simple preference mining algorithms.