THESIS
2016
Abstract
In this thesis, we propose Geo-Social Keyword (GSK) search, which enables the retrieval
of users, points of interest (POIs), or keywords that satisfy geographic, social, and/or textual
criteria. We first introduce a general GSK framework that covers a wide range of real-world
tasks, including advertisement, context-based search, and market analysis. Then, we
present three concrete GSK queries: (i) NPRU that returns the top-k users based on their
spatial proximity to a given query location, their popularity, and their similarity to an input
set of terms; (ii) NSTP that outputs the top-k POIs based on their proximity to a user v, the
number of check-ins by friends of v, and their similarity to a set of terms; (iii) FSKR that
discovers the top-k keywords based on their frequency in...[
Read more ]
In this thesis, we propose Geo-Social Keyword (GSK) search, which enables the retrieval
of users, points of interest (POIs), or keywords that satisfy geographic, social, and/or textual
criteria. We first introduce a general GSK framework that covers a wide range of real-world
tasks, including advertisement, context-based search, and market analysis. Then, we
present three concrete GSK queries: (i) NPRU that returns the top-k users based on their
spatial proximity to a given query location, their popularity, and their similarity to an input
set of terms; (ii) NSTP that outputs the top-k POIs based on their proximity to a user v, the
number of check-ins by friends of v, and their similarity to a set of terms; (iii) FSKR that
discovers the top-k keywords based on their frequency in pairs of friends located within a
spatial area. For each query, we develop a processing algorithm that utilizes a novel hybrid
index.
We further extend this framework to incorporate diversification in GSK search. Top-k
queries can give results that are very similar to each other, hence compromising their effectiveness
in various applications. For instance, returned users of an NPRU query may have
many common friends, in which case advertising budget would be wasted in repeatedly
promoting to the same set of users. In view of this limitation, we investigate diversification
and extend GSK to Diverse Geo-Social Keyword search (DGSK) by introducing a trade-off
between query-similarity and result-diversity. We show that this problem is NP-Complete
in our setting. We propose two greedy algorithms and provide theoretical analysis of their
approximation bounds. Finally, we evaluate our framework with thorough experiments
using real datasets.
Post a Comment