THESIS
2012
xiii, 154 p. : ill. ; 30 cm
Abstract
With the rapid growth of web-based applications, mining personalized preferences
for promotion becomes a hot topic. In this thesis, we focus on two problems related to
user preferences: understanding user preferences and utilizing user preferences.
In understanding user preferences, we propose two sub-problems when we consider
temporal user preferences. The first sub-problem is called a̲ttribute-based s̲ubsequence m̲atching (ASM) : given a query sequence and a set of sequences, considering the attributes
of elements, we want to find all the sequences which are matched by this query
sequence. We propose an efficient algorithm for problem ASM by applying the Chinese
Remainder Theorem. The second sub-problem is to find all the attribute-based
frequent subsequences. We adapt an exis...[
Read more ]
With the rapid growth of web-based applications, mining personalized preferences
for promotion becomes a hot topic. In this thesis, we focus on two problems related to
user preferences: understanding user preferences and utilizing user preferences.
In understanding user preferences, we propose two sub-problems when we consider
temporal user preferences. The first sub-problem is called a̲ttribute-based s̲ubsequence m̲atching (ASM) : given a query sequence and a set of sequences, considering the attributes
of elements, we want to find all the sequences which are matched by this query
sequence. We propose an efficient algorithm for problem ASM by applying the Chinese
Remainder Theorem. The second sub-problem is to find all the attribute-based
frequent subsequences. We adapt an existing efficient algorithm for this second sub-problem
to show that we can use the algorithm developed for the first sub-problem.
Experimental results show that frequent subsequences reflect user preferences, and our
algorithms are scalable in large datasets. This work can stimulate a lot of existing data
mining problems which are fundamentally based on subsequence matching.
In utilizing user preferences, we identify and tackle three sub-problems, finding
top-k profitable products, finding top-k popular products, and finding K-dominating
competitive price. The former two sub-problems are about designing new products, while the latter one is about pricing new products.
In finding top-k profitable products, we consider generalized user preferences, derived
from the skyline concept. We propose a dynamic programming approach which
can find the optimal solution when there are two attributes to be considered. We show
that this problem is NP-hard when there are more than two attributes. Two greedy algorithms
are proposed and one of them has theoretical bounds. We also consider this
problem on dynamic datasets and propose update algorithms for different update operations.
We extend this problem by considering another form of customer preferences,
namely tolerant customer preferences in finding top-k popular products. We prove that
this problem is NP-hard and propose a 0.63-approximate algorithm for this problem.
Extensive experiments show the effectiveness and efficiency of our approaches on both
synthetic and real datasets.
In finding K-dominating competitive price in which we consider generalized user
preferences only, we propose an efficient algorithm. We utilize spatial properties for
pruning to speed up our algorithm. An extensive performance study using both synthetic
and real datasets is reported to verify its effectiveness and efficiency. We also
provide a real case study to show how our algorithm works in reality.
Post a Comment