THESIS
2019
xv, 170 pages : illustrations ; 30 cm
Abstract
Extracting interesting tuples from a large database is an important problem in multi-criteria decision
making. Two representative queries were proposed in the literature: top-k queries and skyline
queries. A top-k query requires users to specify their utility functions beforehand and then returns
k tuples to the users. A skyline query does not require any utility function from users but it puts
no control on the number of tuples returned to users. Recently, a regret minimization query was
proposed and received attention from the community because it does not require any utility function
from users and the output size is controllable, and thus, it avoids those deficiencies of top-k queries
and skyline queries. Specifically, it returns a small set of tuples such that any user’s fav...[
Read more ]
Extracting interesting tuples from a large database is an important problem in multi-criteria decision
making. Two representative queries were proposed in the literature: top-k queries and skyline
queries. A top-k query requires users to specify their utility functions beforehand and then returns
k tuples to the users. A skyline query does not require any utility function from users but it puts
no control on the number of tuples returned to users. Recently, a regret minimization query was
proposed and received attention from the community because it does not require any utility function
from users and the output size is controllable, and thus, it avoids those deficiencies of top-k queries
and skyline queries. Specifically, it returns a small set of tuples such that any user’s favorite tuple
in this subset is guaranteed to be not much worse than his/her favorite tuple in the whole database.
In a sense, this query finds a small set of tuples that makes the user happy (i.e., not regretful) even
if s/he gets the best tuple in the selected set but not the best tuple among all tuples in the database.
In this thesis, we first study two versions of the regret minimization query: (1) the min-error
version: we want to minimize the regret level of each user while guaranteeing the output size is
at most k where k is the maximum output size. This problem is also known as the k-regret query,
and (2) the min-size version: we want to determine the minimum number of tuples needed to keep
users happy at a given level. We term this problem as the α-happiness query where α represents
a happiness threshold. In both versions, we quantify the user’s regret level (and happiness level)
by a criterion, called the regret ratio (and the happiness ratio). A small regret ratio (i.e., a large
happiness ratio) indicates the user is satisfied with the set returned. In addition, we study how to
enhance the regret minimization query with user interactions: when presented with a small number
of tuples (which can be artificial tuples or true tuples inside the database), a user is asked to indicate
the tuple s/he favors the most among them. In particular, we are also interested in the special case
of determining the favorite tuple for a user in the entire database with a small amount of interaction,
measured by the number of questions we ask the user.
Finding the optimal solution for a regret minimization query is an NP-hard problem. We derive
efficient approximate solutions for both versions of regret minimization. We also present a generic
framework for interactive regret minimization, under which we propose algorithms that ask an
asymptotically optimal number of questions in 2-dimensional spaces and algorithms with provable
performance guarantees in d-dimensional spaces (d ≥ 2) where each dimension corresponds to a
description of a tuple. We performed extensive experiments using both real and synthetic datasets.
Our evaluations show that our algorithms outperform the best-known previous approaches in three
ways: (i) they answer the regret minimization queries by guaranteeing a smaller regret ratio and
returning fewer tuples to users, (ii) they do so much faster (up to two orders of magnitude times
improvement), and (iii) they outperform the existing interactive regret minimization algorithms by
locating the favorite tuple and guaranteeing a small regret ratio with much fewer questions.
Post a Comment