THESIS
2013
ix, 97 pages : illustrations ; 30 cm
Abstract
Consider a distributed system with k nodes, where each node holds a part of the data.
The goal is to extract some useful information from the entire data set or to compute
some functions over the data. We are interested in designing communication-efficient
algorithms and also characterizing the communication complexity for various problems.
We consider both a
flat network structure and more complicated tree networks.
In this thesis, we study some most important statistical summaries of the underlying
data, in particular item frequencies, heavy hitters, quantiles, and ε-approximations, which
are extensively studied in database, machine learning, computational geometry and data
mining. We provide general techniques for both designing efficient algorithms and proving
communica...[
Read more ]
Consider a distributed system with k nodes, where each node holds a part of the data.
The goal is to extract some useful information from the entire data set or to compute
some functions over the data. We are interested in designing communication-efficient
algorithms and also characterizing the communication complexity for various problems.
We consider both a
flat network structure and more complicated tree networks.
In this thesis, we study some most important statistical summaries of the underlying
data, in particular item frequencies, heavy hitters, quantiles, and ε-approximations, which
are extensively studied in database, machine learning, computational geometry and data
mining. We provide general techniques for both designing efficient algorithms and proving
communication lower bounds, with which we get almost tight bounds for these problems.
We also study graph problems in the distributed setting, where the edges of the
input graph is partitioned across k nodes. We show how to compute an approximate
maximum matching, one of most important graph problems, communication-efficiently
and prove a tight lower bound for this problem. To prove this lower bound, we develop
new techniques, which we believe will have a wide applicability to prove distributed
communication complexity for other graph problems.
Post a Comment