Sublinear algorithms for massive data

HKUST Electronic Theses

by Di Chen

THESIS 2017

Ph.D. Computer Science and Engineering

x, 107 pages : illustrations ; 30 cm

Abstract

Sublinear algorithms address the rapid growth in data volume with a simple yet powerful premise: useful tasks can be performed with fewer resources than are required simply to store the data.

This thesis studies randomized algorithms for massive data. We either devise new algorithms or improve the analysis of existing ones, obtaining meaningful theoretical guarantees with sublinear space or communication.

First, we present an algorithm that uses sublinear communication to perform set reconciliation under a ‘noisy data’ model, in which two data points are considered ‘the same’ when the distance between them is small; this models the tiny perturbations that noise introduces into data.

The second is a 1-pass streaming algorithm for estimating the number of distinct...

Copyrighted to the author. Reproduction is prohibited without the author’s prior written consent.

Details

Collection: HKUST Electronic Theses
Degree: Ph.D.
Department: Computer Science and Engineering
Supervisors: Golin, Mordecai J
Authors: Chen, Di
Subjects: Big data Management Mathematical models Analysis
Language: English
Call number: Thesis CSED 2017 Chen
DOI: 10.14711/thesis-b1778917
