Techniques and applications of random sampling on massive data

HKUST Electronic Theses

by Lu Wang

THESIS 2015

Ph.D. Computer Science and Engineering

x, 121 pages : illustrations ; 30 cm

Abstract

In the era of big data, we often need to process and analyze data sets that are larger and faster-growing than ever before. Random sampling has thus received much attention as an effective tool for turning big data “small”. It allows us to significantly reduce the size of the input while preserving the features of the original data set that we need. It also makes it easy to trade off computational complexity against the accuracy of the result by adjusting the sample size.

Although random sampling is a classical problem with a long history, it has recently attracted renewed attention, motivated by new applications as well as new constraints in the big data era. This thesis presents several new techniques and applications of random sampling: (1) a new randomized streaming algorithm for findin...

Copyrighted to the author. Reproduction is prohibited without the author’s prior written consent.

Details

Collection: HKUST Electronic Theses
Degree: Ph.D.
Department: Computer Science and Engineering
Supervisors: Yi, Ke
Authors: Wang, Lu
Subjects: Big data; Sampling (Statistics)
Language: English
Call number: Thesis CSED 2015 WangL
DOI: 10.14711/thesis-b1514598
