Output-optimal massively parallel streaming joins

HKUST Electronic Theses

by Serafeim Papadias

THESIS 2018

M.Phil. Computer Science and Engineering

ix, 66 pages : illustrations ; 30 cm

Abstract

The advent of big data has given rise to huge, rapid, and volatile data streams, pushing the research community to design both real-time Distributed Stream Processing Systems (DSPSs) and streaming algorithms that run on top of those systems. DSPSs must exhibit a variety of features, such as high-throughput and low-latency processing of data streams. In the first part of this thesis, we present the state-of-the-art DSPSs and describe the features that make them unique. In the second part, we focus on the problem of join processing in the streaming context. Specifically, we present the first output-optimal join algorithm for stream join processing, called Streaming Randomized HyperCube (SRHC). The algorithm operates optimally in the presence of high skew, considering both...

Copyrighted to the author. Reproduction is prohibited without the author’s prior written consent.

Details

Collection: HKUST Electronic Theses
Degree: M.Phil.
Department: Computer Science and Engineering
Authors: Papadias, Serafeim
Subjects: Big data; Data processing; Streaming technology (Telecommunications)
Language: English
Call number: Thesis CSED 2018 Papadi
DOI: 10.14711/thesis-991012659569103412
