THESIS
2009
xi, 97 p. : ill. ; 30 cm
Abstract
The join operation combines information from multiple data sources. Efficient processing of join queries is a pivotal issue in most database systems. My PhD research focuses on joins in two categories of novel applications. The first is continuous joins in data streams. Specifically, we exploit two key properties of the streaming join. First, the initial plan of a long-running query may gradually become inefficient due to changes in data characteristics. This necessitates dynamic plan migration, an online transition from the old plan to a more efficient one generated from current statistics. The only known solutions, MS and PT, have serious shortcomings. Hence, we propose HybMig, which combines their merits and outperforms them in every aspect.
Another important property is that an output tuple from an upstream join (called the producer) may never generate any result in downstream operators (the consumers) during its entire lifespan. Motivated by this, we propose just-in-time (JIT) processing, a novel methodology that enables a producer to selectively generate outputs based on feedback returned from consumers that express their current demand. Extensive experiments show that JIT achieves significant savings in terms of both CPU time and memory consumption.
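The producer/consumer feedback idea can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the class names `Producer` and `Consumer`, the use of a single shared join key, and the form of the feedback (the set of keys currently buffered on the consumer side) are all assumptions made for illustration. The sketch only counts suppressed results; how a real system recovers them if demand appears later is not covered here.

```python
from collections import defaultdict


class Consumer:
    """Hypothetical downstream join: matches producer outputs against stream C."""

    def __init__(self):
        self.c_window = defaultdict(list)  # key -> buffered C tuples

    def insert_c(self, key, payload):
        self.c_window[key].append(payload)

    def demand(self):
        # Feedback to the producer: keys that could currently yield a result.
        return set(self.c_window)

    def consume(self, key, ab_pair):
        # Join one producer output with every buffered C tuple of the same key.
        return [(ab_pair, c) for c in self.c_window.get(key, [])]


class Producer:
    """Hypothetical upstream join of streams A and B on a shared key."""

    def __init__(self, consumer):
        self.consumer = consumer
        self.a_window = defaultdict(list)
        self.b_window = defaultdict(list)
        self.suppressed = 0  # outputs skipped because no consumer wanted them

    def insert_a(self, key, payload):
        self.a_window[key].append(payload)
        return self._probe(key, payload, self.b_window, a_side=True)

    def insert_b(self, key, payload):
        self.b_window[key].append(payload)
        return self._probe(key, payload, self.a_window, a_side=False)

    def _probe(self, key, payload, other, a_side):
        demanded = self.consumer.demand()
        results = []
        for match in other.get(key, []):
            if key not in demanded:
                # Assumption: an undemanded output is simply not generated.
                self.suppressed += 1
                continue
            pair = (payload, match) if a_side else (match, payload)
            results.extend(self.consumer.consume(key, pair))
        return results


consumer = Consumer()
consumer.insert_c(1, "c1")
producer = Producer(consumer)
producer.insert_a(1, "a1")
producer.insert_a(2, "a2")
demanded_result = producer.insert_b(1, "b1")    # key 1 is in demand
undemanded_result = producer.insert_b(2, "b2")  # key 2 is not
```

In this toy run the A-B pair for key 1 is generated and joined with the buffered C tuple, while the pair for key 2 is never materialised, which is the kind of CPU and memory saving the paragraph above describes.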
The second class of joins studied in this thesis is authenticated joins in outsourced databases. In particular, database outsourcing requires that the query server construct a proof of result correctness, which the client can verify using the data owner’s signature. Addressing such queries, we propose a comprehensive set of new solutions that cover the entire spectrum of index availability. Furthermore, we extend them to authenticate complex queries involving multi-way joins and other relational operators. Our experiments demonstrate that the proposed methods outperform two existing benchmark solutions, often by orders of magnitude.