THESIS
2018
Abstract
Over the past few years the amount of information being processed by data management
systems has grown exponentially, due to various technological advancements. Consequently,
substantial work has focused on constructing novel summarization structures that
make it possible to handle large datasets at the cost of some estimation error. Furthermore,
the rapid spread of GPS-enabled mobile devices and social networking has
recently led to the growth of Geo-Social Networks (GeoSNs). These have enabled novel
location-based social interactions through GeoSN queries, which extract useful information
by combining the social relationships and the current locations of the users. In
this work, we first present numerous summarization structures, focusing on the cases of
Histograms and Sketches. We highlight the most popular such structures and clarify
their applicability in estimating specific query types. In the second part of the thesis,
we introduce the notion of approximately answering GeoSN queries and propose novel
hybrid structures that facilitate estimating the size of their results. Finally, we examine and evaluate
the accuracy and efficiency of our proposed structures by conducting an extensive set of
experiments over real-world datasets.