THESIS
2018
x pages, 1 unnumbered page, 114 pages : illustrations ; 30 cm
Abstract
Spatial data processing is an important field in database and data mining which has a
lot of real-world applications and serves as a crucial building block for other algorithms.
In the past few decades, a lot of research efforts were spent on this field. In a high-level
view, spatial data processing spans a large body of problems and has two dimensions,
namely metrics and problems. In terms of the metrics, the Euclidean space is the most
fundamental one and was intensively and thoroughly studied in the past. With the advance
of the geo-spatial positioning and the computer technologies, there are many new
metrics emerging such as the road networks and the terrain surface, where the techniques
in the traditional Euclidean space may not efficiently apply. As a result, there are man...[
Read more ]
Spatial data processing is an important field in database and data mining which has a
lot of real-world applications and serves as a crucial building block for other algorithms.
In the past few decades, a lot of research efforts were spent on this field. In a high-level
view, spatial data processing spans a large body of problems and has two dimensions,
namely metrics and problems. In terms of the metrics, the Euclidean space is the most
fundamental one and was intensively and thoroughly studied in the past. With the advance
of the geo-spatial positioning and the computer technologies, there are many new
metrics emerging such as the road networks and the terrain surface, where the techniques
in the traditional Euclidean space may not efficiently apply. As a result, there are many
open questions and research problems to be studied in these emerging metrics. In terms
of the problems, the shortest distance computing is considered as the fundamental problem
in each metric. Efficiently tackling the fundamental problem not only helps many
real-world applications, but also paves the way for many more advanced problems.
In this thesis, we push the frontier of the field of spatial data processing in both directions
aforementioned (i.e., metrics and problems). Specifically, we studied a fundamental
problem (i.e., shortest distance queries) on an emerging metric, namely terrain surface and
proposed a new problem namely Nearby-fit Spatial Keyword Queries (NSKQ, in short) in
the traditional Euclidean space. For the shortest distance queries, we proposed an indexing
structure, namely a distance oracle, for efficiently processing the query. By our experimental
results, we accelerated the query processing by up to 2-5 orders of magnitudes
compared with the state-of-the-art. This result renders many applications/algorithms
real-time and sheds light on the further research in this new metric. The NSKQ problem
which lies in the Euclidean space is an important problem which captures a key feature for
the POI needed in some real-world applications. Specifically, it ranks each POI not only
by its own attributes (i.e, location and the keywords contained) but also its nearby POIs.
This problem is challenging and was proved to be NP-hard. We proposed two approximate
algorithms with best tradeoff between the running time and the accuracy compared
with baselines considered. The experiments verified the efficiency and the effectiveness of
our techniques.
Post a Comment