THESIS
2019
xiii, 143 pages : illustrations ; 30 cm
Abstract
With the proliferation of geo-positioning and geo-tagging techniques, spatio-textual data, which possess both a geographical location and a textual description, are becoming increasingly prevalent. This development gives prominence to spatio-textual data analysis, an emerging research field with both real-world and scientific applications. Research on spatio-textual data analysis spans many different areas, such as spatial data mining (i.e., knowledge discovery in large spatial databases) and spatial keyword query processing. In the area of spatial data mining, we want to discover interesting, previously unknown, and potentially useful patterns from large spatial databases. For example, one type of spatial data mining is spatial association mining, which finds the patterns and rules that describe the implication of one feature or a set of features by another set of features in spatial databases. In the area of spatial keyword query processing, we want to process a query and return the relevant objects as results. A typical query takes a location and a set of keywords as arguments and returns the single spatio-textual object that best matches the keywords and is close to the specified location.
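The typical single-object query described above can be sketched as follows. This is a minimal illustration and not the thesis's method: the `SpatioTextualObject` type, the `score` and `top1_query` functions, the linear weighting parameter `alpha`, and the normalizing constant `max_dist` are all assumptions introduced here, and the linear scan stands in for whatever index a real system would use.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SpatioTextualObject:
    """Hypothetical record: a location plus a set of descriptive keywords."""
    name: str
    x: float
    y: float
    keywords: frozenset


def score(obj, qx, qy, q_keywords, alpha, max_dist):
    """Lower is better. Combines normalized distance and keyword mismatch.

    The linear combination weighted by `alpha` is an assumed ranking
    function chosen for illustration only.
    """
    dist = math.hypot(obj.x - qx, obj.y - qy)
    spatial = min(dist / max_dist, 1.0)                  # 0 = at the query point
    matched = len(obj.keywords & q_keywords) / len(q_keywords)
    return alpha * spatial + (1.0 - alpha) * (1.0 - matched)


def top1_query(objects, qx, qy, q_keywords, alpha=0.5, max_dist=1.0):
    """Return the single best-scoring object via a linear scan."""
    q_keywords = frozenset(q_keywords)
    if not q_keywords:
        raise ValueError("query needs at least one keyword")
    return min(objects, key=lambda o: score(o, qx, qy, q_keywords, alpha, max_dist))


objects = [
    SpatioTextualObject("near_cafe", 0.1, 0.1, frozenset({"coffee", "wifi"})),
    SpatioTextualObject("near_bar", 0.05, 0.05, frozenset({"beer"})),
    SpatioTextualObject("far_cafe", 0.9, 0.9, frozenset({"coffee", "wifi", "cake"})),
]
best = top1_query(objects, 0.0, 0.0, {"coffee", "wifi"}, max_dist=math.sqrt(2))
```

Here the nearest object matches none of the keywords and the farthest one matches all of them, so the query has to trade off the two criteria; a different `alpha` shifts that balance.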
In this thesis, we study co-location pattern mining, which is one type of spatial data mining, and the collective spatial keyword query (CoSKQ), which is one type of spatial keyword query. Both problems retrieve their results from a spatio-textual database and adopt the concept of an object set. For the co-location pattern mining problem, we develop a new support measure called Fraction-Score that overcomes the weaknesses of the existing support measures for defining co-location patterns. To solve the problem based on Fraction-Score, we develop efficient algorithms that are significantly faster than a baseline adapted from the state of the art.
For the CoSKQ problem, we consider two directions. First, we design a unified cost function that generalizes the majority of existing cost functions for CoSKQ, and we develop a unified approach that works as well as (and sometimes better than) the best-known approaches based on different cost functions. Second, we propose a new cost function called the maximum dot size cost, which captures both the distances among the objects in a set and a query, as existing cost functions do, and the inherent costs of the objects. We present an exact algorithm and an approximate algorithm with a provable approximation bound for the problem. We conducted extensive experiments on both real and synthetic datasets, which verified all of our proposed approaches and algorithms.