THESIS
2019
xiv, 117 pages : illustrations ; 30 cm
Abstract
The rapid advances in sensing technologies and large-scale computing infrastructures
have led to explosive growth in data. Spatio-temporal (ST) data, as a ubiquitous type of data,
is increasingly collected and extensively studied in various scientific domains such as geology,
climatology, sociology, and transportation science. This type of data is distinct
from others due to the simultaneous presence of spatial and temporal dimensions, which
substantially increases analysis complexity. Purely automatic data analysis techniques
cannot fully handle such complexity. Humans not only have an inherently
good sense of space and time but also possess creativity, flexibility, and
domain expertise. Hence, an appropriate method that incorporates these human traits into
automatic data analysis will be tremendously helpful.
In this thesis, we introduce three novel visual analysis techniques for ST data analysis
to demonstrate the benefits brought by the combination of automatic data analysis
techniques with interactive visualizations. First, we study how to solve a multi-criteria
decision-making problem in the spatio-temporal context that involves a vast solution
search space. We use optimal billboard location selection as our application scenario
and propose SmartAdP. This system integrates a novel visualization-driven data mining
model with tailored data index mechanisms to facilitate efficient solution formulation.
Several well-designed visualizations are also put forward to support optimal solution
identification. Second, we investigate how to detect and examine anomalous events
hidden behind a large number of spatial time series. We use air quality analysis as our
primary application scenario and present AQEyes. The system contains a unified end-to-end tunable machine learning pipeline that supports quick identification of anomalous
air pollution events. A set of novel visualization techniques is presented to facilitate efficient
exploration of air quality dynamics and examination of detected anomalous events.
Third, we research how to quickly identify ST patterns hidden within the subsets of large-scale
multidimensional ST datasets. We propose a novel tensor-based algorithm that
automatically slices data into homogeneous partitions and extracts latent patterns in
each partition for comparison and visual summarization. Based on the algorithm, we further
develop TPFlow, a system supporting a top-down, human-steerable, and progressive
partitioning workflow for level-of-detail multidimensional ST data exploration.
The effectiveness and usefulness of the above techniques are validated through case
studies on real-world datasets and interviews with domain experts. The proposed techniques
are not limited to the presented application scenarios and can be readily
adapted to other applications with similar problems.