THESIS
2022
1 online resource (x, 88 pages) : illustrations (some color)
Abstract
This work presents an air quality prediction model and a corresponding visualization system for Hong Kong air pollution prediction. The air quality prediction model is built with deep learning neural networks, and the visualization system provides a user-friendly interface for both general users and domain experts.
Air pollution is a fast-growing problem in urbanized societies. Poor air quality has a significant impact on human health. As an international metropolis with a dense population, Hong Kong inevitably faces air pollution problems. Air pollution prediction is therefore necessary both for the Hong Kong government to take measures to control air quality and for Hong Kong residents to protect their health. In this study, a spatial-temporal deep learning model based on an attention-assisted encoder-decoder and a 1D CNN architecture is introduced. The model predicts the concentrations of five target pollutants (O3, NO2, SO2, PM2.5 and PM10) for the coming 12 hours on a fine-grained grid mesh of Hong Kong, using the past 24 hours of data from air quality monitoring stations as input.
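The 24-hour input and 12-hour output described above imply a sliding-window arrangement of the hourly station records. The following is a minimal sketch of one way to build such windows; the function name `make_windows`, the stride of one hour, and the placeholder series are assumptions for illustration and are not taken from the thesis.

```python
def make_windows(series, n_in=24, n_out=12):
    """Split an hourly series into (past n_in hours, next n_out hours) pairs.

    The window advances one hour at a time (an assumed stride).
    """
    pairs = []
    for start in range(len(series) - n_in - n_out + 1):
        past = series[start:start + n_in]
        future = series[start + n_in:start + n_in + n_out]
        pairs.append((past, future))
    return pairs


# Placeholder data: 40 consecutive hourly readings for one pollutant at one station.
hourly = list(range(40))
pairs = make_windows(hourly)
```

Each pair holds 24 input values followed by the 12 values the model would be asked to predict.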
The temporal model implemented in this study was built on a deep learning architecture called the encoder-decoder, which is well suited to time-series data. Long Short-Term Memory (LSTM), an improved unit of the Recurrent Neural Network (RNN), is used as the stacked unit in the encoder and decoder to extract temporal relations, and the attention mechanism is used to enhance the encoder-decoder's ability to handle long-term time-series evolution and to increase prediction accuracy.
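The abstract does not say which form of attention is used. As an illustration only, the sketch below shows dot-product attention, one common variant: each encoder hidden state is scored against the current decoder state, the scores are normalized with a softmax, and the weighted sum of encoder states forms a context vector. The function names and the toy two-dimensional states are assumptions, and the LSTM cells themselves are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(encoder_states, decoder_state):
    """Dot-product attention (assumed variant, not confirmed by the thesis).

    Returns the attention weights over encoder time steps and the
    resulting context vector.
    """
    scores = [
        sum(h_d * s_d for h_d, s_d in zip(h, decoder_state))
        for h in encoder_states
    ]
    weights = softmax(scores)
    dim = len(decoder_state)
    context = [
        sum(w * h[d] for w, h in zip(weights, encoder_states))
        for d in range(dim)
    ]
    return weights, context


# Toy example: three encoder time steps with 2-dimensional hidden states.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [1.0, 0.0]
weights, context = attention_context(encoder_states, decoder_state)
```

Time steps whose hidden state aligns with the decoder state receive larger weights, which is how attention lets the decoder draw on distant parts of a long input sequence.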
The spatial model was built as a 1D CNN, which is well suited to image-like data. This deep learning model consists of three network layers: a convolutional layer, a pooling layer, and a fully connected layer. It can effectively extract spatial features to infer fine-grained air quality from sparse data, namely the predictions of the temporal model and spatial information such as points of interest (POI).
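The three-layer structure can be sketched as a single forward pass. The code below is a simplified illustration with one filter and one output unit; the kernel, weights, ReLU activation, and input values are placeholders chosen for the example and do not come from the thesis.

```python
def conv1d(x, kernel, bias=0.0):
    """Valid 1D convolution (cross-correlation form) with a single filter."""
    k = len(kernel)
    return [
        sum(x[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(x) - k + 1)
    ]


def relu(x):
    """Element-wise rectified linear activation (an assumed choice)."""
    return [max(0.0, v) for v in x]


def max_pool1d(x, size=2):
    """Non-overlapping max pooling."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]


def dense(x, weights, bias=0.0):
    """Fully connected layer with a single output unit."""
    return sum(v * w for v, w in zip(x, weights)) + bias


# Placeholder feature vector for one grid point (e.g. nearby predictions and POI counts).
features = [1.0, 3.0, 2.0, 5.0, 4.0]
conv_out = relu(conv1d(features, kernel=[0.5, 0.5]))
pooled = max_pool1d(conv_out, size=2)
output = dense(pooled, weights=[0.5, 0.5])
```

In a trained network the kernel and dense weights would be learned, and several filters would normally run in parallel.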
The model parameters were trained by gradient descent, separately for the temporal and spatial models. The mean squared error (MSE) was used as the loss function for training, and the index of agreement (IOA) was the accuracy indicator used to evaluate model performance. The combined spatial-temporal model takes as input the hourly air quality data recorded by 16 air quality monitoring stations in Hong Kong. From the past 24 hours of data, the model predicts the coming 12 hours' concentrations of the five target pollutants at an arbitrary point on a grid mesh with 1 km spacing covering the whole of Hong Kong. The prediction accuracy is good, with the highest IOA reaching 0.98.
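For reference, both metrics can be computed in a few lines. The sketch below assumes the original Willmott form of the index of agreement, IOA = 1 - sum((P - O)^2) / sum((|P - mean(O)| + |O - mean(O)|)^2); the abstract does not state which variant the thesis uses, and the sample values are made up for illustration.

```python
def mse(obs, pred):
    """Mean squared error between observations and predictions."""
    return sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs)


def index_of_agreement(obs, pred):
    """Willmott's index of agreement (assumed variant); 1.0 is a perfect match.

    Undefined when the denominator is zero, i.e. when every observation
    and every prediction equals the observed mean.
    """
    mean_obs = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2
        for o, p in zip(obs, pred)
    )
    return 1.0 - num / den


# Made-up observed and predicted concentrations for four hours.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
loss = mse(observed, predicted)
ioa = index_of_agreement(observed, predicted)
```

MSE is unbounded and depends on the units of the pollutant, while IOA is normalized to at most 1, which makes it convenient for comparing accuracy across pollutants.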
A visualization system was also established to provide a user-friendly interface for both general users and domain experts. The system generates a 2D map of input parameters and output predictions, displays feature information with suitably sized circles and colors, and provides tools for comparison and analysis. General users can find the visualized information they are interested in, while domain experts can use the visualization tools for identification and analysis. A visual labeling system was also introduced, consisting of a heatmap, a line chart and a labeling interface, to help domain experts in the environmental field identify error patterns and label them easily. The labeled data can be selectively used in the training process, which helps refine training and improve the prediction accuracy of the spatial-temporal model.