Incorporating domain knowledge into big data : with application in smart manufacturing and transportation

by Ziyue Li

THESIS 2021

Ph.D. Industrial Engineering and Decision Analytics

1 online resource (xiv, 93 pages) : illustrations (some color)

Abstract

The mission of data mining is to discover the knowledge behind the data. Three typical kinds of knowledge are trend, cluster, and change, which correspond to three typical data mining tasks: regression, clustering, and detection. Numerous studies, ranging from mathematical models to deep learning frameworks, have been proposed. However, a pure data mining model without domain or human knowledge might produce results that depart from reality. This thesis proposes that the combination of “Data + Domain/Human Knowledge” could potentially offer a better solution. Two major frameworks are proposed, (1) a Data-to-Data knowledge collaborating framework and (2) a Human-to-Data knowledge incorporating framework, and three projects are conducted.

The first project learns the “change” in smart manufacturing, specifically by detecting anomalies in a cold-start process. A decomposition-based hybrid transfer learning framework is proposed to transfer knowledge from experienced domains to the cold-start domain. The knowledge transfer increases the anomaly detection accuracy on cold-start data by 20%.

The second project learns the “trend” in smart transportation, specifically by predicting the passenger flow in a metro system. Human knowledge about the distances and the functional similarities between stations is formulated as graphs and incorporated into the proposed low-rank tensor completion model. The incorporated graphs improve the prediction results by more than 30%.

The third project learns the “cluster” in smart transportation, specifically the multiple clusters of origin, destination, time, and passengers, from individual trajectory data. A tensor Latent Dirichlet Allocation (LDA) model is proposed that incorporates external knowledge graphs about the locations and functions of stations. The graph structure enhances the interpretability of the learned clusters by more than 20%.

These essays provide a comprehensive solution for analytical data models coupled with domain and human knowledge, with detailed implementations in real case studies that demonstrate increased model accuracy, efficiency, and interpretability.

Copyrighted to the author. Reproduction is prohibited without the author’s prior written consent.

Details

Collection: HKUST Electronic Theses

Degree: Ph.D.

Department: Industrial Engineering and Decision Analytics

Supervisors: Tsung, Fugee

Authors: Li, Ziyue

Subjects: Knowledge representation (Information theory); Manufacturing industries; Data processing; Transportation; Data mining

Language: English

Call number: Thesis IELM 2021 LiZ

DOI: 10.14711/thesis-991012986004203412
