Entity resolution for hidden Web data

HKUST Electronic Theses

by Xie Xiaoheng

THESIS 2012

Ph.D. Computer Science and Engineering

xiii, 123 p. : ill. ; 30 cm

Abstract

Entity resolution (ER) identifies and merges records judged to represent the same real-world entity. With the growth of the Internet, ER for hidden Web data has become increasingly important in many real-world applications, such as online search engines and Web data integration. Hidden Web data often originates from different data sources, which usually have different schemas. As a consequence, there is no single most efficient way to compare and merge records from different schemas. Moreover, existing techniques that place all records under a unified schema are often unsuitable.

In this thesis, we investigate ER methods for hidden Web data using a multi-schema approach. That is, we keep the data under the original schemas instead of placing them un...

Copyrighted to the author. Reproduction is prohibited without the author’s prior written consent.

Details

Collection: HKUST Electronic Theses
Degree: Ph.D.
Department: Computer Science and Engineering
Supervisors: Lochovsky, Frederick H.
Authors: Xie, Xiaoheng
Subjects: Data mining; Web databases; Data processing; Electronic data processing
Language: English
Call number: Thesis CSED 2012 Xie
DOI: 10.14711/thesis-b1198811
