Duplicate detection in XML Web data

HKUST Electronic Theses

by Huang Yuzhou

THESIS 2009

M.Phil. Computer Science and Engineering

i, x, 64 p. : ill. ; 30 cm

Abstract

Duplicate entities are common on the Web, where structured XML data are increasingly prevalent. Duplicate detection, an important data cleaning task, consists of detecting different representations of the same real-world object. Detecting and resolving duplicate entities benefits Web users. Thus, to improve the quality of Web data, algorithms for detecting duplicates are required. ...

Copyrighted to the author. Reproduction is prohibited without the author’s prior written consent.

Details

Collection: HKUST Electronic Theses
Degree: M.Phil.
Department: Computer Science and Engineering
Authors: Huang, Yuzhou
Subjects: XML (Document markup language); Database management
Language: English
Call number: Thesis CSED 2009 Huang
DOI: 10.14711/thesis-b1054464
