Processing and management of uncertain information in vague databases

HKUST Electronic Theses

by An Lu

THESIS 2009

Ph.D. Computer Science and Engineering

xiv, 161 p. : ill. ; 30 cm

Abstract

Uncertain information is common in many database applications because data are disseminated intensively from pervasive computing sources, such as the high-volume data obtained from sensor networks and mobile communications. In this thesis, we propose methods to process and manage uncertain information in vague databases. Our work focuses on four aspects: modelling uncertain information by vague sets, maintaining consistency in vague databases, extending SQL to query vague relations, and mining vague association rules.

Modelling uncertain information by vague sets is the core of our work. We discuss how to measure vagueness in practice and the relationships between vague memberships and nulls. A new similarity measure of vague sets and the concepts of median membership (m) and imprecision membership (i) are proposed. Based on these two memberships, we define the notions of mi-overlap, mi-union and mi-intersection between vague sets and the concepts of vague relations and vague databases.
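As an illustration of the two memberships, the sketch below represents one vague membership value and derives m and i from it. The abstract does not state the formulas, so this assumes the usual vague set formulation: an element has a true membership alpha and a false membership beta with alpha + beta <= 1, its membership is the interval [alpha, 1 - beta], m is the midpoint of that interval and i is its width. The class name VagueValue and its fields are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VagueValue:
    """One membership value in a vague set, read as the interval [alpha, 1 - beta].

    Hypothetical helper for illustration; the formulas are an assumption
    based on the common vague set definition, not quoted from the thesis.
    """

    alpha: float  # evidence for membership (true membership)
    beta: float   # evidence against membership (false membership)

    def __post_init__(self):
        if not (0.0 <= self.alpha <= 1.0 and 0.0 <= self.beta <= 1.0):
            raise ValueError("alpha and beta must lie in [0, 1]")
        if self.alpha + self.beta > 1.0 + 1e-12:
            raise ValueError("alpha + beta must not exceed 1")

    @property
    def median(self) -> float:
        # Assumed definition: midpoint of the interval [alpha, 1 - beta].
        return (self.alpha + (1.0 - self.beta)) / 2.0

    @property
    def imprecision(self) -> float:
        # Assumed definition: width of the interval [alpha, 1 - beta].
        return (1.0 - self.beta) - self.alpha


v = VagueValue(alpha=0.5, beta=0.25)
print(v.median, v.imprecision)
```

Under these assumptions, a value with i = 0 behaves like an ordinary fuzzy membership, while a larger i means less is known about the element.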

Functional dependencies (FDs) and inclusion dependencies (INDs) are the most fundamental integrity constraints that arise in practice in relational databases. We utilize FDs and INDs to maintain the consistency of a vague database. First, given a vague relation r and a set of FDs F, we tackle the problem of how to obtain the "best" approximation of r with respect to F, taking into account the median membership and imprecision membership thresholds. Using these two thresholds of a vague set, we define a merge operation on r. Second, given a vague database d and a set of INDs N, we consider how to obtain the minimal possible change in value-precision for d. Finally, we develop a vague chase procedure as a means to maintain consistency of d with respect to F and N.

Incorporating the notion of vague sets in relations, we propose vague SQL (VSQL), which is an extension of SQL for the vague relational model, and show that VSQL combines the capabilities of standard SQL with the power of manipulating vague relations. VSQL allows users to formulate a wide range of queries on vague data.

Using vague sets, we address the limitations of traditional association rule (AR) mining, which only discovers the hidden relationships among items that have been sold and ignores items that are almost sold. For example, in many online shopping applications, such as Amazon and eBay, items that have been browsed in detail or put into the basket but not checked out (almost sold items) carry hesitation information, since customers are hesitating to buy them. We propose a new notion of vague association rules (VARs) and devise an efficient algorithm to mine the VARs.
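The limitation described above can be made concrete with a small sketch. It computes the classic support of an itemset over sold items only, then recomputes it when almost sold items are also counted. This is not the VAR definition or the mining algorithm from the thesis, which the abstract does not spell out; the status labels, the support function and the sample transactions are all hypothetical and serve only to show what traditional AR mining discards.

```python
SOLD = "sold"
ALMOST_SOLD = "almost_sold"  # browsed in detail or left in the basket


def support(transactions, itemset, statuses=(SOLD,)):
    """Fraction of transactions in which every item of `itemset`
    appears with one of the given statuses (illustrative helper)."""
    if not transactions:
        return 0.0
    hits = sum(
        all(t.get(item) in statuses for item in itemset)
        for t in transactions
    )
    return hits / len(transactions)


# Each transaction maps an item to its status for one customer session.
transactions = [
    {"A": SOLD, "B": SOLD},
    {"A": SOLD, "B": ALMOST_SOLD},
    {"A": SOLD},
    {"B": SOLD},
]

# Traditional AR mining sees only sold items.
classic = support(transactions, {"A", "B"})

# Counting almost sold items as well exposes the hesitating customer.
with_hesitation = support(transactions, {"A", "B"}, statuses=(SOLD, ALMOST_SOLD))

print(classic, with_hesitation)
```

In this toy data the second session is invisible to the classic measure even though the customer came close to buying B together with A, which is the kind of hesitation information the abstract refers to.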


Copyrighted to the author. Reproduction is prohibited without the author’s prior written consent.

Details

Collection: HKUST Electronic Theses
Degree: Ph.D.
Department: Computer Science and Engineering
Supervisors: Ng, Wilfred
Authors: Lu, An
Subjects: Database management; Uncertainty (Information theory)
Language: English
Call number: Thesis CSED 2009 Lu
DOI: 10.14711/thesis-b1070799
