THESIS
1996
xiii, 114 leaves : ill. ; 30 cm
Abstract
Many promising rule discovery algorithms have been proposed. Each algorithm uses its own way of measuring the goodness (or error) of rules, and this measure guides the search for the "best" rule set.
This thesis first investigates and compares, both theoretically and experimentally, several such goodness (or error) measures: error count, mean square error, probability difference, mean square error sum, prediction factor, Quinlan's gain, and Clark's entropy.
Second, we study a way of estimating conditional probabilities for a single rule and, as a novelty, for rule sets. Results are presented on how conditional rule probabilities affect the goodness of the discovered knowledge.
The investigations use a general algorithm for discovering non-propositional rules. The algorithm has minimal inter-dependency between its modules, such as the partial ordering (specialization) used to navigate the search space and the chosen error measure. The independent modules are varied, and their effect on the discovered results is studied.
Keywords: discovery, data mining, error measure, conditional probability, rules in database