THESIS
2006
x, 47 leaves : ill. ; 30 cm
Abstract
In recent years, the Multiple-Instance Learning (MIL) problem has become increasingly popular in the machine learning community. Each training object (bag) in the MIL problem is a set of patterns (instances). Label information is associated only with the bags, not with their constituent instances. Moreover, a positive bag must have at least one positive instance, but may have many negative instances. Since only the bag labels are accessible and a positive bag may contain many negative instances, MIL is more challenging than traditional supervised learning (or single-instance learning). On the other hand, MIL is fruitful to study, since many real-world problems, such as drug activity prediction, are inherently MI problems for which the traditional single-instance learning model does not generalize well. In addition, the generalization performance of many single-instance learning problems, e.g., Content-based Image Retrieval (CBIR), is found to improve when they are cast into an appropriate MIL representation.
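The bag-labeling assumption described above can be illustrated with a minimal Python sketch. The function name `bag_label` and the toy bags are hypothetical, chosen only for illustration. In a real MIL task the instance labels are hidden, so this shows only how a bag label relates to them, not how a learner would obtain them.

```python
from typing import Sequence


def bag_label(instance_labels: Sequence[bool]) -> bool:
    """Return the bag label implied by the (normally hidden) instance labels.

    Under the standard MIL assumption, a bag is positive if and only if
    at least one of its instances is positive.
    """
    return any(instance_labels)


# Toy bags (illustrative only): one positive instance is enough to make
# the whole bag positive, even when most instances are negative.
positive_bag = [False, False, True, False]
negative_bag = [False, False, False]

print(bag_label(positive_bag))
print(bag_label(negative_bag))
```

The learner sees only the value returned by `bag_label`, never the list passed to it, which is why a positive bag with many negative instances is hard to learn from.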
In this thesis, I study MIL algorithms based on kernel methods. In particular, I focus on support vector machines (SVMs), which have been highly successful in many machine learning problems. This thesis first discusses how to reformulate the SVM for the MI problem setting by utilizing both the bag and instance information at the same time. After that, I propose a way to define an MI kernel over bags based on the marginalized kernel. The resulting bag kernel can then be used in a standard SVM. I also extend this marginalized kernel to the real-valued regression setting, which is increasingly popular in the MIL community. Empirical results show that the proposed methods outperform various traditional methods.
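To make the idea of a kernel defined over bags concrete, here is a minimal Python sketch of a normalized set kernel, which sums an instance-level RBF kernel over all pairs of instances from two bags. This is a simpler, well-known construction and not the marginalized kernel proposed in the thesis, whose details the abstract does not give. The names `rbf`, `set_kernel`, `normalized_set_kernel`, and `gram_matrix`, the `gamma` parameter, and the toy bags are all assumptions made for illustration.

```python
import math
from typing import List, Sequence

Instance = Sequence[float]
Bag = List[Instance]


def rbf(x: Instance, y: Instance, gamma: float = 1.0) -> float:
    """Instance-level RBF kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def set_kernel(b1: Bag, b2: Bag, gamma: float = 1.0) -> float:
    """Sum the instance kernel over every pair of instances from the two bags."""
    return sum(rbf(x, y, gamma) for x in b1 for y in b2)


def normalized_set_kernel(b1: Bag, b2: Bag, gamma: float = 1.0) -> float:
    """Normalize so each bag has unit self-similarity, reducing bag-size effects."""
    norm = math.sqrt(set_kernel(b1, b1, gamma) * set_kernel(b2, b2, gamma))
    return set_kernel(b1, b2, gamma) / norm


def gram_matrix(bags: List[Bag], gamma: float = 1.0) -> List[List[float]]:
    """Bag-level Gram matrix over a list of bags."""
    return [[normalized_set_kernel(a, b, gamma) for b in bags] for a in bags]


# Toy bags of 2-D instances (illustrative only); bags may differ in size.
bags = [
    [(0.0, 0.0), (0.1, 0.2)],
    [(0.0, 0.1), (2.0, 2.0), (2.1, 1.9)],
    [(5.0, 5.0)],
]
K = gram_matrix(bags, gamma=0.5)
```

A Gram matrix like `K` is the kind of object a standard SVM implementation accepts in place of raw feature vectors, for example through a precomputed-kernel option, which is what allows a bag kernel to be used without changing the SVM itself.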