THESIS
2004
x, 64 leaves : ill. ; 30 cm
Abstract
Support vector data description (SVDD) is a powerful kernel method that has been commonly used for novelty detection. While its quadratic programming formulation has the important computational advantage of avoiding the problem of local minima, it has a runtime complexity of O(N³), where N is the number of training patterns. It thus becomes prohibitive when the data set is large.
In the context of shape-fitting problems in computational geometry, core-sets have been commonly used to obtain approximations for the minimum enclosing ball problem. The runtime of core-set approximation algorithms grows only linearly with N and the data dimensionality, and is small relative to that of other shape-fitting problems.
Inspired by this use of core-sets, we propose an approximation algorithm, called Core-set SVDD (CSVDD), that allows SVDD to scale better to large data sets. Most importantly, the proposed method has a running time that is only linear in N.
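The core-set primitive underlying this approach can be illustrated with the simple Bădoiu–Clarkson iteration for the minimum enclosing ball (MEB): repeatedly move the center a shrinking step toward the furthest point, for O(1/ε²) rounds. The sketch below is illustrative only, not the thesis's implementation; it works in input space, whereas CSVDD applies the idea in the kernel-induced feature space. The function name and parameters are assumptions.

```python
import numpy as np

def meb_coreset(X, eps=0.1):
    """Sketch of a (1+eps)-approximate minimum enclosing ball of the rows of X,
    via the simple Badoiu-Clarkson core-set iteration (illustrative names)."""
    c = X[0].copy()          # start the center at an arbitrary point
    # O(1/eps^2) iterations suffice for a (1+eps)-approximation; each
    # iteration scans the N points once, so the total cost is linear in N.
    for i in range(1, int(np.ceil(1.0 / eps ** 2)) + 1):
        dists = np.linalg.norm(X - c, axis=1)
        far = np.argmax(dists)            # furthest point joins the core-set
        c = c + (X[far] - c) / (i + 1)    # step the center toward it
    r = np.max(np.linalg.norm(X - c, axis=1))  # radius covering all points
    return c, r
```

Note that the core-set (the furthest points touched) has size O(1/ε²), independent of N, which is what makes the linear-in-N running time possible.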
Experimental results on two large real-world data sets demonstrate that the proposed method can handle data sets much larger than those that can be handled by standard SVDD packages, while its approximate solution still attains comparable, or sometimes even better, novelty detection performance.
An ensemble approach of CSVDD, called ensemble CSVDD (ECSVDD), is also proposed. This improves CSVDD's novelty detection performance and reduces its fluctuations, while still maintaining a running time that is only linear in N.