THESIS
2003
x, 49 leaves : ill. ; 30 cm
Abstract
In recent years, kernel methods have become popular and powerful tools in the field of machine learning, with superior performance on many practical applications. In this thesis, I study kernel methods in both supervised and unsupervised learning. First, in using the ε-support vector regression (ε-SVR) algorithm, one has to choose a suitable value for the insensitivity parameter ε. Smola et al. considered its "optimal" choice by studying statistical efficiency in a location parameter estimation problem. While they successfully predicted a linear scaling between the optimal ε and the noise in the data, their theoretically optimal value does not closely match its experimentally observed counterpart in the case of Gaussian noise. In this thesis, I attempt to better explain their experimental results by studying the regression problem itself. The resulting predicted choice of ε is much closer to the experimentally observed optimal value, while again exhibiting a linear trend with the input noise.
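As a rough illustration of this linear scaling, the sketch below sets ε proportional to an estimate of the noise standard deviation before fitting an SVR. It uses scikit-learn's SVR, which is an assumption of this note rather than the thesis's implementation, and the proportionality constant 0.6 is purely a placeholder, not the thesis's derived optimum.

```python
# Illustrative sketch: choosing the SVR insensitivity parameter epsilon
# proportionally to the noise level, per the linear scaling discussed above.
# The constant 0.6 is a placeholder, not the thesis's derived value.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
sigma = 0.2                                   # known or estimated noise std
y = np.sinc(X).ravel() + rng.normal(0.0, sigma, size=200)

eps = 0.6 * sigma                             # linear scaling: eps proportional to noise
model = SVR(kernel="rbf", C=10.0, epsilon=eps)
model.fit(X, y)
print("epsilon =", eps, " support vectors:", len(model.support_))
```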
In the second part of this thesis, I address the problem of finding the pre-image of a feature vector in the feature space induced by a kernel. This is of central importance in some kernel applications, such as when using kernel principal component analysis (PCA) for image denoising. Unlike the traditional method in [17], which relies on nonlinear optimization, the proposed method directly finds the location of the pre-image based on distance constraints in the feature space. It is non-iterative, involves only linear algebra, and does not suffer from numerical instability or local minima. Evaluations on kernel PCA and kernel clustering show much improved performance.
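The following is a minimal sketch of the distance-constraint idea for a Gaussian RBF kernel; it is my own simplified rendering under stated assumptions, not the thesis's exact algorithm. Feature-space distances to a set of known training points are converted to input-space distances through the kernel, and the pre-image is then recovered by solving a linear least-squares system, so the whole procedure is non-iterative.

```python
# Sketch of pre-image recovery from feature-space distance constraints for a
# Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 * sigma^2)).  A simplified
# illustration of the distance-based idea, not the thesis's exact procedure.
import numpy as np

sigma = 1.0
rng = np.random.default_rng(1)
anchors = rng.normal(size=(10, 3))            # known training points x_i
target = rng.normal(size=3)                   # ground truth, for checking only

# Feature-space squared distances: ||phi(x) - phi(x_i)||^2 = 2 - 2 * k(x, x_i).
k = np.exp(-np.sum((anchors - target) ** 2, axis=1) / (2 * sigma**2))
d2_feat = 2.0 - 2.0 * k

# Invert the kernel to get input-space squared distances ||x - x_i||^2.
d2_in = -2.0 * sigma**2 * np.log(1.0 - d2_feat / 2.0)

# Multilateration: subtracting the constraint for the last anchor from each of
# the others gives the linear system
#   2 (x_n - x_i) . x = d2_i - d2_n - ||x_i||^2 + ||x_n||^2,
# solved here by ordinary least squares.
xn, d2n = anchors[-1], d2_in[-1]
A = 2.0 * (xn - anchors[:-1])
b = d2_in[:-1] - d2n - np.sum(anchors[:-1] ** 2, axis=1) + np.dot(xn, xn)
pre_image, *_ = np.linalg.lstsq(A, b, rcond=None)
print("recovery error:", np.linalg.norm(pre_image - target))
```

Because the distances here are exact, the recovered point matches the target almost perfectly; in a real kernel PCA denoising setting, the feature-space distances would instead be computed to the projected (denoised) feature vector.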