Distant domain transfer learning

HKUST Electronic Theses

by Ben Tan

THESIS 2017

Ph.D. Computer Science and Engineering

xv, 140 pages : illustrations ; 30 cm

Abstract

Transfer learning adapts and reuses knowledge from source domains for a target domain. It has become popular in data mining and machine learning, as well as in many other areas. A major assumption in many transfer learning algorithms is that the source and target domains are closely related. This relation can take the form of related instances, features or models, and can be measured by KL-divergence or A-distance. However, if two domains are not directly related, knowledge transfer between them will not be effective. This source-target domain gap is a serious impediment to the successful application of transfer learning.
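
As a rough illustration of how domain relatedness might be quantified with KL-divergence, the sketch below compares smoothed term distributions from three toy domains. The token lists, the add-one smoothing and the function names are assumptions made for this example; they are not taken from the thesis.

```python
import math
from collections import Counter


def term_distribution(tokens, vocabulary, smoothing=1.0):
    """Return a smoothed probability for each vocabulary term, in vocabulary order."""
    counts = Counter(tokens)
    total = sum(counts[t] for t in vocabulary) + smoothing * len(vocabulary)
    return [(counts[t] + smoothing) / total for t in vocabulary]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical toy domains: two share most terms, the third shares none.
source = "face eye nose mouth face eye".split()
near = "face eye nose hair face mouth".split()
far = "wing engine runway wing jet engine".split()

# A shared vocabulary puts all three distributions on the same support.
vocabulary = sorted(set(source) | set(near) | set(far))

p_source = term_distribution(source, vocabulary)
p_near = term_distribution(near, vocabulary)
p_far = term_distribution(far, vocabulary)

kl_near = kl_divergence(p_source, p_near)
kl_far = kl_divergence(p_source, p_far)
```

In this toy setting `kl_near` comes out smaller than `kl_far`, which matches the intuition that a larger divergence signals a wider domain gap. Smoothing keeps every probability positive so that the logarithm is always defined.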

In this thesis, we study a novel learning problem: Distant Domain Transfer Learning (abbreviated to DDTL). In DDTL, we aim to bridge large domain gaps and transfer knowledge even if the source and target domains share few factors directly. For example, the source domain may contain plenty of labeled text documents while the target domain is composed of image data, so the two have completely different feature spaces. Alternatively, the source domain may classify face images while the target domain distinguishes plane images; the two share no common characteristic in shape or other aspects and are conceptually distant. The DDTL problem is important because solving it can greatly expand the application scope of transfer learning and help reuse as much previous knowledge as possible. Nonetheless, it is a difficult problem because the distribution gap between the source domain and the target domain is large.

Inspired by the human ability for transitive inference and learning, whereby two seemingly unrelated concepts can be connected by a string of intermediate bridges built from auxiliary concepts, we propose a novel learning framework: transitive transfer learning (abbreviated to TTL). The main idea of TTL is to transfer knowledge between distant domains by using auxiliary intermediate data as a bridge. The distant domains can have heterogeneous feature spaces, or homogeneous feature spaces but distant characteristics, and they can be connected by one or multiple intermediate domains. We also propose several learning algorithms under the TTL framework, including instance-based, feature-based and model-based algorithms, to tackle the DDTL problem under different problem settings, and we verify the proposed algorithms on real-world data sets.

Copyrighted to the author. Reproduction is prohibited without the author’s prior written consent.

Details

Collection: HKUST Electronic Theses
Degree: Ph.D.
Department: Computer Science and Engineering
Supervisors: Yang, Qiang
Authors: Tan, Ben
Subjects: Machine learning; Adaptive computing systems; Internet domain names
Language: English
Call number: Thesis CSED 2017 Tan
DOI: 10.14711/thesis-991012535962503412
