THESIS
2019
ix, 40 pages : illustrations ; 30 cm
Abstract
In generative dialog systems, learning representation for the dialog context is a crucial
step to generate high quality responses. The dialog systems are required to capture useful
and compact information from mutual dependent sentences such that the generation process
can effectively attend to the central semantics. Unfortunately, existing methods may
not well identify importance distributions for each lower position when computing an
upper level feature, which may lose critical information to constitute the final context representations.
To address the issue, we propose a transfer learning based method named
Transfer Hierarchical Attention Network (THAN). The THAN model can leverage useful
prior knowledge from two related auxiliary tasks, i.e., keyword extraction and sentence...[
Read more ]
In generative dialog systems, learning representation for the dialog context is a crucial
step to generate high quality responses. The dialog systems are required to capture useful
and compact information from mutual dependent sentences such that the generation process
can effectively attend to the central semantics. Unfortunately, existing methods may
not well identify importance distributions for each lower position when computing an
upper level feature, which may lose critical information to constitute the final context representations.
To address the issue, we propose a transfer learning based method named
Transfer Hierarchical Attention Network (THAN). The THAN model can leverage useful
prior knowledge from two related auxiliary tasks, i.e., keyword extraction and sentence
entailment, to facilitate the dialog representation learning for the main dialog generation
task. During the transfer process, the syntactic structure and semantic relationship from
the auxiliary tasks are distilled to enhance both the word-level and sentence-level attention
mechanisms for the dialog system. Empirically, extensive experiments on Twitter Dialog
Corpus and PERSONA-CHAT dataset demonstrate the effectiveness of the proposed
THAN model compared with the state-of-the-arts methods.
Post a Comment