THESIS
2021
1 online resource (xvi, 107 pages) : illustrations (some color)
Abstract
This thesis investigates a fully data-driven, end-to-end dialogue learning framework for building
versatile conversational agents that are able to understand human emotion, converse with
empathy, access external knowledge, and assist humans in complex tasks.
Dialogue systems, also known as conversational agents, are computer systems designed to interact with
humans in natural language. Conventional dialogue systems are highly modularized, typically
consisting of multiple components for language understanding, dialogue management, and natural
language generation. Although these systems perform reliably in well-designed
domains, they rely on extensive domain-specific rules, which limit their flexibility and scalability.
As an effective alternative to rule-based methods, data-driven end-to-end deep learning
approaches have emerged. Neural dialogue systems, typically based on sequence-to-sequence
architectures, can leverage a massive amount of conversational data to learn dialogue reasoning
and response generation strategies jointly with minimal predefined rules and achieve promising
performance.
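The sequence-to-sequence training signal described above can be sketched as follows. This is a minimal, hypothetical illustration (not the thesis's implementation): the model maximizes the likelihood of each response token conditioned on the dialogue history and the tokens generated so far, and the toy uniform "decoder" stands in for a neural network.

```python
import math

def response_nll(prob_fn, history, response):
    """Negative log-likelihood of a response under a conditional token model.

    prob_fn(history, prefix, token) -> P(token | history, prefix); any
    callable will do -- here it stands in for a neural decoder.
    """
    nll = 0.0
    prefix = []
    for token in response:
        p = prob_fn(history, tuple(prefix), token)
        nll -= math.log(p)  # accumulate per-token cross-entropy
        prefix.append(token)
    return nll

# A stand-in "decoder": uniform over a tiny vocabulary, so every token
# contributes log(4) to the loss regardless of context.
VOCAB = ["hi", "there", "how", "are"]
uniform = lambda history, prefix, token: 1.0 / len(VOCAB)

loss = response_nll(uniform, history=["hello"], response=["hi", "there"])
print(round(loss, 4))  # 2 tokens * log(4), about 2.7726
```

Minimizing this loss over large conversation corpora is what lets the model learn response strategies jointly, without hand-written rules.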
Despite the rapid development of neural dialogue systems, various challenges remain. Firstly,
large-scale conversational models trained on online conversations (e.g., Reddit) usually lack
empathy and consistent characteristics. Secondly, the end-to-end architecture inherently makes
it difficult for the models to interact with external knowledge and to complete tasks. Last but
not least, most dialogue models are optimized to a single conversational skill such as empathy,
and ignore others (e.g., knowledge).
In this thesis, we tackle these challenges and introduce a versatile dialogue system based on
generative language pre-training and parameter-efficient transfer learning methods. We
demonstrate the effectiveness of our approaches in empathy modeling, knowledge acquisition, and
continual dialogue skill integration.
To endow dialogue models with empathy, we propose a two-stage multi-task fine-tuning approach.
In the first stage, a pre-trained language model (e.g., GPT) is fine-tuned on a persona-aware
dialogue dataset with response modeling and contrastive learning objectives. In the second
stage, the model is fine-tuned on empathetic conversations with a custom persona and an additional
emotion recognition objective. Based on this method, we develop a web demo, namely
the empathetic chatbot CAiRE, for interacting with real users and learning from user feedback.
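The multi-task objective used in each fine-tuning stage can be sketched as a weighted combination of the primary response-modeling loss and a stage-specific auxiliary loss. The weighting scheme and the numeric values below are illustrative assumptions, not the thesis's exact configuration.

```python
def multitask_loss(response_loss, aux_loss, aux_weight=0.5):
    """Weighted sum of the primary and auxiliary training objectives."""
    return response_loss + aux_weight * aux_loss

# Stage 1: response modeling + a contrastive objective on persona-aware data.
stage1 = multitask_loss(response_loss=2.1, aux_loss=0.8, aux_weight=0.5)
# Stage 2: response modeling + emotion recognition on empathetic dialogues.
stage2 = multitask_loss(response_loss=1.7, aux_loss=1.2, aux_weight=0.5)
print(round(stage1, 2), round(stage2, 2))  # 2.5 2.3
```

The auxiliary weight trades off how strongly the secondary skill (contrastive consistency, then emotion recognition) shapes the shared parameters relative to plain response modeling.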
To enable dialogue models to efficiently interface with external knowledge, we introduce a minimalist
task-oriented dialogue learning framework (MinTL). Our approach leverages the prior
knowledge of pre-trained language models and effective dialogue state tracking formulation for
jointly learning user-model and model-knowledge-base interactions while requiring minimal
human supervision. MinTL outperforms strong baselines in both the full-training and the
simulated low-resource settings. In addition, we extend MinTL to more realistic task-oriented
dialogue scenarios and show the robustness of our approach.
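The dialogue-state-tracking formulation can be sketched as text generation of *updates*: rather than re-generating the full belief state every turn, the model emits only the changes, which are merged into the previous state before querying the knowledge base. The "slot=value; slot=value" update format below is an illustrative assumption, not MinTL's exact representation.

```python
def apply_state_update(belief_state, update_text):
    """Merge generated "slot=value; slot=value" updates into the state."""
    state = dict(belief_state)  # keep untouched slots from earlier turns
    for pair in update_text.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        slot, _, value = pair.partition("=")
        state[slot.strip()] = value.strip()
    return state

state = {"area": "north"}
# User turn: "actually make it an expensive place in the centre"
state = apply_state_update(state, "area=centre; price=expensive")
print(state)  # {'area': 'centre', 'price': 'expensive'}
```

Because only the delta is generated, the supervision needed per turn stays small, which is what makes the low-resource setting tractable.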
Finally, we combine all the aforementioned dialogue skills into a unified versatile generative
dialogue model, named AdapterBot. AdapterBot uses a fixed language model backbone for response
generation and multiple lightweight residual adapters for modeling dialogue skills. Each
adapter can be trained independently, thus allowing continual integration of new skills without retraining
the entire model. We empirically show the competitive performance of AdapterBot on
a diverse set of dialogue tasks.
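A residual adapter of the kind AdapterBot relies on can be sketched as a small bottleneck network whose output is added back to the frozen backbone's hidden states; only the adapter's down- and up-projections are trained. The dimensions and initialization below are made-up illustrations, not the thesis's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden, bottleneck = 8, 2  # backbone width, adapter bottleneck width

W_down = rng.normal(scale=0.1, size=(hidden, bottleneck))  # trainable
W_up = rng.normal(scale=0.1, size=(bottleneck, hidden))    # trainable

def adapter(h):
    """Residual adapter: h + up(relu(down(h))); the backbone stays frozen."""
    z = np.maximum(h @ W_down, 0.0)  # down-project + ReLU nonlinearity
    return h + z @ W_up              # up-project + residual connection

h = rng.normal(size=(1, hidden))     # a hidden state from the frozen backbone
out = adapter(h)
print(out.shape)  # (1, 8)
```

Because each skill lives in its own small adapter while the backbone is shared and frozen, a new skill can be added by training one adapter, without touching the weights that other skills depend on.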