Improving deep conversational models via input augmentation and data source expansion

HKUST Electronic Theses

by Zhiliang Tian

THESIS 2022

Ph.D. Computer Science and Engineering

1 online resource (xvi, 126 pages) : illustrations (chiefly color)

Abstract

Conversational models aim to generate readable textual responses to input queries. In recent years, conversational models have typically been built using deep neural networks; such models are called deep conversational models (DCMs). DCMs can be used for task-oriented dialogues or chit-chat conversations. In this thesis, we investigate ways to enhance DCMs for chit-chat conversations via input augmentation and data source expansion.

A DCM maps an input query to one or more output responses. Previous work has observed that the output is often uninformative and lacks diversity, for two reasons: (1) the input does not contain sufficient information to determine an appropriate output, and (2) the DCM is trained via likelihood maximization and hence captures only the most salient input-output relationships. One common method to address the first issue is to include background documents as additional inputs to the DCM, and one popular way to deal with the second issue is to include retrieved responses to similar input queries as additional inputs to the DCM. In this thesis, we advance the state of the art in both lines of work. For the first line of work, we propose an output-anticipated memory module to enable the DCM to better attend to the relevant information in the background documents. For the second line of work, we develop a memory module to extract relationships between clusters of similar inputs and clusters of outputs (which are more robust than relationships between individual inputs and outputs), and use these relationships to improve the performance of the DCM.

Nowadays, DCMs are often trained on huge corpora. However, there are still low-resource scenarios. One example is an online chatbot that needs to adapt quickly to a new user after a few rounds of conversation. Our third contribution in this thesis is a meta-learning-based method that helps with this adaptation by utilizing data from the user's friends, who are expected to have similar interests and expectations. There are also applications, such as conversations on airline booking, where public data are limited and private data containing sensitive information are abundant. Our fourth contribution in this thesis is a teacher-student framework to train DCMs on both private and public data while ensuring the privacy of sensitive information in the private data.

Copyrighted to the author. Reproduction is prohibited without the author’s prior written consent.

Details

Collection: HKUST Electronic Theses
Degree: Ph.D.
Department: Computer Science and Engineering
Supervisors: Zhang, Nevin Lianwen
Authors: Tian, Zhiliang
Subjects: Natural language processing (Computer science); Online chat groups; Data processing; Real-time data processing; Deep learning (Machine learning)
Language: English
Call number: Thesis CSE 2022 TianZ
DOI: 10.14711/thesis-991013098358703412
