THESIS
2022
1 online resource (xvii, 106 pages) : illustrations (some color)
Abstract
Question answering (QA) aims to build computer systems that can automatically answer questions
posed by humans, and it has been a long-standing problem in natural language processing (NLP).
This thesis investigates the problem of generative long-form question answering (LFQA), which
aims to generate an in-depth, paragraph-length answer to a given question.
Generative LFQA is an important task. A large proportion of the questions that humans deal with
daily and ask on search engines are complex why/how questions, which require multi-sentence
explanations to answer, for example, "How do jellyfish function without a brain?" or "What are
the risk factors related to COVID-19?". Furthermore, the answers usually need to be generated
by synthesizing information from multiple documents, since a short phrase extracted from a
single existing document cannot answer such complicated questions.
On the other hand, LFQA is challenging and under-explored, and little work has been done on
building effective LFQA systems. It is even harder to generate a high-quality long-form answer
that is both relevant to the query and faithful to facts, since the retrieved documents typically
contain a considerable amount of redundant, complementary, or contradictory information.
Moreover, no prior work has investigated generating succinct answers from long-form ones.
In this thesis, we investigate the task of LFQA and tackle the challenges mentioned above.
Specifically, we focus on 1) how to build a practical system for real-time open-domain
LFQA and generate more query-relevant answers, 2) how to generate more factual long-form
answers, and 3) how to generate succinct answers from long-form answers.
To elaborate, we first present a coarse-to-fine method that extracts query-relevant information
at the document level and then at the sentence level, which helps a traditional Seq2Seq model
handle multiple long documents as input while accounting for query relevance.
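As a rough illustration of the coarse-to-fine idea, the sketch below first ranks whole documents against the query and then re-ranks individual sentences from the kept documents. TF-IDF similarity merely stands in for whatever retriever and ranker the thesis actually uses, and all names are illustrative:

    # Coarse-to-fine query-relevant extraction (illustrative sketch; TF-IDF
    # is an assumption standing in for the thesis's actual ranking models).
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def coarse_to_fine(query, documents, top_docs=3, top_sents=10):
        vec = TfidfVectorizer().fit(documents + [query])
        q = vec.transform([query])
        # Coarse step: keep only the documents most similar to the query.
        doc_scores = cosine_similarity(q, vec.transform(documents))[0]
        kept = [documents[i] for i in doc_scores.argsort()[::-1][:top_docs]]
        # Fine step: re-rank individual sentences from the kept documents,
        # yielding a short query-relevant input for the Seq2Seq generator.
        sentences = [s for d in kept for s in d.split(". ") if s]
        sent_scores = cosine_similarity(q, vec.transform(sentences))[0]
        return [sentences[i] for i in sent_scores.argsort()[::-1][:top_sents]]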
We further introduce QFS-BART, a model that incorporates explicit answer-relevance attention
over the source documents into the generation model's encoder-decoder attention module, to
further enhance query relevance. CAiRE-COVID, a real-time long-form question answering system
for COVID-19 that we built, won one of the Kaggle competitions related to COVID-19, as judged
by medical experts.
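To make the attention idea concrete, here is a minimal PyTorch-style sketch of biasing a single cross-attention head with per-token answer-relevance scores; the function and its arguments are hypothetical and not QFS-BART's actual implementation:

    # Cross-attention biased by answer-relevance scores (a hypothetical,
    # single-head sketch; not the actual QFS-BART implementation).
    import torch
    import torch.nn.functional as F

    def relevance_biased_attention(query, key, value, relevance, eps=1e-8):
        # query: (tgt_len, d) decoder states; key/value: (src_len, d) encoder
        # states; relevance: (src_len,) relevance score per source token.
        d = query.size(-1)
        logits = query @ key.transpose(-1, -2) / d ** 0.5  # (tgt_len, src_len)
        # Adding log-relevance scales the post-softmax attention weights
        # toward answer-relevant source tokens.
        logits = logits + torch.log(relevance + eps)
        return F.softmax(logits, dim=-1) @ value           # (tgt_len, d)

The relevance scores themselves could come, for instance, from an extractive QA model's span probabilities over the source documents.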
Secondly, we present a new architectural method to tackle the answer-faithfulness issue. We
augment the generation process with globally predicted salient information from multiple source
documents, which can be viewed as an emphasis on answer-related facts. State-of-the-art results
on two LFQA datasets demonstrate the effectiveness of our method against strong baselines on
both automatic and human evaluation metrics, and the method also topped a public leaderboard
on the LFQA task.
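One simple way to realize such an emphasis, sketched below under the assumption of a text-to-text generator, is to predict salient sentences across all source documents and tag them in the model input; the salience_score predictor and the <salient> marker are illustrative placeholders, not the thesis's architecture:

    # Emphasizing globally salient facts in the generator input (illustrative;
    # `salience_score` stands in for any trained salience predictor, and the
    # <salient> tag is assumed to be added to the tokenizer's vocabulary).
    def build_generator_input(question, documents, salience_score, top_k=5):
        sentences = [s for doc in documents for s in doc.split(". ") if s]
        # Predict and keep the globally most salient sentences.
        salient = sorted(sentences, key=salience_score, reverse=True)[:top_k]
        emphasis = " ".join(f"<salient> {s}" for s in salient)
        context = " ".join(sentences)
        return f"question: {question} {emphasis} context: {context}"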
Finally, we go a step further and propose to generate succinct answers from long-form answers.
Specifically, for the closed-book question answering (CBQA) task, we extract short-phrase
answers from the generated long-form answers. Experimental results on three QA benchmarks show
that our method significantly outperforms previous closed-book QA methods and is on par with
traditional open-book methods that exploit external knowledge sources.
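As a minimal sketch of this generate-then-extract idea, assuming off-the-shelf Hugging Face pipelines with placeholder checkpoints rather than the thesis's actual models, one could first generate a long-form answer closed-book and then extract a short-phrase span from it:

    # Closed-book generate-then-extract (a sketch with placeholder
    # checkpoints, not the models used in the thesis).
    from transformers import pipeline

    generator = pipeline("text2text-generation", model="google/flan-t5-base")
    extractor = pipeline("question-answering",
                         model="distilbert-base-cased-distilled-squad")

    def short_answer(question: str) -> str:
        # Step 1: generate a long-form answer without external documents.
        long_answer = generator(f"Answer the question in detail: {question}",
                                max_new_tokens=128)[0]["generated_text"]
        # Step 2: treat the generated answer as context and extract a span.
        return extractor(question=question, context=long_answer)["answer"]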