THESIS
2021
1 online resource (xx, 184 pages) : illustrations (some color)
Abstract
Commonsense reasoning has long been a challenging yet vital artificial intelligence problem. In the past few decades, many efforts have been devoted to investigating how to represent, acquire, and apply commonsense knowledge to understand human language. Recently, with the help of large-scale pre-trained language models, the community has made significant progress on many commonsense reasoning benchmarks. However, due to the limited explainability of deep models, it remains unclear whether current models solve these commonsense reasoning tasks for the right reasons. In this thesis, we first conduct experiments showing that even though current deep models can achieve high performance on the original Winograd Schema Challenge (WSC) task, they cannot reliably distinguish the right reasons behind their predictions.
An important reason behind this is that we lack both a good commonsense representation methodology and a principled commonsense inference methodology. To fill this gap, we propose a new commonsense representation methodology, higher-order selectional preference over eventualities, to model the complex commonsense knowledge in our daily life. Following this principle, we develop a scalable eventuality-centric commonsense knowledge extraction pipeline. As a result, we create ASER, the largest eventuality-centric knowledge graph in the world.
Specifically, it contains 438 million eventualities and 648 million edges among them. Both intrinsic and extrinsic evaluations were conducted to demonstrate the high quality of ASER. We also conduct further experiments to show that the selectional preference knowledge in ASER transfers to human-defined commonsense.
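To make the structure of an eventuality-centric knowledge graph concrete, the following is a toy sketch in Python: nodes are verb-centric eventuality strings and edges carry discourse-style relations with weights. The specific eventuality strings, relation names, and weights here are invented for illustration and are not actual ASER entries.

```python
# Toy illustration of an eventuality-centric knowledge graph:
# nodes are eventuality strings, edges are (relation, tail, weight)
# triples stored in an adjacency map. All entries below are made up
# for illustration, not taken from ASER itself.
from collections import defaultdict


class EventualityKG:
    def __init__(self):
        # head eventuality -> list of (relation, tail, weight)
        self.edges = defaultdict(list)

    def add_edge(self, head, relation, tail, weight=1.0):
        self.edges[head].append((relation, tail, weight))

    def neighbors(self, head, relation=None):
        # All outgoing edges of `head`, optionally filtered by relation.
        return [(r, t, w) for r, t, w in self.edges[head]
                if relation is None or r == relation]


kg = EventualityKG()
kg.add_edge("I am hungry", "Reason", "I eat", weight=3.0)
kg.add_edge("I eat", "Succession", "I am full", weight=2.0)

print(kg.neighbors("I am hungry"))  # -> [('Reason', 'I eat', 3.0)]
```

At scale, the same adjacency-map idea would be backed by a graph store rather than an in-memory dictionary, but the node/edge shape is the same.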
On top of ASER, we also explore how to acquire commonsense knowledge from different modalities. Specifically, we first propose a multiplex word embedding model to acquire selectional preference knowledge from text more efficiently. After that, we show how analogy and eventuality conceptualization can be leveraged to generalize knowledge about observed eventualities to unseen ones. Last but not least, we investigate the possibility of acquiring causal knowledge about daily eventualities from visual signals.
After collecting the knowledge, we also need to apply the structured commonsense to downstream natural language understanding tasks. Thus, in the last part of this thesis, we first use pronoun coreference resolution as the downstream task to investigate how to jointly use structured knowledge and language representation models for better language understanding. In the end, to explore a commonsense inference model that can be applied to all downstream commonsense reasoning tasks, we propose a novel learning paradigm: commonsense knowledge base question answering (CKBQA). Experiments on CKBQA show that even though inference over commonsense knowledge is challenging, models can learn to conduct simple inference after training with a few examples. Besides that, the learned model also generalizes across tasks, an ability not observed in previous commonsense reasoning models.
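The CKBQA setup described above can be sketched as follows: a commonsense knowledge base triple (head, relation, tail) is turned into a query in which a model must pick the correct tail from a set of candidates. The tiny knowledge base, the candidate strings, and the lookup-based scorer below are illustrative assumptions, not the thesis's actual model, which would use a trained neural scorer.

```python
# Minimal sketch of a CKBQA-style query: given a (head, relation)
# pair, rank candidate tails. The KB contents and the simple
# count-based scorer are invented placeholders for a learned model.
KB = {
    ("I am hungry", "Reason"): {"I eat": 3, "I cook": 1},
}


def score(head, relation, candidate):
    # Placeholder scorer: frequency of the triple in the toy KB.
    # A real system would score unseen triples with a trained model.
    return KB.get((head, relation), {}).get(candidate, 0)


def answer(head, relation, candidates):
    # Pick the highest-scoring candidate tail.
    return max(candidates, key=lambda c: score(head, relation, c))


print(answer("I am hungry", "Reason",
             ["I eat", "the sky is blue", "I sleep"]))  # -> I eat
```

The interesting question the thesis raises is precisely what this placeholder glosses over: whether a model trained on such triples can score triples it has never seen, i.e. generalize across tasks.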