THESIS
2023
1 online resource (xv, 118 pages) : illustrations (some color)
Abstract
Understanding human languages requires the ability to reason about rich commonsense
knowledge concerning everyday concepts and events. Recent advances have been made in
leveraging linguistic pattern-based methods for automatic commonsense knowledge
acquisition. Compared with crowdsourced annotation, these methods can significantly
reduce human labeling effort, but they still have several limitations when scaled up
across different scenarios.
In this thesis, we investigate ways to improve pattern-based knowledge extraction
at scale. First, we confirm the inherent low-recall issue of pattern-based methods on
hypernymy prediction tasks and propose a complementary framework that uses
contextualized representations to supplement the semantic information that patterns
miss. Experimental results demonstrate the superiority of this approach on term pairs
that are not covered by patterns.
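For illustration, a minimal sketch of the recall problem: a Hearst-style pattern
matcher only fires when a term pair co-occurs in one of a handful of surface
templates, so true pairs never seen in such contexts are missed. The pattern list
and example sentences below are illustrative stand-ins, not the thesis's actual
pattern inventory.

    import re

    # A few classic Hearst patterns (illustrative subset, single-word terms only).
    HEARST_PATTERNS = [
        r"(?P<hyper>\w+) such as (?P<hypo>\w+)",
        r"(?P<hypo>\w+) and other (?P<hyper>\w+)",
        r"(?P<hyper>\w+) including (?P<hypo>\w+)",
    ]

    def extract_hypernym_pairs(sentence):
        """Return (hyponym, hypernym) pairs matched by any pattern."""
        pairs = []
        for pattern in HEARST_PATTERNS:
            for m in re.finditer(pattern, sentence.lower()):
                pairs.append((m.group("hypo"), m.group("hyper")))
        return pairs

    print(extract_hypernym_pairs("Fruits such as apples are healthy."))
    # -> [('apples', 'fruits')]
    print(extract_hypernym_pairs("Apples are a kind of fruit people enjoy."))
    # -> []  (a true pair, but no pattern fires: the low-recall case that
    #         contextualized representations are meant to cover)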
Next, we argue that patterns cannot be easily generalized across languages, and that
creating high-quality annotated benchmarks is time-consuming, especially for
low-resource languages. We explore different cross-lingual and multilingual training
paradigms and find that meta-learning can effectively transfer knowledge from
high-resource languages to low-resource ones.
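As a rough sketch of the meta-learning idea (not the thesis's exact algorithm), the
Reptile-style loop below adapts a model on a sampled "language task" and then moves
the meta-parameters toward the adapted weights, so the shared initialization
accumulates knowledge that transfers to low-resource languages. The model and data
here are synthetic toys.

    import copy
    import torch
    from torch import nn

    # Reptile-style meta-learning (a simple first-order relative of MAML).
    model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
    meta_lr, inner_lr, inner_steps = 0.1, 0.01, 5

    def sample_language_task():
        """Stand-in for sampling one language's small labeled support set."""
        x = torch.randn(32, 16)
        y = torch.randint(0, 2, (32,))
        return x, y

    for meta_step in range(100):
        # Inner loop: adapt a copy of the model on one sampled task.
        adapted = copy.deepcopy(model)
        opt = torch.optim.SGD(adapted.parameters(), lr=inner_lr)
        x, y = sample_language_task()
        for _ in range(inner_steps):
            opt.zero_grad()
            loss = nn.functional.cross_entropy(adapted(x), y)
            loss.backward()
            opt.step()
        # Outer update: move meta-weights toward the adapted weights.
        with torch.no_grad():
            for p_meta, p_task in zip(model.parameters(), adapted.parameters()):
                p_meta += meta_lr * (p_task - p_meta)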
Furthermore, extending general patterns to specific domains such as e-commerce is
infeasible: e-commerce commonsense about user shopping intentions is not explicitly
stated in product metadata, but it can be mined from vast amounts of user interaction
behavior. We propose a novel framework that distills intention knowledge by explaining
co-purchase behaviors with the help of large language models and human-in-the-loop
annotation. Intrinsic and extrinsic evaluations demonstrate the effectiveness of the
proposed framework.
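As an illustrative sketch only (the model name, prompt wording, and OpenAI client
are stand-ins; the thesis does not prescribe this backend), one way to elicit
candidate intention knowledge from a co-purchase record:

    from openai import OpenAI  # any chat-style LLM client would do here

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def explain_copurchase(product_a, product_b):
        """Ask an LLM why a user might buy both products together; the raw
        explanation is only a candidate intention, to be filtered by
        human-in-the-loop annotation before entering the knowledge base."""
        prompt = (
            f"A customer bought both '{product_a}' and '{product_b}'. "
            "In one sentence, state the most likely shopping intention "
            "behind purchasing these two items together."
        )
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # illustrative choice, not the thesis's model
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    # e.g. explain_copurchase("camping tent", "portable stove")
    # might yield: "The customer is preparing for an outdoor camping trip."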
After harvesting large-scale structured commonsense knowledge, how to better
incorporate it into downstream tasks becomes crucial. Considering the high-order
information stored in the knowledge graph, we propose injecting complex commonsense
knowledge obtained from random-walk paths into pretrained language models such as
BERT. We design advanced masking strategies and new training objectives for effective
knowledge fusion.
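A minimal sketch of the path-sampling side (the toy graph, walk length, and
verbalization are assumptions; the thesis's actual masking strategies and objectives
are more involved): sample random-walk paths over a commonsense knowledge graph,
verbalize them, and mask the relation tokens so the language model must recover the
injected links.

    import random

    # Toy commonsense KG: node -> list of (relation, neighbor) edges.
    KG = {
        "rain": [("causes", "wet ground"), ("related to", "umbrella")],
        "wet ground": [("causes", "slipping")],
        "umbrella": [("used for", "staying dry")],
    }

    def random_walk(start, length=3):
        """Sample a (node, relation, node, ...) path from the KG."""
        path, node = [start], start
        for _ in range(length):
            edges = KG.get(node)
            if not edges:
                break
            rel, nxt = random.choice(edges)
            path += [rel, nxt]
            node = nxt
        return path

    def mask_relations(path, mask_token="[MASK]"):
        """Mask relation positions (odd indices) so an MLM-style objective
        must predict the high-order links encoded by the walk."""
        return [mask_token if i % 2 == 1 else tok for i, tok in enumerate(path)]

    walk = random_walk("rain")
    print(" ".join(walk))                   # e.g. rain causes wet ground causes slipping
    print(" ".join(mask_relations(walk)))   # e.g. rain [MASK] wet ground [MASK] slipping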
Lastly, we revisit the evaluation of knowledge fusion on natural language
understanding tasks and find that fusing even incorrect or random knowledge can
achieve comparable or better performance, which calls for fairer and more faithful
evaluations in the future.
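As a hedged illustration of the control condition behind this finding (the triple
format and corruption scheme are assumptions, not the thesis's exact setup):
swapping in shuffled triples before fusion lets one test whether downstream gains
actually depend on the knowledge being correct.

    import random

    def corrupt_triples(triples, seed=0):
        """Build a 'random knowledge' control by shuffling tail entities, so
        triples keep their shape but (mostly) lose their truth value."""
        rng = random.Random(seed)
        tails = [t for _, _, t in triples]
        rng.shuffle(tails)
        return [(h, r, t) for (h, r, _), t in zip(triples, tails)]

    gold = [("rain", "causes", "wet ground"),
            ("umbrella", "used for", "staying dry"),
            ("sun", "causes", "warmth")]
    # If a model fused with corrupt_triples(gold) scores on par with one fused
    # with gold, the evaluation is not actually measuring knowledge quality.
    print(corrupt_triples(gold))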