Improving humour recognition using mental lexicons, world knowledge, and joke structure


by Cattle, Andrew Grant

THESIS 2018

Ph.D. Computer Science and Engineering

1 volume (unpaged) : illustrations ; 30 cm

Abstract

As natural language interfaces become more prevalent, the ability of computers to both understand and create humour becomes more important. Humour is a ubiquitous part of human communication. It can be used to make oneself more likeable, to defuse a tense situation, or purely for entertainment. As digital virtual assistants such as Alexa, Cortana, Google Assistant, and Siri become more human-like, they will increasingly need to recognise, interpret, and even produce humour effectively.

What makes humour such an exciting challenge is that it requires not only linguistic dexterity but also world and domain knowledge. Syntax, phonology, and semantics all play a role in making a joke funny. However, existing humour recognition work has typically taken a fairly basic view of joke semantics, structure, and world knowledge, treating jokes as unordered bags-of-words and simply computing word embedding similarities between all word pairs. This bears little resemblance to the way humans actually interpret humour.

This thesis addresses these shortcomings in three ways. First, we motivate the use of a semantic relatedness measure based on word associations for better capturing joke semantics. We present evidence that word associations outperform Word2Vec similarity on both humour classification and humour ranking tasks across several datasets. Because word associations focus on relatedness rather than similarity, they offer increased flexibility and can capture weaker, more tangential relationships between concepts. Word associations also better represent the way humans store their mental lexicons. We experiment with extracting word association features using both a graph-based method, which is efficient to calculate but suffers from coverage issues, and a more sophisticated word association strength prediction model, which is capable of predicting association strengths between arbitrary word pairs.

Second, we experiment with adding world knowledge to our humour recognition system through the inclusion of ConceptNet-derived features. ConceptNet is a commonsense knowledge base capable of representing complex real-world relationships between concepts, relationships which are unlikely to be captured by more conventional knowledge representation features like word embeddings.

Finally, we explore the usefulness of humour anchors for incorporating joke structure. Specifically, we utilise automatic humour anchor extraction as a form of setup/punchline annotation and use this information to help target semantic features.


Copyrighted to the author. Reproduction is prohibited without the author’s prior written consent.

Details

Collection: HKUST Electronic Theses
Degree: Ph.D.
Department: Computer Science and Engineering
Supervisors: Ma, Xiaojuan
Authors: Cattle, Andrew Grant
Subjects: Natural language processing (Computer science); Speech perception; Computational linguistics; Wit and humor Data processing
Language: English
Call number: Thesis CSED 2018 Cattle
DOI: 10.14711/thesis-991012671057703412
