With the expanding use of social media platforms such as Twitter and the growing amount of text data generated online, hate speech and toxic language have been shown to negatively affect individuals in general, and marginalized communities in particular. To improve the online moderation process, there is an increasing need for accurate detection tools that do not simply flag bad words but help filter out toxic content in a more nuanced fashion. A problem of central importance is therefore to acquire better-quality data on which to train toxic content detection models. However, the absence of a universal definition of hate speech makes the collection process hard and leaves the training corpora sparse, imbalanced, and challenging for current machine learning techniques. In this thesis, we address the problem of automatic toxic content detection along three main axes: (1) the construction of resources that robust toxic language and hate speech detection systems currently lack, (2) the study of bias in hate speech and toxic language classifiers, and (3) the assessment of the inherent toxicity and harmful biases of NLP systems by probing the Large Pre-trained Language Models (PTLMs) at their core.
In order to train a multi-cultural, fine-grained hate speech and toxic content detection system,
we have built a new multi-aspect hate speech dataset in English, French, and Arabic. We also
provide a detailed annotation scheme, which indicates (a) whether a tweet is direct or indirect;
(b) whether it is offensive, disrespectful, hateful, fearful out of ignorance, abusive, or normal; (c)
the attribute based on which it discriminates against an individual or a group of people; (d) the
name of this group; and (e) how annotators feel about the tweet, on a range of negative to neutral sentiments. We define a classification task for each labeled aspect and use multi-task learning to investigate how such a paradigm can improve the detection process.
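As an illustration of this multi-task setup, the following minimal PyTorch sketch shares a single transformer encoder across several aspect-specific classification heads. The aspect names, label counts, and encoder choice are illustrative assumptions, not the exact configuration used in the thesis.

```python
# A minimal multi-task sketch: one shared encoder, one classification head per
# annotated aspect. The aspect names, label counts, and encoder below are
# illustrative placeholders, not the exact configuration used in the thesis.
import torch.nn as nn
from transformers import AutoModel

ASPECTS = {                      # hypothetical label counts per aspect
    "directness": 2,             # direct vs. indirect
    "hostility": 6,              # offensive, disrespectful, hateful, fearful, abusive, normal
    "target_attribute": 5,       # e.g. origin, gender, religion, ...
    "target_group": 10,
    "annotator_sentiment": 4,    # negative-to-neutral sentiment range
}

class MultiAspectClassifier(nn.Module):
    def __init__(self, encoder_name="bert-base-multilingual-cased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # one linear head per aspect, all sharing the same encoder parameters
        self.heads = nn.ModuleDict(
            {name: nn.Linear(hidden, n) for name, n in ASPECTS.items()}
        )

    def forward(self, input_ids, attention_mask):
        output = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = output.last_hidden_state[:, 0]        # [CLS] representation
        return {name: head(cls) for name, head in self.heads.items()}

# Training would sum the per-aspect cross-entropy losses over a shared batch,
# e.g. loss = sum(F.cross_entropy(logits[a], labels[a]) for a in ASPECTS).
```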
Unsurprisingly, when testing the detection system, the imbalanced data, along with implicit toxic content and misleading instances, results in both false positives and false negatives. We examine misclassified instances caused by the frequently neglected yet deep-rooted selection bias introduced by the data collection process. In contrast to work on bias, which typically focuses on classification performance, we investigate another source of bias and present two language- and label-agnostic evaluation metrics based on topic models and semantic similarity measures to evaluate the extent of this problem across various datasets. Furthermore, since research in this area generally focuses on English and overlooks other languages, we observe a gap in content moderation across languages and cultures, especially in low-resource settings. Hence, we leverage the observed differences and correlations across languages, datasets, and annotation schemes to carry out a study of multilingual toxic language data and of how people react to it.
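To make the idea of a language- and label-agnostic comparison concrete, here is a small sketch that measures how semantically close two corpora are by comparing sentence-embedding centroids. The embedding model and the centroid-cosine measure are illustrative stand-ins, not the thesis's actual topic-model and similarity metrics.

```python
# A rough sketch of a label-agnostic comparison between two corpora: embed each
# text, average the embeddings per corpus, and compare the centroids. The model
# name and the centroid-cosine measure are illustrative assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

def corpus_centroid(texts):
    """Embed each text and return the mean (centroid) embedding."""
    embeddings = model.encode(texts, convert_to_numpy=True)
    return embeddings.mean(axis=0)

def corpus_similarity(corpus_a, corpus_b):
    """Cosine similarity between corpus centroids (close to 1.0 = very similar content)."""
    a, b = corpus_centroid(corpus_a), corpus_centroid(corpus_b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Usage: an unexpectedly high similarity between two independently collected
# hate speech datasets can hint at shared keyword-based sampling, i.e. a common
# selection bias, without ever looking at the labels.
# print(corpus_similarity(tweets_dataset_1, tweets_dataset_2))
```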
Finally, social media posts are part of the training data of Large Pre-trained Language Models (PTLMs), which are at the center of all major NLP systems nowadays. Despite their incontestable usefulness and effectiveness, PTLMs have been shown to carry and reproduce harmful biases due, among other reasons, to the sources of their training data. We propose a methodology to probe the potentially toxic content they convey with respect to a set of templates, and we report how often they enable toxicity towards specific communities in English, French, and Arabic.
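The template-based probing idea can be sketched with a masked language model: fill a slot in a template that mentions a community and count how often the model's top completions fall in a toxic lexicon. The templates, the toy lexicon, and the English-only model below are placeholder assumptions for illustration only.

```python
# A hedged sketch of template-based toxicity probing of a masked PTLM.
# Templates, the tiny toxic-word list, and the model are illustrative only.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

TEMPLATES = [
    "Why are {group} so [MASK]?",
    "All {group} are [MASK].",
]
TOXIC_WORDS = {"stupid", "lazy", "dangerous", "violent"}  # toy lexicon

def probe(group, top_k=10):
    """Return the fraction of top-k completions that land in the toxic lexicon."""
    hits, total = 0, 0
    for template in TEMPLATES:
        predictions = fill_mask(template.format(group=group), top_k=top_k)
        hits += sum(p["token_str"].strip().lower() in TOXIC_WORDS for p in predictions)
        total += top_k
    return hits / total

# print(probe("immigrants"))
# For French and Arabic, the same procedure would use translated templates and
# language-specific or multilingual masked language models.
```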
The results presented in this thesis show that, despite the complexity of such tasks, there are promising paths to explore in order to improve the automatic detection, evaluation, and, eventually, mitigation of toxic content in NLP.