Title

The Automatic Multi-Class Classification of Hate Speech on Twitter

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Computer Science and Info Sys

Date of Award

Fall 2021

Abstract

The spread of hate speech on social media is a growing concern. In response, scholars from disciplines such as sociology, legal studies, and computer science have attempted to define, analyze, and detect hate speech. However, hate speech has not been adequately analyzed as a sociolinguistic phenomenon. The aim of this study is therefore to shed more light on hate speech as a sociolinguistic concept. To achieve this goal, the study proceeds in three main phases. First, it combines speech act theory with existing academic and non-academic definitions of hate speech to propose a more comprehensive definition. Using the new definition, the study proposes a fine-grained taxonomy of hate speech and identifies the main components that distinguish hate speech from general profanity. In the second phase, two hate speech datasets are created and an annotation scheme is developed based on the proposed taxonomy. Finally, using the annotated hate speech dataset, several multi-class, multi-label classification experiments are conducted to investigate the impact of the new annotation framework.

Advisor

Omar EL Ariss

Subject Categories

Computer Sciences | Physical Sciences and Mathematics
