The Automatic Multi-Class Classification of Hate Speech on Twitter
Document Type
Thesis
Degree Name
Master of Science (MS)
Department
Computer Science and Information Systems
Date of Award
Fall 2021
Abstract
The spread of hate speech on social media is a growing concern. To address it, scholars from disciplines such as sociology, legal studies, and computer science have attempted to define, analyze, and detect hate speech. However, hate speech has not been adequately analyzed as a sociolinguistic phenomenon. The aim of this study is therefore to shed more light on hate speech as a sociolinguistic concept. To achieve this goal, the study proceeds in three main phases. First, it combines the theory of speech acts with existing academic and non-academic definitions of hate speech to propose a more comprehensive definition. Using the new definition, the study proposes a fine-grained taxonomy of hate speech and identifies the main components of hate speech that distinguish it from general profanity. In the second phase, two hate speech datasets are created and an annotation scheme is developed based on the proposed taxonomy. Finally, using the annotated hate speech dataset, several multi-class, multi-label classification experiments are conducted to investigate the impact of the new annotation framework.
Advisor
Omar EL Ariss
Subject Categories
Computer Sciences | Physical Sciences and Mathematics
Recommended Citation
Razzaghi, Masoumeh, "The Automatic Multi-Class Classification of Hate Speech on Twitter" (2021). Electronic Theses & Dissertations. 539.
https://digitalcommons.tamuc.edu/etd/539