Evaluation of Different Machine Learning and Deep Learning Techniques for Hate Speech Detection

Abstract

Detecting online hate speech is important for creating safer online spaces. In this paper, we evaluate the performance of several machine learning (ML) and deep learning (DL) models in detecting hate speech on three different datasets. The traditional ML algorithms we evaluate are Support Vector Machines (SVM), Naive Bayes, Decision Trees, Random Forests, and Logistic Regression. The deep learning models we evaluate are Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM) networks, and the pre-trained BERT transformer model. Our experiments show that BERT outperformed all other models, with an F1-score of 90.6% on one dataset and 89.7% and 88.2% on the other two. CNN and LSTM ranked next, outperforming the traditional ML algorithms with F1-scores over 80% on all three datasets. Among the traditional ML models, SVM performed best, with a highest F1-score of 75.6%.
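
Because every result above is reported as an F1-score, a short sketch of how that metric is derived from a model's predictions may help. The helper `f1_score`, the label convention (1 = hate speech, 0 = not), and the toy labels below are illustrative assumptions, not taken from the paper, and the abstract does not state which averaging scheme (binary, macro, or weighted) the authors used.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for the `positive` class (hypothetical helper, not the paper's code)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    if tp == 0:
        # No correct positive predictions: precision or recall is zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy labels, invented for illustration: 1 = hate speech, 0 = not hate speech.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

score = f1_score(y_true, y_pred)
print(f"{score:.3f}")  # prints 0.750
```

Here three positives are found, one is missed, and one negative is flagged by mistake, so precision and recall are both 0.75 and so is their harmonic mean.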

Department(s)

Computer Science

Document Type

Conference Proceeding

DOI

10.1145/3603287.3651218

Keywords

BERT, deep learning, hate speech, machine learning, text classification

Publication Date

4-18-2024

Journal Title

Proceedings of the 2024 ACM Southeast Conference (ACMSE 2024)
