Efficient Hate Speech Detection: Evaluating 38 Models from Traditional Methods to Transformers
Abstract
The proliferation of hate speech on social media necessitates automated detection systems that balance accuracy with computational efficiency. This study evaluates 38 model configurations for hate speech detection on datasets ranging from 6.5K to 451K samples. We analyze transformer architectures (e.g., BERT, RoBERTa, DistilBERT), deep neural networks (e.g., CNN, LSTM, GRU, Hierarchical Attention Networks), and traditional machine learning methods (e.g., SVM, CatBoost, Random Forest). Our results show that transformers, particularly RoBERTa, consistently achieve superior performance, with accuracy and F1-scores exceeding 90%. Among deep learning approaches, Hierarchical Attention Networks yield the best results, while traditional methods such as CatBoost and SVM remain competitive, achieving F1-scores above 88% at significantly lower computational cost. Additionally, our analysis highlights the importance of dataset characteristics: models trained on balanced, moderately sized, unprocessed datasets outperform those trained on larger, preprocessed datasets. These findings offer valuable insights for developing efficient and effective hate speech detection systems.
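The abstract reports results as accuracy and F1-score. As a reminder of how these two metrics are defined for a binary hate / not-hate task, the following is a minimal sketch in plain Python. The function name `accuracy_and_f1`, the label encoding (1 = hate speech), and the sample labels are illustrative assumptions; the abstract does not state which F1 averaging the authors used, so this shows only the positive-class F1.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels.

    `positive` marks the class treated as hate speech. The encoding is an
    assumption made for illustration, not taken from the paper.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs) if pairs else 0.0
    # F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Made-up labels, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.2f} f1={f1:.2f}")
```

On imbalanced data the two numbers can diverge sharply, which is one reason studies of this kind usually report both.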
Department(s)
Computer Science
Document Type
Conference Proceeding
DOI
10.1145/3696673.3723061
Keywords
Deep Learning, Hate Speech Detection, Machine Learning, Natural Language Processing, RoBERTa, Transformer-based Classification
Publication Date
5-8-2025
Recommended Citation
Abusaqer, Mahmoud; Saquer, Jamil M.; and Shatnawi, Hazim, "Efficient Hate Speech Detection: Evaluating 38 Models from Traditional Methods to Transformers" (2025). Faculty Scholarship. 148.
https://bearworks.missouristate.edu/articles00/148
Journal Title
ACMSE 2025: Proceedings of the 2025 ACM Southeast Conference