A Comparative Analysis of Transformer and Traditional ML Models for Cyberbullying Detection on Twitter (now X)

Abstract

Cyberbullying on social media poses critical risks to mental health and public safety. This paper investigates advanced computational approaches for detecting and classifying cyberbullying in a large Twitter dataset of 47,692 tweets, labeled into six categories. We compare three transformer-based models (GPT-3.5, BERT, and RoBERTa) against three traditional machine learning algorithms (Naïve Bayes, SVM, and Random Forest), evaluating accuracy, precision, recall, F1-score, and computational efficiency. Our results indicate that RoBERTa achieves the highest overall performance (87-88% accuracy) but at a higher computational cost (13 hours on CPU), while Random Forest offers a strong balance between speed and performance (85.36% accuracy in 83 seconds). In contrast, the experiment using GPT-3.5 in a batched, zero-shot configuration achieved an accuracy of 25.41%, an F1-score of 23.61%, and an elapsed time of 5.14 hours, highlighting the challenges of applying generative models to cyberbullying detection without fine-tuning. These findings inform model selection for real-world deployment of cyberbullying detection systems and illuminate the trade-offs between transformer-based and traditional methods for automated social media monitoring.
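
The evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `macro_scores` is hypothetical, and macro averaging (an unweighted mean over classes) is an assumption, since the abstract does not state how per-class scores were combined across the six categories.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Assumption: macro averaging (unweighted mean over classes). The
    abstract does not say which averaging scheme the paper used.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1          # correct prediction for this class
        else:
            fp[pred] += 1           # predicted class gains a false positive
            fn[truth] += 1          # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }
```

Calling `macro_scores(true_labels, predicted_labels)` on any two equal-length label sequences returns the four scores as a dictionary. Wrapping a model's training and prediction calls with `time.perf_counter()` would give a comparable elapsed-time figure.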

Department(s)

Computer Science

Document Type

Conference Proceeding

DOI

10.1109/COMPSAC65507.2025.00216

Keywords

cyberbullying detection, deep learning, machine learning, natural language processing, text classification, transformers

Publication Date

1-1-2025

Journal Title

Proceedings - 2025 IEEE 49th Annual Computers, Software, and Applications Conference, COMPSAC 2025
