Vol. 2 No. 1 (2022): Issue 2

Improving the Applicability of Social Media Toxic Comments Prediction Across Diverse Data Platforms Using Residual Self-Attention-Based LSTM Combined with Transfer Learning

Jiahuai Ma
Department of Computer & Information Science, University of Florida, Florida, 32608, USA
Zhaoyan Zhang
Beijing Kwai Technology Co., Ltd, Beijing, 100085, China
Kaixian Xu
Risk & Quant Analytics, BlackRock, New Jersey, 10001, USA
Yu Qiao
Khoury College of Computer Sciences, Northeastern University, Washington, 98109, USA

Published 2022-08-10

Keywords

  • Social media toxic comments prediction
  • Transfer learning
  • Machine learning
  • Residual connection

How to Cite

Ma, J., Zhang, Z., Xu, K., & Qiao, Y. (2022). Improving the Applicability of Social Media Toxic Comments Prediction Across Diverse Data Platforms Using Residual Self-Attention-Based LSTM Combined with Transfer Learning. Optimizations in Applied Machine Learning, 2(1). Retrieved from https://journal.mri-pub.com/index.php/OAML/article/view/79

Abstract

Toxic comments on social media, often involving hate speech and insults, pose significant challenges for online safety. While many studies have focused on detecting toxic comments within a single platform, cross-platform toxicity prediction remains underexplored. This task is particularly challenging due to linguistic differences and varying user behaviors across platforms, which reduce the effectiveness of models trained on one dataset when applied to another. To address these challenges, this paper proposes a Residual Self-Attention-Based LSTM framework with transfer learning. The model is first trained on a large source dataset (Twitter) and then fine-tuned on a smaller target dataset (YouTube). Residual connections ensure smooth gradient flow, while self-attention captures critical contextual features. Transfer learning enables the model to adapt to platform-specific nuances without retraining from scratch. Experiments show that the proposed approach significantly improves generalization across platforms, achieving higher precision, recall, and F1-scores compared to baseline methods. These results highlight the potential of combining advanced deep learning techniques with transfer learning for cross-platform toxicity detection, providing a foundation for future research in this area.
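
The residual self-attention step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the LSTM has already produced one hidden-state vector per token, uses identity query/key/value projections in place of learned weights, and all names (`hidden_states`, `self_attention`, `residual_self_attention`) are hypothetical. It only shows how attention output is added back onto its input so the original signal, and its gradient path, is preserved.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(states):
    """Scaled dot-product self-attention over a sequence of vectors.

    Assumption: queries, keys, and values are the states themselves
    (identity projections); a trained model would learn these projections.
    Returns (contexts, weights), where weights[i][j] is how much
    time step i attends to time step j.
    """
    dim = len(states[0])
    scale = math.sqrt(dim)
    contexts, weights = [], []
    for query in states:
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in states]
        row = softmax(scores)
        weights.append(row)
        # Context vector: attention-weighted average of all states.
        contexts.append([sum(w * value[j] for w, value in zip(row, states))
                         for j in range(dim)])
    return contexts, weights


def residual_self_attention(states):
    """Add the attention output back onto its input (the residual path)."""
    contexts, weights = self_attention(states)
    outputs = [[h + c for h, c in zip(state, ctx)]
               for state, ctx in zip(states, contexts)]
    return outputs, weights


# Hypothetical stand-in for LSTM hidden states: 3 time steps, 2 features.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = residual_self_attention(hidden_states)
```

In a transfer-learning setting such as the one the abstract describes, a layer of this kind would be trained on the source data (Twitter) and then fine-tuned on the target data (YouTube). The sketch has no trainable parameters and so covers only the forward computation.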
