Light weight recommendation system for social networking analysis using a hybrid BERT-SVM classifier algorithm

Nallichery Subramanian Kiruthika, Ganapathy Thailambal

2022, Volume 22, Number 4 (July-August)

ISSN 2226-1494 (print), ISSN 2500-0373 (online)

doi: 10.17586/2226-1494-2022-22-4-769-778

Article in English

For citation:

Kiruthika N.S., Thailambal G. Light weight recommendation system for social networking analysis using a hybrid BERT-SVM classifier algorithm. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2022, vol. 22, no. 4, pp. 769–778. doi: 10.17586/2226-1494-2022-22-4-769-778

Abstract

Social media platforms such as Twitter, Instagram, and Facebook have facilitated mass communication and connection. As these platforms have developed and advanced, the spread of fake news has increased. Many studies have applied machine learning algorithms to fake news detection, but existing methods face several difficulties, such as rapid propagation, access method, insignificant feature selection, and low text classification accuracy. To overcome these issues, this paper proposes a hybrid Bidirectional Encoder Representations from Transformers and Support Vector Machine (BERT-SVM) model with a recommendation system that is used to predict whether information is fake or real. The proposed model consists of three phases: preprocessing, feature selection, and classification. The dataset consists of real-time COVID-19-related data gathered from Twitter. The preprocessing stage comprises splitting, stop word removal, lemmatization, and spell correction. A Term Frequency-Inverse Document Frequency (TF-IDF) converter is used to extract the features and convert the text to binary vectors. A hybrid BERT-SVM classification model is then used to predict the data. Finally, the predicted data is compared with the preprocessed data. The proposed model is implemented in MATLAB and evaluated with several performance metrics: accuracy is 98 %, error is 2 %, precision is 99 %, specificity is 99 %, and sensitivity is 98 %. These results indicate that the proposed model is more effective than existing approaches. The proposed social networking analysis model provides effective fake news prediction that can be used to identify Twitter comments as either real or fake.

Keywords: social networking analysis, fake news detection, TF-IDF, BERT, SVM, hybrid BERT-SVM
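The abstract reports accuracy, error, precision, specificity, and sensitivity without stating how they are defined. The following is a minimal sketch in Python (not the MATLAB implementation used in the paper) that assumes the standard confusion-matrix definitions with "fake" as the positive class. The names `ConfusionCounts`, `count_outcomes`, and `compute_metrics`, the zero-denominator convention, and the ten toy labels are all hypothetical and chosen for illustration only; they are not data or code from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts, treating "fake" as the positive class."""
    tp: int  # fake items predicted as fake
    fp: int  # real items predicted as fake
    tn: int  # real items predicted as real
    fn: int  # fake items predicted as real


def count_outcomes(y_true, y_pred, positive="fake"):
    """Tally the four confusion-matrix cells from paired label sequences."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    # Assumed convention: return 0.0 when the denominator is zero.
    return numerator / denominator if denominator else 0.0


def compute_metrics(c):
    """Standard definitions, assumed here; the abstract does not spell them out."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "error": _ratio(c.fp + c.fn, total),         # complement of accuracy
        "precision": _ratio(c.tp, c.tp + c.fp),
        "specificity": _ratio(c.tn, c.tn + c.fp),    # true negative rate
        "sensitivity": _ratio(c.tp, c.tp + c.fn),    # true positive rate (recall)
    }


# Toy labels for illustration only (not taken from the paper's dataset).
y_true = ["fake", "fake", "fake", "fake", "real", "real", "real", "real", "real", "real"]
y_pred = ["fake", "fake", "fake", "real", "real", "real", "real", "real", "real", "fake"]

for name, value in compute_metrics(count_outcomes(y_true, y_pred)).items():
    print(f"{name}: {value:.3f}")
```

Under these definitions the error rate is the complement of accuracy, which is consistent with the 98 % accuracy and 2 % error reported in the abstract.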

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License