doi: 10.17586/2226-1494-2022-22-4-769-778

Light weight recommendation system for social networking analysis using a hybrid BERT-SVM classifier algorithm

N. Kiruthika, G. Thailambal

Article in English

For citation:
Kiruthika N.S., Thailambal G. Light weight recommendation system for social networking analysis using a hybrid BERT-SVM classifier algorithm. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2022, vol. 22, no. 4, pp. 769–778. doi: 10.17586/2226-1494-2022-22-4-769-778

Social media platforms such as Twitter, Instagram, and Facebook have facilitated mass communication and connection. As these platforms have developed and advanced, the spread of fake news has increased. Many studies have applied machine learning algorithms to fake news detection, but the existing methods face several difficulties, such as rapid propagation, access method, insignificant selection of features, and low text classification accuracy. To overcome these issues, this paper proposes a hybrid Bidirectional Encoder Representations from Transformers and Support Vector Machine (BERT-SVM) model with a recommendation system that is used to predict whether information is fake or real. The proposed model consists of three phases: preprocessing, feature selection, and classification. The dataset is gathered from real-time COVID-19-related data on Twitter. The preprocessing stage comprises splitting, stop word removal, lemmatization, and spell correction. A Term Frequency-Inverse Document Frequency (TF-IDF) converter is used to extract the features and convert text to binary vectors. A hybrid BERT-SVM classification model is used to predict the data. Finally, the predicted data is compared with the preprocessed data. The proposed model is implemented in MATLAB and evaluated with several performance metrics: accuracy is 98 %, error is 2 %, precision is 99 %, specificity is 99 %, and sensitivity is 98 %. These results indicate that the proposed model is more effective than existing approaches. The proposed social networking analysis model provides effective fake news prediction and can be used to identify whether Twitter comments are real or fake.
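As a rough illustration of the preprocessing and TF-IDF steps named in the abstract, the following Python sketch tokenizes a few short texts, removes stop words, and computes classic TF-IDF weights (term frequency times log(N / document frequency)). It is not the authors' MATLAB implementation: the stop-word list, the sample tweets, and the function names `preprocess` and `tfidf` are hypothetical, lemmatization and spell correction are omitted, and the BERT-SVM classifier itself is not shown.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"the", "is", "a", "an", "of", "and", "to", "in", "for"}


def preprocess(text):
    """Lowercase, split into alphanumeric tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tfidf(docs):
    """Return (sorted vocabulary, one TF-IDF vector per document)."""
    tokenized = [preprocess(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty documents
        vectors.append(
            [(counts[t] / total) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, vectors


# Made-up sample texts, not taken from the paper's dataset.
tweets = [
    "The vaccine is safe and effective",
    "The vaccine contains a microchip",
    "Masks reduce the spread of the virus",
]
vocab, vectors = tfidf(tweets)
```

Each row of `vectors` could then be passed to a classifier. Terms that occur in every document receive a weight of zero under this formula, which is why library implementations often use a smoothed variant.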

Keywords: social networking analysis, fake news detection, TF-IDF, BERT, SVM, hybrid BERT-SVM
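The performance metrics reported in the abstract (accuracy, error, precision, specificity, sensitivity) are standard quantities derived from a binary confusion matrix. The Python sketch below shows the usual definitions, assuming "fake" is treated as the positive class. The function name `binary_metrics` and the counts in the example are hypothetical and do not reproduce the paper's results.

```python
def binary_metrics(tp, tn, fp, fn):
    """Compute standard binary classification metrics from confusion counts.

    tp/fn: positive samples classified correctly/incorrectly.
    tn/fp: negative samples classified correctly/incorrectly.
    Assumes each denominator below is non-zero.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    return {
        "accuracy": accuracy,
        "error": 1 - accuracy,          # misclassification rate
        "precision": tp / (tp + fp),    # share of predicted positives that are correct
        "sensitivity": tp / (tp + fn),  # recall, true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical counts, chosen only to exercise the formulas.
metrics = binary_metrics(tp=90, tn=85, fp=15, fn=10)
```

Accuracy and error always sum to one, which is consistent with the 98 % and 2 % figures quoted in the abstract.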



This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
Copyright 2001-2022 ©
Scientific and Technical Journal
of Information Technologies, Mechanics and Optics.
All rights reserved.