doi: 10.17586/2226-1494-2023-23-6-1143-1151


Emotion analysis of social network data using cluster based probabilistic neural network with data parallelism

S. Starlin Jini, N. Chenthalir Indra


Article in English

For citation:
Starlin Jini S., Chenthalir Indra N. Emotion analysis of social network data using cluster based probabilistic neural network with data parallelism. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2023, vol. 23, no. 6, pp. 1143–1151. doi: 10.17586/2226-1494-2023-23-6-1143-1151


Abstract
Social media contains a huge amount of data that various organizations use to study people’s emotions, thoughts and opinions. Users often combine words with emoticons and emojis to express their opinions on a topic. Research on emotion identification from text is still in its infancy. Few emotion-annotated corpora are available today, because the annotation task is complex and the resulting human annotations are often inconsistent. Numerous studies have addressed these problems, but the methods proposed so far have been unable to classify emotions in a simple and cost-effective manner. To solve these problems, an efficient clustering-based classification of emotions in records is proposed. A dataset of social media posts is pre-processed to remove unwanted elements and then clustered. Semantic and emotional features are selected to improve classification efficiency. To reduce computation time and increase the efficiency of predicting emotion probabilities, data parallelism is introduced into the classifier. The proposed model is tested in MATLAB. It achieves 92 % accuracy on the annotated dataset and 94 % accuracy on the WASSA-2017 dataset. Its performance is compared with existing methods such as the Parallel K-Nearest Neighbor and Parallel Naive Bayes models. The comparison shows that the proposed model predicts emotions more effectively than these existing models.

Keywords: emotions, clustering, feature extraction, probabilistic neural network, data parallelism
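
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of a probabilistic neural network evaluated with data parallelism. It is not the authors' MATLAB implementation. The function names, the two-dimensional toy features, the emotion labels, the Gaussian kernel width `sigma` and the use of a thread pool are all assumptions made for illustration. Each class score is the average Gaussian kernel response over that class's training vectors, and the query batch is split into shards that are classified independently.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def pnn_probabilities(x, train, sigma=0.5):
    """Return class probabilities for one feature vector x.

    train maps each class label to a list of training vectors
    (hypothetical toy features, not the paper's feature set).
    """
    scores = {}
    for label, vectors in train.items():
        # Pattern layer: one Gaussian kernel per training vector.
        kernels = [
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, v)) / (2 * sigma ** 2))
            for v in vectors
        ]
        # Summation layer: average kernel response for this class.
        scores[label] = sum(kernels) / len(kernels)
    total = sum(scores.values())
    if total == 0.0:
        # Every kernel underflowed; fall back to a uniform distribution.
        return {label: 1.0 / len(scores) for label in scores}
    # Output layer: normalise the class scores into probabilities.
    return {label: s / total for label, s in scores.items()}


def classify_shard(shard, train, sigma):
    """Classify every vector in one shard with the same model."""
    labels = []
    for x in shard:
        probs = pnn_probabilities(x, train, sigma)
        labels.append(max(probs, key=probs.get))
    return labels


def classify_parallel(queries, train, sigma=0.5, workers=2):
    """Data parallelism: split the queries into shards, classify each
    shard in its own worker, then join the results in input order."""
    size = max(1, math.ceil(len(queries) / workers))
    shards = [queries[i:i + size] for i in range(0, len(queries), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda s: classify_shard(s, train, sigma), shards)
        return [label for shard_labels in results for label in shard_labels]


if __name__ == "__main__":
    toy_train = {
        "joy": [[0.9, 0.1], [0.8, 0.2]],
        "anger": [[0.1, 0.9], [0.2, 0.8]],
    }
    toy_queries = [[0.85, 0.15], [0.15, 0.85], [0.7, 0.3]]
    print(classify_parallel(toy_queries, toy_train))
```

Threads keep the sketch short, but in CPython they will not speed up this CPU-bound loop. For an actual reduction in computation time, the shards would typically go to separate processes or machines, for example by swapping in `ProcessPoolExecutor` with a module-level worker function.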




This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
Copyright 2001-2024 ©
Scientific and Technical Journal
of Information Technologies, Mechanics and Optics.
All rights reserved.
