doi: 10.17586/2226-1494-2021-21-2-234-240


Applying bagging in finding network traffic anomalies

B. T. Rzayev, I. S. Lebedev


Article in Russian

For citation:

Rzayev B.T., Lebedev I.S. Applying bagging in finding network traffic anomalies. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2021, vol. 21, no. 2, pp. 234–240 (in Russian). doi: 10.17586/2226-1494-2021-21-2-234-240



Abstract

The authors consider approaches to identifying anomalous situations in information and telecommunication systems, based on artificial intelligence methods that analyze statistical information on traffic packets in various modes and states. We propose a method for detecting an anomalous situation from tuples of network traffic packet values by applying bagging to machine learning classification algorithms. The network traffic is treated as a set of tuples of packet parameters distributed over the sampling time. In contrast to existing methods, the proposed method does not require special data preparation; the errors that individual classification algorithms make when classifying tuples of packet values are averaged out by “collective” voting of the classifiers. This approach to increasing accuracy makes it possible to use classification algorithms optimized for different types of events and anomalies and trained on different training samples in the form of tuples of network packet parameters. The difference between the algorithms is achieved by introducing an imbalance into the training sets. We describe an experiment conducted using the Naïve Bayes, Hoeffding Tree, J48, Random Forest, Random Tree and REP Tree machine learning classification algorithms. The evaluation was performed on the open NSL-KDD dataset while searching for parasitic traffic. The paper presents the evaluation results for each classifier individually and for the bagged ensemble of classifiers. The method can be used in information security monitoring systems to analyze network traffic. A distinctive feature of the proposed solution is that it can be scaled and extended by adding new machine learning classification algorithms. In the future, the composition of the classification algorithms can be changed during operation, which will improve the accuracy of identifying potentially destructive impacts.
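The voting scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the one-feature threshold rule stands in for the real base classifiers (Naïve Bayes, J48, REP Tree and so on), the two-value packet tuples are synthetic rather than drawn from NSL-KDD, and the names `train_stump`, `imbalanced_bootstrap`, `train_ensemble`, `vote` and the `anomaly_shares` parameter are all hypothetical. Each ensemble member is trained on a resample with a different share of anomalous tuples, and the final label is chosen by majority vote.

```python
import random
from collections import Counter


def train_stump(samples):
    """Fit a one-feature threshold rule; a toy stand-in for a real base classifier."""
    best = None
    for f in range(len(samples[0][0])):
        for x, _ in samples:
            t = x[f]
            for label_above in (0, 1):
                # Count tuples misclassified by the rule "x[f] > t -> label_above".
                errors = sum(
                    (label_above if xi[f] > t else 1 - label_above) != yi
                    for xi, yi in samples
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, label_above)
    _, f, t, label_above = best
    return lambda x: label_above if x[f] > t else 1 - label_above


def imbalanced_bootstrap(samples, anomaly_share, size, rng):
    """Resample with replacement so that roughly anomaly_share of tuples are anomalous."""
    normal = [s for s in samples if s[1] == 0]
    anomalous = [s for s in samples if s[1] == 1]
    # Keep at least one tuple of each class in every training set.
    n_anom = min(size - 1, max(1, round(size * anomaly_share)))
    return ([rng.choice(anomalous) for _ in range(n_anom)]
            + [rng.choice(normal) for _ in range(size - n_anom)])


def train_ensemble(samples, anomaly_shares, size, seed=0):
    """Train one classifier per (hypothetical) imbalance level."""
    rng = random.Random(seed)
    return [train_stump(imbalanced_bootstrap(samples, share, size, rng))
            for share in anomaly_shares]


def vote(ensemble, x):
    """Majority ("collective") vote; use an odd ensemble size to avoid ties."""
    return Counter(clf(x) for clf in ensemble).most_common(1)[0][0]


# Synthetic, clearly separable tuples (assumption): label 0 = normal, 1 = anomalous.
data_rng = random.Random(42)
normal = [((data_rng.uniform(0.0, 1.0), data_rng.uniform(0.0, 1.0)), 0)
          for _ in range(100)]
anomalous = [((data_rng.uniform(2.0, 3.0), data_rng.uniform(2.0, 3.0)), 1)
             for _ in range(100)]

ensemble = train_ensemble(normal + anomalous, anomaly_shares=(0.2, 0.5, 0.8), size=60)

print(vote(ensemble, (0.0, 0.0)))  # 0 (normal)
print(vote(ensemble, (3.0, 3.0)))  # 1 (anomalous)
```

In a setting closer to the paper, each stump would presumably be replaced by a different classification algorithm, which is what lets the ensemble be extended by adding new classifiers without changing the voting step.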


Keywords: bagging, anomaly detection, parasitic traffic, information security




This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
Copyright 2001-2024 ©
Scientific and Technical Journal
of Information Technologies, Mechanics and Optics.
All rights reserved.
