Applying bagging in finding network traffic anomalies

Rzayev Babyr T., Lebedev Ilya S.

2021, Volume 21, Number 2 (March-April)

ISSN 2226-1494 (print), ISSN 2500-0373 (online)

doi: 10.17586/2226-1494-2021-21-2-234-240

Article in Russian

For citation:

Rzayev B.T., Lebedev I.S. Applying bagging in finding network traffic anomalies. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2021, vol. 21, no. 2, pp. 234–240 (in Russian). doi: 10.17586/2226-1494-2021-21-2-234-240

Abstract

The authors consider approaches to identifying anomalous situations in information and telecommunication systems using artificial intelligence methods that analyze statistical information on traffic packets in various modes and states. We propose a method that detects an anomalous situation from tuples of network traffic packet values by applying bagging to machine learning classification algorithms. Network traffic is treated as a set of tuples of packet parameters distributed over the sampling time. Unlike existing methods, the proposed one does not require special data preparation; errors made by individual classification algorithms when classifying tuples of packet values are averaged out by “collective” voting of the classifiers. This approach to improving accuracy makes it possible to use classification algorithms that are optimized for different types of events and anomalies and trained on different training samples in the form of tuples of network packet parameters. Diversity among the algorithms is achieved by introducing an imbalance into the training sets. We describe an experiment that used the Naïve Bayes, Hoeffding Tree, J48, Random Forest, Random Tree, and REP Tree machine learning classification algorithms. The evaluation was performed on the open NSL-KDD dataset in the task of searching for parasitic traffic. The paper presents evaluation results for each classifier individually and for the bagged ensemble of classifiers. The method can be used in information security monitoring systems to analyze network traffic. A distinctive feature of the proposed solution is that it can be scaled and combined by adding new machine learning classification algorithms. In the future, the composition of the classification algorithms can be changed during operation, which can improve the accuracy of identifying potentially destructive impacts.

Keywords: bagging, anomaly detection, parasitic traffic, information security
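The voting scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic two-value tuples rather than NSL-KDD records, a small hand-written Gaussian naive Bayes stands in for the classifiers named above, and all names (TinyGaussianNB, imbalanced_bootstrap, train_ensemble, vote) and the chosen anomaly shares are hypothetical. Each classifier is trained on a bootstrap sample with a different share of anomalous tuples, and the final label is decided by majority vote.

```python
import math
import random
from collections import Counter


class TinyGaussianNB:
    """Minimal Gaussian naive Bayes; a stand-in base classifier (assumption)."""

    def fit(self, sample):
        by_label = {}
        for features, label in sample:
            by_label.setdefault(label, []).append(features)
        self.params = {}
        for label, rows in by_label.items():
            prior = len(rows) / len(sample)  # depends on the imbalance of the sample
            stats = []
            for column in zip(*rows):
                mean = sum(column) / len(column)
                var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-6
                stats.append((mean, var))
            self.params[label] = (prior, stats)
        return self

    def predict(self, features):
        def log_posterior(label):
            prior, stats = self.params[label]
            return math.log(prior) + sum(
                -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
                for v, (mean, var) in zip(features, stats)
            )
        return max(self.params, key=log_posterior)


def make_tuples(n, label, center, rng):
    """Synthetic two-value tuples; a placeholder for real packet parameters."""
    return [((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label) for _ in range(n)]


def imbalanced_bootstrap(data, anomaly_share, size, rng):
    """Sample with replacement so that anomaly_share of the tuples are anomalous."""
    normal = [d for d in data if d[1] == "normal"]
    anomaly = [d for d in data if d[1] == "anomaly"]
    n_anomaly = round(size * anomaly_share)
    return rng.choices(anomaly, k=n_anomaly) + rng.choices(normal, k=size - n_anomaly)


def train_ensemble(data, anomaly_shares, size, rng):
    """Train one classifier per differently imbalanced bootstrap sample."""
    return [
        TinyGaussianNB().fit(imbalanced_bootstrap(data, share, size, rng))
        for share in anomaly_shares
    ]


def vote(ensemble, features):
    """Majority vote over the predictions of all classifiers."""
    ballots = Counter(model.predict(features) for model in ensemble)
    return ballots.most_common(1)[0][0]


rng = random.Random(0)
data = make_tuples(300, "normal", 0.0, rng) + make_tuples(300, "anomaly", 5.0, rng)
ensemble = train_ensemble(data, anomaly_shares=(0.2, 0.5, 0.8), size=200, rng=rng)
print(vote(ensemble, (0.3, -0.2)), vote(ensemble, (4.8, 5.4)))
```

An odd number of classifiers avoids ties in the two-class case. In a setting closer to the one described above, each ensemble member could be a different algorithm rather than the same model trained on different samples.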

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License