doi: 10.17586/2226-1494-2023-23-3-500-505
Attacker group detection method based on HTTP payload analysis
Article in Russian
For citation:
Pavlov A.V., Voloshina N.V. Attacker group detection method based on HTTP payload analysis. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2023, vol. 23, no. 3, pp. 500–505 (in Russian). doi: 10.17586/2226-1494-2023-23-3-500-505
Abstract
Attacks on web applications are a frequent vector used against information resources by attackers of various skill levels. Such attacks can be investigated by analyzing the HTTP requests the attackers send. This work studies whether groups of attackers can be identified from the payload of HTTP requests that an intrusion detection system (IDS) has marked as attack events. Identifying attacker groups supports security analysts who investigate and respond to incidents, reduces alert fatigue during security event analysis, and helps reveal attack patterns and the resources used by intruders. The proposed method identifies groups in a sequence of stages. First, requests are split into tokens by a regular expression built around features of the HTTP protocol and of attacks that intrusion detection systems frequently encounter and detect. The tokens are then weighted with TF-IDF, so that matches on rare words contribute more when requests are compared. Next, the main core of requests is separated out by distance from the origin, measured as Manhattan distance. This separates out requests that contain no rare words, since it is the coincidence of rare words that indicates events are connected. Finally, the requests are clustered with DBSCAN. The results show that HTTP request payload data can be used to identify groups of attackers. An efficient method for tokenizing, weighting, and clustering such data is proposed, with DBSCAN as the clustering algorithm. Homogeneity, completeness, and V-measure were evaluated for clusterings obtained by various methods on the CPTC-2018 dataset. The proposed method yields a clustering of events with high homogeneity and sufficient completeness.
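The stages described above (tokenization, TF-IDF weighting, separation by Manhattan distance from the origin, DBSCAN clustering) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the token pattern `TOKEN_RE`, the thresholds `min_norm`, `eps`, and `min_samples`, the choice to set aside requests close to the origin, the use of Manhattan distance inside DBSCAN, and the sample payloads are all hypothetical, since the abstract does not give these details.

```python
import math
import re
from collections import Counter

# Hypothetical token pattern: split on whitespace and common URL delimiters.
# The regular expression used in the paper is not given in the abstract.
TOKEN_RE = re.compile(r"[^\s/?&=;:,()<>%+]+")


def tokenize(payload):
    return [t.lower() for t in TOKEN_RE.findall(payload)]


def tfidf(docs):
    """Return one sparse {token: weight} vector per tokenized request."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Tokens present in every request get weight 0; rare tokens weigh more.
        vectors.append({t: (c / len(doc)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return vectors


def manhattan(a, b):
    return sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in set(a) | set(b))


def dbscan(vectors, eps, min_samples):
    """Plain DBSCAN over sparse vectors; label -1 means noise."""
    labels = [None] * len(vectors)

    def neighbors(i):
        return [j for j in range(len(vectors))
                if manhattan(vectors[i], vectors[j]) <= eps]

    cluster = -1
    for i in range(len(vectors)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster      # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # j is a core point: keep expanding
    return labels


def group_requests(payloads, min_norm=0.3, eps=0.6, min_samples=2):
    vectors = tfidf([tokenize(p) for p in payloads])
    # Assumption: requests near the origin carry no rare tokens and are set aside.
    keep = [i for i, v in enumerate(vectors) if manhattan(v, {}) >= min_norm]
    sub_labels = dbscan([vectors[i] for i in keep], eps, min_samples)
    labels = [-1] * len(payloads)
    for i, label in zip(keep, sub_labels):
        labels[i] = label
    return labels


payloads = [
    "GET /index.php?id=1 UNION SELECT sqlmapxyz",
    "GET /index.php?id=2 UNION SELECT sqlmapxyz",
    "GET /index.php?page=../../etc/passwd niktoabc",
    "GET /index.php?page=../../etc/shadow niktoabc",
    "GET /index.php",
]
print(group_requests(payloads))
```

On this toy input, the two injection-style requests and the two traversal-style requests end up in separate clusters, while the bare request, which has no rare tokens, is set aside with label -1. Thresholds would need tuning on real IDS alert data.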
The authors propose combining the resulting clustering with clusters obtained by other methods that also have high homogeneity, in order to raise completeness and V-measure while keeping homogeneity high. The method can be used by security analysts in SOC, CERT, and CSIRT teams, both to defend against intrusions, including APTs, and to collect data on attackers' techniques and tactics. It makes it possible to identify patterns in the traces left by attackers' tools, which supports attack attribution.
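The evaluation metrics named in the abstract can be illustrated with a short sketch. It follows the standard conditional-entropy definitions of homogeneity, completeness, and V-measure (their harmonic mean). The function names and toy labels are hypothetical and are not taken from the paper's experiments on CPTC-2018.

```python
import math
from collections import Counter


def _entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def _conditional_entropy(target, given):
    """H(target | given) computed from two parallel label lists."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum((c / n) * math.log(c / given_counts[g])
                for (g, _), c in joint.items())


def v_measure(truth, predicted):
    """Return (homogeneity, completeness, V-measure)."""
    h_truth, h_pred = _entropy(truth), _entropy(predicted)
    homogeneity = (1.0 if h_truth == 0
                   else 1.0 - _conditional_entropy(truth, predicted) / h_truth)
    completeness = (1.0 if h_pred == 0
                    else 1.0 - _conditional_entropy(predicted, truth) / h_pred)
    if homogeneity + completeness == 0:
        return homogeneity, completeness, 0.0
    v = 2 * homogeneity * completeness / (homogeneity + completeness)
    return homogeneity, completeness, v


# Toy example: two attacker groups, "a" and "b" (hypothetical labels).
truth = ["a", "a", "b", "b"]
fragmented = [0, 1, 2, 3]   # every event in its own cluster
merged = [0, 0, 1, 1]       # fragments of the same group joined

print(v_measure(truth, fragmented))
print(v_measure(truth, merged))
```

The fragmented clustering is fully homogeneous but only half complete. Joining clusters that belong to the same group raises completeness and V-measure without lowering homogeneity, which is the effect the abstract aims for when combining clusterings.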
Keywords: attacker groups, complex attacks, intrusion detection, alert correlation
References
- Hassan W., Guo S., Li D., Chen Z., Jee K., Li Z., Bates A. NoDoze: Combatting threat alert fatigue with automated provenance triage. Proc. of the 2019 Network and Distributed System Security Symposium, 2019. https://doi.org/10.14722/ndss.2019.23349
- Pavlov A., Voloshina N. Analysis of IDS alert correlation techniques for attacker group recognition in distributed systems. Lecture Notes in Computer Science, 2020, vol. 12525, pp. 32–42. https://doi.org/10.1007/978-3-030-65726-0_4
- Kotenko I., Gaifulina D., Zelichenok I. Systematic literature review of security event correlation methods. IEEE Access, 2022, vol. 10, pp. 43387–43420. https://doi.org/10.1109/access.2022.3168976
- Mirheidari S.A., Arshad S., Jalili R. Alert correlation algorithms: A survey and taxonomy. Lecture Notes in Computer Science, 2013, vol. 8300, pp. 183–197. https://doi.org/10.1007/978-3-319-03584-0_14
- Navarro J., Deruyver A., Parrend P. A systematic survey on multi-step attack detection. Computers & Security, 2018, vol. 76, pp. 214–249. https://doi.org/10.1016/j.cose.2018.03.001
- Zhan J., Liao X., Bao Y., Gan L., Tan Z., Zhang M., He R., Lu J. An effective feature representation of web log data by leveraging byte pair encoding and TF-IDF. Proc. of the ACM Turing Celebration Conference - China (ACM TURC '19), 2019, pp. 62. https://doi.org/10.1145/3321408.3321568
- Qi B., Shi Z., Wang Y., Wang J., Wang Q., Jiang J. BotTokenizer: Exploring network tokens of HTTP-based botnet using malicious network traces. Lecture Notes in Computer Science, 2018, vol. 10726, pp. 383–403. https://doi.org/10.1007/978-3-319-75160-3_23
- Chen R.-C., Chen S.-P. Intrusion detection using a hybrid support vector machine based on entropy and TF-IDF. International Journal of Innovative Computing, Information & Control (IJICIC), 2008, vol. 4, no. 2, pp. 413–424.
- Pavlov A.V. Analysis of network interaction of modern exploits. Information Technologies, 2022, vol. 28, no. 2, pp. 75–80 (in Russian). https://doi.org/10.17587/it.28.75-80
- Salton G., Buckley C. Term-weighting approaches in automatic text retrieval. Information Processing & Management, 1988, vol. 24, no. 5, pp. 513–523. https://doi.org/10.1016/0306-4573(88)90021-0
- Aggarwal C., Hinneburg A., Keim D. On the surprising behavior of distance metrics in high dimensional space. Lecture Notes in Computer Science, 2001, vol. 1973, pp. 420–434. https://doi.org/10.1007/3-540-44503-x_27
- Munaiah N., Pelletier J., Su S.-H., Yang S.J., Meneely A. A cybersecurity dataset derived from the national collegiate penetration testing competition. Proc. of the HICSS Symposium on Cybersecurity Big Data Analytics, 2019.
- Rosenberg A., Hirschberg J. V-Measure: A conditional entropy-based external cluster evaluation measure. Proc. of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), 2007, pp. 410–420.
- Shi J., Malik J. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000, vol. 22, no. 8, pp. 888–905. https://doi.org/10.1109/34.868688
- Ester M., Kriegel H.-P., Sander J., Xu X. A density-based algorithm for discovering clusters in large spatial databases with noise. Proc. of the 2nd International Conference on Knowledge Discovery and Data Mining (KDD), 1996, pp. 226–231.