Nikiforov
Vladimir O.
D.Sc., Prof.
doi: 10.17586/2226-1494-2022-22-2-262-268
Lightweight approach for malicious domain detection using machine learning
Abstract
Web-based attacks exploit the vulnerabilities of end users and their systems to perform malicious activities such as stealing sensitive information, injecting malware, and redirecting users to malicious sites without their knowledge. Malicious website links are spread through social media posts, emails, and messages. The victim can be an individual or an organization, and such attacks cause huge financial losses every year. A recent Internet security report states that 83 % of systems on the Internet were infected by malware during the last 12 months because users were not aware of malicious URLs (Uniform Resource Locators) and their impact. Existing methods for detecting and preventing access to malicious domain names on the Internet fall into three categories: blacklist-based approaches, heuristic-based methods, and machine/deep learning-based methods. This study provides a lightweight machine learning-based solution for classifying malicious domain names. Most existing research focuses on increasing the number of features to improve classification accuracy. The proposed approach instead uses a smaller number of features, comprising lexical, content-based, bag-of-words, and popularity features, for malicious domain classification. The experimental results show that the proposed approach performs better than the existing approach.
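To make the idea of lexical features concrete, the following is a minimal Python sketch of how a few such features might be computed from a domain name. The abstract does not list the exact features used in the study, so the function name `lexical_features` and the particular features chosen here (host length, dot, hyphen and digit counts, character entropy) are illustrative assumptions, not the authors' feature set.

```python
import math
from collections import Counter
from urllib.parse import urlparse


def lexical_features(url: str) -> dict:
    """Compute a few illustrative lexical features of a URL's host name.

    The feature set is an assumption made for this sketch; it is not
    taken from the article.
    """
    # urlparse only fills in the host when a scheme (or a leading "//")
    # is present, so bare domain names are prefixed with "//".
    parsed = urlparse(url if "://" in url else "//" + url)
    host = (parsed.hostname or "").lower()

    length = len(host)
    counts = Counter(host)
    # Shannon entropy of the characters in the host name, in bits.
    entropy = (
        -sum((n / length) * math.log2(n / length) for n in counts.values())
        if length
        else 0.0
    )
    return {
        "length": length,
        "num_dots": host.count("."),
        "num_hyphens": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in host),
        "entropy": entropy,
    }


if __name__ == "__main__":
    # Hypothetical example URL, used only to show the output shape.
    print(lexical_features("http://paypal-login.example123.com/verify"))
```

A feature dictionary of this kind could be turned into a numeric vector and combined with content-based, bag-of-words, and popularity features before being passed to a classifier; how the study does this is not specified in the abstract.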