Obfuscated malware detection using deep neural network with ANOVA feature selection on CIC-MalMem-2022 dataset

Hadjila Mourad, Merzoug Mohammed, Ferhi Wafaa, Moussaoui Djillali, Bouidaine Al Baraa, Hachemi Mohammed Hicham

2024, Volume 24, Number 5 (September-October)

ISSN 2226-1494 (print), ISSN 2500-0373 (online)

doi: 10.17586/2226-1494-2024-24-5-849-857

M. Hadjila, M. Merzoug, W. Ferhi, D. Moussaoui, A. Bouidaine, M. Hachemi

Article in English

For citation:

Hadjila M., Merzoug M., Ferhi W., Moussaoui D., Bouidaine A.B., Hachemi M.H. Obfuscated malware detection using deep neural network with ANOVA feature selection on CIC-MalMem-2022 dataset. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2024, vol. 24, no. 5, pp. 849–857. doi: 10.17586/2226-1494-2024-24-5-849-857

Abstract

Malware analysis is the process of dissecting malicious software to understand its functionality, behavior, and potential risks. Artificial Intelligence (AI) and deep learning are enabling automated, intelligent, and adaptive malware analysis, and their convergence promises to change how cybersecurity professionals detect, analyze, and respond to malware threats. This paper proposes a Deep Neural Network (DNN) model built on features selected by the ANalysis Of Variance (ANOVA) F-test (DNN-ANOVA), which aims to increase accuracy by identifying informative features. ANOVA is a feature selection method used for numerical input data when the target variable is categorical. The top k most relevant features are those whose scores exceed a threshold equal to the mean feature score, that is, the sum of all feature scores divided by the total number of features. Experiments are conducted on the CIC-MalMem-2022 dataset. Malware analysis is performed using binary classification, to detect the presence or absence of malware, and multiclass classification, to detect not only the malware but also its type. According to the test results, the DNN-ANOVA model achieves best values of 100 %, 99.99 %, 99.99 %, and 99.98 % in terms of precision, accuracy, F1-score, and recall, respectively, for binary classification. In addition, DNN-ANOVA outperforms existing works in multiclass classification, with overall accuracy rates of 85.83 % for family attacks and 73.98 % for individual attacks.
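
The thresholding rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it computes a one-way ANOVA F-score per feature in plain Python and keeps the features whose score exceeds the mean score. The function names, the feature names, and the toy data are hypothetical, and the DNN stage that would consume the selected features is not shown.

```python
from statistics import mean


def anova_f_score(values, labels):
    """One-way ANOVA F-statistic of one numerical feature against class labels.

    Assumes at least two classes and a nonzero within-class variance.
    """
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n, k = len(values), len(groups)
    grand_mean = mean(values)
    # Variation of the class means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Variation of the samples around their own class mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_features(columns, labels):
    """Keep features whose F-score is greater than the mean F-score."""
    scores = {name: anova_f_score(vals, labels) for name, vals in columns.items()}
    threshold = sum(scores.values()) / len(scores)
    selected = [name for name, score in scores.items() if score > threshold]
    return selected, scores, threshold


# Hypothetical toy data: 0 = benign, 1 = malware.
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "feat_a": [1.0, 1.2, 0.9, 5.0, 5.1, 4.8],  # well separated by class
    "feat_b": [2.0, 3.0, 2.5, 2.4, 3.1, 2.2],  # barely separated
    "feat_c": [0.5, 0.7, 0.6, 0.8, 0.5, 0.7],  # barely separated
}

selected, scores, threshold = select_features(columns, labels)
print(selected)
```

On this toy data only the feature that separates the two classes clearly is retained. A mean-based threshold adapts the number of selected features k to the score distribution, rather than fixing k in advance.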

Keywords: malware detection, deep learning, ANOVA feature selection, binary classification

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License