doi: 10.17586/2226-1494-2024-24-5-770-778


A method for optimizing neural networks based on structural distillation using a genetic algorithm

V. N. Kuzmin, A. B. Menisov, T. R. Sabirov


Article in Russian

For citation:
Kuzmin V.N., Menisov A.B., Sabirov T.R. A method for optimizing neural networks based on structural distillation using a genetic algorithm. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2024, vol. 24, no. 5, pp. 770–778 (in Russian). doi: 10.17586/2226-1494-2024-24-5-770-778


Abstract
As neural networks grow more complex, their parameter counts and computational demands increase, which complicates deploying and operating artificial intelligence systems on edge devices. Structural distillation can significantly reduce the resource cost of running a neural network. The paper presents a method for optimizing neural networks that combines the advantages of structural distillation and a genetic algorithm. Unlike evolutionary approaches used to search for an optimal architecture or to distill neural networks, the proposed encoding of distillation candidates covers not only the parameters of the neural network but also the connections between neurons. The experimental study was conducted on the VGG16 and ResNet18 models using the CIFAR-10 dataset. It is shown that structural distillation optimizes the size of neural networks while preserving their generalization ability, and that the genetic algorithm efficiently searches for optimal distillation variants, taking into account structural complexity and performance. The results demonstrate the effectiveness of the proposed method in reducing network size and improving performance with an acceptable loss of quality.
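The search idea described in the abstract — encoding connection-level keep/drop decisions as a bit string and evolving it with a genetic algorithm — can be sketched as follows. This is a toy illustration, not the authors' implementation: the importance scores, the fitness function, and all parameter values are assumptions standing in for a trained network's accuracy-versus-complexity trade-off.

```python
import random

random.seed(0)

# Toy stand-in for a trained network: importance scores for 16 connections
# (hypothetical values playing the role of weight magnitudes).
IMPORTANCE = [abs(random.gauss(0, 1)) for _ in range(16)]
ALPHA = 0.05  # assumed penalty per retained connection (complexity weight)


def fitness(mask):
    # Reward retained importance, penalize structural size -- a proxy for
    # "quality minus complexity" when ranking distillation candidates.
    kept = sum(imp for imp, bit in zip(IMPORTANCE, mask) if bit)
    return kept - ALPHA * sum(mask)


def crossover(a, b):
    # Single-point crossover of two connection masks.
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(mask, rate=0.05):
    # Flip each bit with a small probability.
    return [bit ^ (random.random() < rate) for bit in mask]


def evolve(pop_size=30, generations=40):
    pop = [[random.randint(0, 1) for _ in IMPORTANCE] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]  # truncation selection with elitism
        children = [
            mutate(crossover(random.choice(elite), random.choice(elite)))
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + children
    return max(pop, key=fitness)


best = evolve()
```

In a real setting the fitness call would evaluate the pruned network on a validation set (e.g. CIFAR-10 accuracy) rather than summing importance scores, which is the expensive step the genetic search tries to spend wisely.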

Keywords: artificial intelligence, neural networks, structural distillation, genetic algorithm

References
  1. Spoorthi M., Indu Priya B., Kuppala M., Karpe V.S., Dharavath D. Automated resume classification system using ensemble learning. Proc. of the 9th International Conference on Advanced Computing and Communication Systems (ICACCS). V. 1, 2023, pp. 1782–1785. https://doi.org/10.1109/icaccs57279.2023.10112917
  2. Freire P.J., Osadchuk Y., Spinnler B., Napoli A., Schairer W., Costa N., Prilepsky J.E., Turitsyn S.K. Performance versus complexity study of neural network equalizers in coherent optical systems. Journal of Lightwave Technology, 2021, vol. 39, no. 19, pp. 6085–6096. https://doi.org/10.1109/jlt.2021.3096286
  3. Hankala T., Hannula M., Kontinen J., Virtema J. Complexity of neural network training and ETR: Extensions with effectively continuous functions. Proceedings of the AAAI Conference on Artificial Intelligence, 2024, vol. 38, no. 11, pp. 12278–12285. https://doi.org/10.1609/aaai.v38i11.29118
  4. Koonce B. ResNet 50. Convolutional Neural Networks with Swift for TensorFlow: Image Recognition and Dataset Categorization. Springer, 2021, pp. 63–72. https://doi.org/10.1007/978-1-4842-6168-2_6
  5. Floridi L., Chiriatti M. GPT-3: Its nature, scope, limits, and consequences. Minds and Machines, 2020, vol. 30, no. 4, pp. 681–694. https://doi.org/10.1007/s11023-020-09548-1
  6. Achiam J., Adler S., Agarwal S. et al. GPT-4 technical report. arXiv, 2023, arXiv:2303.08774. https://doi.org/10.48550/arXiv.2303.08774
  7. Bodimani M. Assessing the impact of transparent AI systems in enhancing user trust and privacy. Journal of Science & Technology, 2024, vol. 5, no. 1, pp. 50–67. https://doi.org/10.55662/JST.2024.5102
  8. Lu Z., Li Z., Chiang C.-W., Yin M. Strategic adversarial attacks in AI-assisted decision making to reduce human trust and reliance. Proc. of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023, pp. 3020–3028. https://doi.org/10.24963/ijcai.2023/337
  9. He Y., Xiao L. Structured pruning for deep convolutional neural networks: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024, vol. 46, no. 5, pp. 2900–2919. https://doi.org/10.1109/tpami.2023.3334614
  10. Ding S., Zhang L., Pan M., Yuan X. PATROL: Privacy-oriented pruning for collaborative inference against model inversion attacks. Proc. of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, pp. 4704–4713. https://doi.org/10.1109/wacv57701.2024.00465
  11. Fang G., Ma X., Song M., Mi M.B., Wang X. DepGraph: Towards any structural pruning. Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 16091–16101. https://doi.org/10.1109/cvpr52729.2023.01544
  12. Wen L., Zhang X., Bai H., Xu Z. Structured pruning of recurrent neural networks through neuron selection. Neural Networks, 2020, vol. 123, pp. 134–141. https://doi.org/10.1016/j.neunet.2019.11.018
  13. Zhao M., Peng J., Yu S., Liu L., Wu N. Exploring structural sparsity in CNN via selective penalty. IEEE Transactions on Circuits and Systems for Video Technology, 2022, vol. 32, no. 3, pp. 1658–1666. https://doi.org/10.1109/tcsvt.2021.3071532
  14. Shen M., Molchanov P., Yin H., Alvarez J.M. When to prune? a policy towards early structural pruning. Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 12237–12246. https://doi.org/10.1109/cvpr52688.2022.01193
  15. Katoch S., Chauhan S.S., Kumar V. A review on genetic algorithm: past, present, and future. Multimedia Tools and Applications, 2021, vol. 80, pp. 8091–8126. https://doi.org/10.1007/s11042-020-10139-6
  16. Simonyan K., Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv, 2014, arXiv:1409.1556. https://doi.org/10.48550/arXiv.1409.1556
  17. Zhou Y., Ren F., Nishide S., Kang X. Facial sentiment classification based on ResNet-18 model. Proc. of the 2019 International Conference on Electronic Engineering and Informatics (EEI), 2019, pp. 463–466. https://doi.org/10.1109/eei48997.2019.00106
  18. Recht B., Roelofs R., Schmidt L., Shankar V. Do CIFAR-10 classifiers generalize to CIFAR-10?. arXiv, 2018, arXiv:1806.00451. https://doi.org/10.48550/arXiv.1806.00451
  19. Liu Q., Mukhopadhyay S. Unsupervised learning using pretrained CNN and associative memory bank. Proc. of the International Joint Conference on Neural Networks (IJCNN), 2018, pp. 1–8. https://doi.org/10.1109/ijcnn.2018.8489408
  20. Jeevan P., Sethi A. Vision Xformers: Efficient attention for image classification. arXiv, 2021, arXiv:2107.02239. https://doi.org/10.48550/arXiv.2107.02239
  21. Hou Y., Wu Z., Cai X., Zhu T. The application of improved DenseNet algorithm in accurate image recognition. Scientific Reports, 2024, vol. 14, no. 1, pp. 8645. https://doi.org/10.1038/s41598-024-58421-z



This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
Copyright 2001-2024 ©
Scientific and Technical Journal
of Information Technologies, Mechanics and Optics.
All rights reserved.
