doi: 10.17586/2226-1494-2024-24-5-770-778


A method for optimizing neural networks based on structural distillation using a genetic algorithm

V. N. Kuzmin, A. B. Menisov, T. R. Sabirov


Article in Russian

For citation:
Kuzmin V.N., Menisov A.B., Sabirov T.R. A method for optimizing neural networks based on structural distillation using a genetic algorithm. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2024, vol. 24, no. 5, pp. 770–778 (in Russian). doi: 10.17586/2226-1494-2024-24-5-770-778


Abstract
As neural networks grow more complex, their parameter counts and computational demands increase, which complicates deploying and operating artificial intelligence systems on edge devices. Structural distillation can significantly reduce the resource cost of running a neural network. The paper presents a method for optimizing neural networks that combines the advantages of structural distillation and a genetic algorithm. Unlike evolutionary approaches used to search for an optimal architecture or to distill neural networks, the proposed encoding of distillation candidates covers not only the parameters of the neural network but also the connections between neurons. The experimental study was conducted on the VGG16 and ResNet18 models using the CIFAR-10 dataset. It is shown that structural distillation optimizes the size of neural networks while preserving their generalization ability, and that the genetic algorithm efficiently searches for optimal distillation variants, taking into account structural complexity and performance. The results demonstrate the effectiveness of the proposed method in reducing network size and improving performance with an acceptable loss of quality.
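The search idea described in the abstract — encoding connection-level keep/drop decisions as a bit string and evolving it with a genetic algorithm — can be sketched as follows. This is a toy illustration, not the authors' implementation: the importance scores, the fitness function, and all parameter values are assumptions standing in for a trained network's accuracy-versus-complexity trade-off.

```python
import random

random.seed(0)

# Toy stand-in for a trained network: importance scores for 16 connections
# (hypothetical values playing the role of weight magnitudes).
IMPORTANCE = [abs(random.gauss(0, 1)) for _ in range(16)]
ALPHA = 0.05  # assumed penalty per retained connection (complexity weight)


def fitness(mask):
    # Reward retained importance, penalize structural size -- a proxy for
    # "quality minus complexity" when ranking distillation candidates.
    kept = sum(imp for imp, bit in zip(IMPORTANCE, mask) if bit)
    return kept - ALPHA * sum(mask)


def crossover(a, b):
    # Single-point crossover of two connection masks.
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(mask, rate=0.05):
    # Flip each bit with a small probability.
    return [bit ^ (random.random() < rate) for bit in mask]


def evolve(pop_size=30, generations=40):
    pop = [[random.randint(0, 1) for _ in IMPORTANCE] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]  # truncation selection with elitism
        children = [
            mutate(crossover(random.choice(elite), random.choice(elite)))
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + children
    return max(pop, key=fitness)


best = evolve()
```

In a real setting the fitness call would evaluate the pruned network on a validation set (e.g. CIFAR-10 accuracy) rather than summing importance scores, which is the expensive step the genetic search tries to spend wisely.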

Keywords: artificial intelligence, neural networks, structural distillation, genetic algorithm

References
  1. Spoorthi M., Indu Priya B., Kuppala M., Karpe V.S., Dharavath D. Automated resume classification system using ensemble learning. Proc. of the 9th International Conference on Advanced Computing and Communication Systems (ICACCS). V. 1, 2023, pp. 1782–1785. https://doi.org/10.1109/icaccs57279.2023.10112917
  2. Freire P.J., Osadchuk Y., Spinnler B., Napoli A., Schairer W., Costa N., Prilepsky J.E., Turitsyn S.K. Performance versus complexity study of neural network equalizers in coherent optical systems. Journal of Lightwave Technology, 2021, vol. 39, no. 19, pp. 6085–6096. https://doi.org/10.1109/jlt.2021.3096286
  3. Hankala T., Hannula M., Kontinen J., Virtema J. Complexity of neural network training and ETR: Extensions with effectively continuous functions. Proceedings of the AAAI Conference on Artificial Intelligence, 2024, vol. 38, no. 11, pp. 12278–12285. https://doi.org/10.1609/aaai.v38i11.29118
  4. Koonce B. ResNet 50. Convolutional Neural Networks with Swift for TensorFlow: Image Recognition and Dataset Categorization. Springer, 2021, pp. 63–72. https://doi.org/10.1007/978-1-4842-6168-2_6
  5. Floridi L., Chiriatti M. GPT-3: Its nature, scope, limits, and consequences. Minds and Machines, 2020, vol. 30, no. 4, pp. 681–694. https://doi.org/10.1007/s11023-020-09548-1
  6. Achiam J., Adler S., Agarwal S. et al. GPT-4 technical report. arXiv, 2023, arXiv:2303.08774. https://doi.org/10.48550/arXiv.2303.08774
  7. Bodimani M. Assessing the impact of transparent AI systems in enhancing user trust and privacy. Journal of Science & Technology, 2024, vol. 5, no. 1, pp. 50–67. https://doi.org/10.55662/JST.2024.5102
  8. Lu Z., Li Z., Chiang C.-W., Yin M. Strategic adversarial attacks in AI-assisted decision making to reduce human trust and reliance. Proc. of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023, pp. 3020–3028. https://doi.org/10.24963/ijcai.2023/337
  9. He Y., Xiao L. Structured pruning for deep convolutional neural networks: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024, vol. 46, no. 5, pp. 2900–2919. https://doi.org/10.1109/tpami.2023.3334614
  10. Ding S., Zhang L., Pan M., Yuan X. PATROL: Privacy-oriented pruning for collaborative inference against model inversion attacks. Proc. of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, pp. 4704–4713. https://doi.org/10.1109/wacv57701.2024.00465
  11. Fang G., Ma X., Song M., Mi M.B., Wang X. DepGraph: Towards any structural pruning. Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 16091–16101. https://doi.org/10.1109/cvpr52729.2023.01544
  12. Wen L., Zhang X., Bai H., Xu Z. Structured pruning of recurrent neural networks through neuron selection. Neural Networks, 2020, vol. 123, pp. 134–141. https://doi.org/10.1016/j.neunet.2019.11.018
  13. Zhao M., Peng J., Yu S., Liu L., Wu N. Exploring structural sparsity in CNN via selective penalty. IEEE Transactions on Circuits and Systems for Video Technology, 2022, vol. 32, no. 3, pp. 1658–1666. https://doi.org/10.1109/tcsvt.2021.3071532
  14. Shen M., Molchanov P., Yin H., Alvarez J.M. When to prune? a policy towards early structural pruning. Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 12237–12246. https://doi.org/10.1109/cvpr52688.2022.01193
  15. Katoch S., Chauhan S.S., Kumar V. A review on genetic algorithm: past, present, and future. Multimedia Tools and Applications, 2021, vol. 80, pp. 8091–8126. https://doi.org/10.1007/s11042-020-10139-6
  16. Simonyan K., Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv, 2014, arXiv:1409.1556. https://doi.org/10.48550/arXiv.1409.1556
  17. Zhou Y., Ren F., Nishide S., Kang X. Facial sentiment classification based on ResNet-18 model. Proc. of the 2019 International Conference on Electronic Engineering and Informatics (EEI), 2019, pp. 463–466. https://doi.org/10.1109/eei48997.2019.00106
  18. Recht B., Roelofs R., Schmidt L., Shankar V. Do CIFAR-10 classifiers generalize to CIFAR-10?. arXiv, 2018, arXiv:1806.00451. https://doi.org/10.48550/arXiv.1806.00451
  19. Liu Q., Mukhopadhyay S. Unsupervised learning using pretrained CNN and associative memory bank. Proc. of the International Joint Conference on Neural Networks (IJCNN), 2018, pp. 1–8. https://doi.org/10.1109/ijcnn.2018.8489408
  20. Jeevan P., Sethi A. Vision Xformers: Efficient attention for image classification. arXiv, 2021, arXiv:2107.02239. https://doi.org/10.48550/arXiv.2107.02239
  21. Hou Y., Wu Z., Cai X., Zhu T. The application of improved DenseNet algorithm in accurate image recognition. Scientific Reports, 2024, vol. 14, no. 1, pp. 8645. https://doi.org/10.1038/s41598-024-58421-z



This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
Copyright 2001-2024 ©
Scientific and Technical Journal
of Information Technologies, Mechanics and Optics.
All rights reserved.
