doi: 10.17586/2226-1494-2024-24-5-770-778
A method for optimizing neural networks based on structural distillation using a genetic algorithm
Article in Russian
For citation:
Kuzmin V.N., Menisov A.B., Sabirov T.R. A method for optimizing neural networks based on structural distillation using a genetic algorithm. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2024, vol. 24, no. 5, pp. 770–778 (in Russian). doi: 10.17586/2226-1494-2024-24-5-770-778
Abstract
As neural networks grow more complex, the number of parameters and the amount of required computation increase, which complicates deploying and operating artificial intelligence systems on edge devices. Structural distillation can significantly reduce the resource demands of running neural networks. The paper presents a method for optimizing neural networks that combines the advantages of structural distillation and a genetic algorithm. Unlike evolutionary approaches used to search for an optimal architecture or to distill neural networks, the proposed encoding of distillation candidates covers not only the parameters of the neural network but also the connections between neurons. The experimental study was conducted on the VGG16 and ResNet18 models using the CIFAR-10 dataset. It is shown that structural distillation makes it possible to reduce the size of neural networks while preserving their generalization ability, and that the genetic algorithm efficiently searches for optimal distillation variants, taking into account both their structural complexity and performance. The results demonstrate the effectiveness of the proposed method in reducing network size and improving performance with an acceptable loss of quality.
Keywords: artificial intelligence, neural networks, structural distillation, genetic algorithm
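The core idea of the abstract, encoding which connections to keep as a genome and letting a genetic algorithm trade model size against quality, can be illustrated with a minimal, framework-free sketch. This is not the authors' implementation: the genome length, the "important connection" set, the fitness proxy, and all operator settings below are hypothetical stand-ins for a real network's connections and validation accuracy.

```python
import random

random.seed(0)

GENOME_LEN = 32                # one bit per prunable connection (toy scale)
IMPORTANT = set(range(8))      # hypothetical "critical" connections

def fitness(genome):
    # Proxy objective: reward keeping critical connections (stand-in for
    # validation accuracy of the distilled network) and penalize the total
    # number of kept connections (stand-in for model size).
    kept = [i for i, bit in enumerate(genome) if bit]
    accuracy_proxy = sum(1 for i in kept if i in IMPORTANT) / len(IMPORTANT)
    size_penalty = len(kept) / GENOME_LEN
    return accuracy_proxy - 0.5 * size_penalty

def crossover(a, b):
    # Single-point crossover of two connection masks.
    cut = random.randrange(1, GENOME_LEN)
    return a[:cut] + b[cut:]

def mutate(genome, rate=0.02):
    # Flip each bit with small probability (prune or restore a connection).
    return [bit ^ (random.random() < rate) for bit in genome]

def evolve(pop_size=30, generations=40):
    pop = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]          # truncation selection
        children = [mutate(crossover(random.choice(elite),
                                     random.choice(elite)))
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    return max(pop, key=fitness)

best = evolve()
print(f"kept {sum(best)}/{GENOME_LEN} connections, "
      f"fitness {fitness(best):.3f}")
```

In the paper's setting, `fitness` would instead evaluate a candidate distilled network on held-out data, so each generation searches over structural variants rather than toy bitmasks.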