doi: 10.17586/2226-1494-2025-25-5-988-995
Assessment of the reliability of a recoverable container virtualization cluster
For citation:
Bogatyrev V.A., Phung V.Q. Assessment of the reliability of a recoverable container virtualization cluster. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2025, vol. 25, no. 5, pp. 988–995 (in Russian). doi: 10.17586/2226-1494-2025-25-5-988-995
Abstract
Container virtualization is increasingly used to build fault-tolerant clusters with high availability and low request processing latency. A key task in designing highly reliable clusters is structural-parametric, model-oriented synthesis that accounts for how the number of deployed containers affects performance, request processing latency, and system reliability. Justifying design choices that ensure high cluster reliability and availability requires reliability models of recoverable container virtualization clusters that cover reconfiguration, including the migration of virtual containers. The novelty of the proposed Markov model lies in its treatment of two-stage recovery of cluster operability: it determines how the number of containers migrated during reconfiguration, both before and after the physical recovery of failed servers, affects cluster reliability. Two options for container migration during cluster recovery are considered. In the first, containers are not migrated to a functional server while the failed server is being physically recovered; in the second, they are. In the second reconfiguration stage, after the failed server has been physically recovered, containers are migrated again, which allows the number of containers deployed on the servers to be increased or decreased. Based on the proposed Markov models, the cluster's availability factor is evaluated, and the influence of the number of containers migrated at the two reconfiguration stages on system reliability is determined.
The proposed Markov models of cluster reliability with container virtualization are aimed at justifying design decisions for organizing and restoring cluster operability after server failures, considering the impact of container migration implementation options on system availability. Future research will analyze the impact of container migration options on both cluster availability and request processing latency at the two considered reconfiguration stages.
Keywords: fault tolerance, availability factor, container virtualization, cluster, container migration, Markov model, reliability
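As a rough illustration of how an availability factor can be obtained from a Markov model with two recovery stages, the following Python sketch solves the steady-state equations of a small continuous-time chain. It is not the authors' model: the four states, the transition structure, the rates `lam`, `mu`, `t_mig`, and the choice to treat the post-repair migration state as not ready are all assumptions made only for this example.

```python
def steady_state(Q):
    """Solve pi * Q = 0 with sum(pi) = 1 by Gaussian elimination."""
    n = len(Q)
    # Transpose Q so each row is one balance equation, then replace the
    # last (redundant) equation with the normalization condition.
    A = [[Q[j][i] for j in range(n)] for i in range(n)]
    b = [0.0] * n
    A[-1] = [1.0] * n
    b[-1] = 1.0
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    pi = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(A[r][c] * pi[c] for c in range(r + 1, n))
        pi[r] = (b[r] - s) / A[r][r]
    return pi


def build_generator(lam, mu, nu):
    """Generator matrix of a hypothetical two-server cluster.

    States (assumed for illustration only):
      0: both servers up, containers in their final placement
      1: one server failed, physical repair in progress (stage 1)
      2: server repaired, containers migrating back (stage 2)
      3: both servers failed
    """
    return [
        [-2 * lam, 2 * lam, 0.0, 0.0],
        [0.0, -(mu + lam), mu, lam],
        [nu, 2 * lam, -(nu + 2 * lam), 0.0],
        [0.0, mu, 0.0, -mu],
    ]


def availability(n_containers, lam=1e-3, mu=0.1, t_mig=0.01):
    """Steady-state probability of states 0 and 1.

    Assumes migration time grows linearly with the number of migrated
    containers, so the stage-2 completion rate is 1 / (n * t_mig).
    """
    nu = 1.0 / (n_containers * t_mig)
    pi = steady_state(build_generator(lam, mu, nu))
    return pi[0] + pi[1]


if __name__ == "__main__":
    for n in (10, 50, 200):
        print(n, availability(n))
```

Under these assumptions, migrating more containers in the second stage lengthens the time spent in state 2 and lowers the computed availability. A different definition of which states count as ready would change that dependence.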

