doi: 10.17586/2226-1494-2021-21-4-571-577


A study of human motion in computer vision systems based on a skeletal model

S. A. Kazakova, P. A. Leonteva, M. I. Frolova, J. V. Donetskaya, I. Y. Popov, A. Y. Kouznetsov


Article in Russian

For citation:

Kazakova S.A., Leonteva P.A., Frolova M.I., Donetskaya Ju.V., Popov I.Yu., Kuznetsov A.Yu. A study of human motion in computer vision systems based on a skeletal model. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2021, vol. 21, no. 4, pp. 571–577 (in Russian). doi: 10.17586/2226-1494-2021-21-4-571-577



Abstract
Methods for studying human motion in computer vision systems fall into two types: analysis in two-dimensional space and analysis in three-dimensional space. The former uses a single camera image and/or multiple body sensors. This approach accumulates error rapidly and therefore represents the figure with low accuracy. Three-dimensional analysis usually relies on multiple cameras, with objects represented as sets of volumetric elements. Although this method is highly accurate, it incurs high computational complexity and a heavy load on the internal network. The purpose of the paper is to develop a model that uses a single camera while approaching the accuracy of three-dimensional analysis methods. In this paper, a human figure is represented as a skeleton described by an acyclic connected graph. The general structure of a human figure is analyzed, and fifteen basic points are selected. The physical and logical connections between them are studied and described mathematically. The velocity and spatial characteristics of the points and connections outline the general dynamics of motion. The study describes a model of human motion and shows how the model is constructed, using a particular image as an example. The developed algorithm for collecting and analyzing information estimates the relative locations and velocity characteristics of the graph elements. The model can be used to acquire information about the reference dynamics of human movements. When major differences between the reference and the observed motion are detected, the behavior is classified as deviant. Thus, the resulting algorithm can be applied in computer vision systems to detect and analyze human movements.
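
The ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not name the fifteen points, list the connections, or define the deviation measure. The joint names, the edge list, the use of 2D pixel coordinates, and the mean-speed-difference criterion with its threshold are all assumptions made for illustration only.

```python
import math

# Hypothetical 15-joint layout; the abstract does not name the points.
JOINTS = [
    "head", "neck", "torso",
    "l_shoulder", "l_elbow", "l_hand",
    "r_shoulder", "r_elbow", "r_hand",
    "l_hip", "l_knee", "l_foot",
    "r_hip", "r_knee", "r_foot",
]

# Assumed connections: 14 edges over 15 vertices, forming a tree.
EDGES = [
    ("head", "neck"), ("neck", "torso"),
    ("neck", "l_shoulder"), ("l_shoulder", "l_elbow"), ("l_elbow", "l_hand"),
    ("neck", "r_shoulder"), ("r_shoulder", "r_elbow"), ("r_elbow", "r_hand"),
    ("torso", "l_hip"), ("l_hip", "l_knee"), ("l_knee", "l_foot"),
    ("torso", "r_hip"), ("r_hip", "r_knee"), ("r_knee", "r_foot"),
]


def is_tree(joints, edges):
    """Return True if the undirected graph is connected and acyclic."""
    if len(edges) != len(joints) - 1:
        return False
    adj = {j: [] for j in joints}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    # Depth-first traversal from an arbitrary joint.
    seen, stack = {joints[0]}, [joints[0]]
    while stack:
        for n in adj[stack.pop()]:
            if n not in seen:
                seen.add(n)
                stack.append(n)
    # With exactly n - 1 edges, connected implies acyclic.
    return len(seen) == len(joints)


def velocities(prev, curr, dt):
    """Per-joint speed between two frames (coordinate units per second)."""
    return {j: math.dist(prev[j], curr[j]) / dt for j in prev}


def bone_lengths(frame):
    """Length of every connection in one frame (a spatial characteristic)."""
    return {(a, b): math.dist(frame[a], frame[b]) for a, b in EDGES}


def is_deviant(observed, reference, threshold):
    """Assumed criterion: mean absolute speed difference above a threshold."""
    diff = sum(abs(observed[j] - reference[j]) for j in reference)
    return diff / len(reference) > threshold
```

Here each frame is a dictionary mapping a joint name to an `(x, y)` tuple. Calling `velocities` on two consecutive frames and passing the result to `is_deviant` together with a reference speed profile gives a crude comparison between observed and reference motion; a practical system would likely compare whole sequences rather than single frame pairs.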

Keywords: computer vision, human motion analysis, behavioral analytics, motion detection, skeletal model

Acknowledgements. This work is partially supported by the Ministry of Science and Higher Education of the Russian Federation, passport of goszadanie no. 2019-0898.



This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License