doi: 10.17586/2226-1494-2024-24-2-190-197


Fast labeling pipeline approach for a huge aerial sensed dataset

A. M. Fedulin, N. V. Voloshina


Article in English

For citation:
Fedulin A.M., Voloshina N.V. Fast labeling pipeline approach for a huge aerial sensed dataset. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2024, vol. 24, no. 2, pp. 190–197. doi: 10.17586/2226-1494-2024-24-2-190-197


Abstract
Modern neural network technologies are actively used on board Unmanned Aerial Vehicles (UAVs). Convolutional Neural Networks (CNNs) are mostly used for object detection, classification, and tracking, for example, of such objects as fires, deforestation, buildings, cars, or people. However, to remain effective, CNNs must be periodically fine-tuned on new flight data. Such training data should be labeled, which increases the total CNN fine-tuning time. The common approach to decreasing labeling time is to apply auto-labeling and tracking of labeled objects. These approaches are not effective enough for labeling the huge aerial sensed datasets, up to 8 hours of video per flight, that are typical for long-endurance UAVs. Reducing data labeling time therefore remains a relevant task. In this research, we propose a fast aerial data labeling pipeline tailored to videos gathered by the cameras of long-endurance UAVs. The standard labeling pipeline is supplemented with several steps, such as pruning of overlapped frames and spreading the final labeling over the video frames. A further step computes a Potential Information Value (PIV) for each frame as a cumulative estimate of frame anomaly, frame quality, and auto-detected objects. The calculated PIVs are then used to rank the frames so that the operator who labels the video receives the most informative frames at the very beginning of the labeling process. The effectiveness of the proposed approach was estimated on collected datasets of aerial sensed videos obtained by long-endurance UAVs. It was shown that labeling time can be decreased by 50 % on average in comparison with other modern labeling tools: on average, 80 % of the objects were labeled within the first 40 % of the pre-ranked frames. The proposed approach significantly decreases the labeling time for new long-endurance flight video data, which speeds up the neural network fine-tuning process. In particular, new data can be labeled during the inter-flight time, which usually lasts about two to three hours and is too short for other labeling instruments. The proposed approach is recommended for decreasing both UAV operators' working time and labeled dataset creation time, which positively influences the time necessary for fine-tuning new, effective CNN models.
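To make the ranking step concrete, the following minimal Python sketch shows one way the PIV-based frame ordering could be organized. The scoring inputs, the equal weighting, and the overlap() pruning test are illustrative assumptions for this sketch, not the authors' published implementation.

from dataclasses import dataclass

@dataclass
class FrameScore:
    index: int         # position of the frame in the source video
    anomaly: float     # frame anomaly estimate (e.g., from a PaDiM-style model [21])
    quality: float     # image quality estimate (sharpness, exposure, blur)
    detections: float  # cumulative confidence of auto-detected objects

def piv(frame: FrameScore, weights=(1.0, 1.0, 1.0)) -> float:
    # Cumulative Potential Information Value of a single frame:
    # a weighted sum of the three per-frame estimates (weights assumed).
    w_a, w_q, w_d = weights
    return w_a * frame.anomaly + w_q * frame.quality + w_d * frame.detections

def rank_frames(frames, overlap, threshold=0.8):
    # Step 1: prune overlapped frames -- skip a frame that mostly repeats
    # the previously kept one. overlap() is assumed to return a similarity
    # in [0, 1], e.g., derived from optical-flow displacement [22].
    kept, last = [], None
    for frame in frames:
        if last is None or overlap(last, frame) < threshold:
            kept.append(frame)
            last = frame
    # Step 2: sort the surviving frames by descending PIV so the operator
    # sees the most informative frames first.
    return sorted(kept, key=piv, reverse=True)

With such an ordering, labels drawn on the top-ranked frames can then be spread back over the pruned, overlapping frames of the original video.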

Keywords: fast labeling pipeline, FLP, unmanned aerial vehicle, UAVs, long-endurance UAVs, adversarial attack, frames potential information value, PIV

References
  1. Zhao Z., Zheng P., Xu S., Wu X. Object detection with deep learning: A review. arXiv, 2019, arXiv:1807.05511. https://doi.org/10.48550/arXiv.1807.05511
  2. Ren S., He K., Girshick R., Sun J. Faster R-CNN: Towards real-time object detection with region proposal networks. Advances in Neural Information Processing Systems, 2015, vol. 28, pp. 91–99.
  3. Liu S., Liu Z. Multi-channel CNN-based object detection for enhanced situation awareness. Sensors & Electronics Technology (SET) Panel Symposium SET-241, 9th NATO Military Sensing Symposium, 2017.
  4. Mahalanobis A., McIntosh B. A comparison of target detection algorithms using DSIAC ATR algorithm development data set. Proceedings of SPIE, 2019, vol. 10988, pp. 1098808. https://doi.org/10.1117/12.2517423
  5. Liu W., Anguelov D., Erhan D., Szegedy C., Reed S., Fu C.-Y., Berg A. SSD: Single shot multibox detector. Lecture Notes in Computer Science, 2016, vol. 9905, pp. 21–37. https://doi.org/10.1007/978-3-319-46448-0_2
  6. Ronneberger O., Fischer P., Brox T. U-Net: Convolutional networks for biomedical image segmentation. Lecture Notes in Computer Science, 2015, vol. 9351, pp. 234–241. https://doi.org/10.1007/978-3-319-24574-4_28
  7. Redmon J., Farhadi A. YOLO9000: better, faster, stronger. Proc. of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 6517–6525. https://doi.org/10.1109/CVPR.2017.690
  8. Chen H-W., Reyes M., Marquand B., Robie D. Advanced automated target recognition (ATR) and multi-target tracker (MTT) with electro-optical (EO) sensors. Proceedings of SPIE, 2020, vol. 11511, pp. 115110V. https://doi.org/10.1117/12.2567178
  9. Redmon J., Farhadi A. YOLOv3: An incremental improvement. arXiv, 2018, arXiv:1804.02767v1. https://doi.org/10.48550/arXiv.1804.02767
  10. Wang C.Y., Bochkovskiy A., Liao H.Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. Proc. of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 7464–7475. https://doi.org/10.1109/CVPR52729.2023.00721
  11. Fedulin A.M., Evstaf'ev D.V., Kondrashova G.L., Artemenko N.V. Human-autonomy teaming interface design for multiple-UAV control. Russian Aeronautics, 2022, vol. 65, no. 2, pp. 419–424. https://doi.org/10.3103/S1068799822020222
  12. Barnell M., Raymond C., Capraro Ch., Isereau D., Cicotta Ch., Stokes N. High-performance computing (HPC) and machine learning demonstrated in flight using agile condor. Proc. of the 2018 IEEE High Performance extreme Computing Conference (HPEC), 2018, pp. 1–4. https://doi.org/10.1109/HPEC.2018.8547797
  13. Fedulin A.M., Driagin D.M. Prospects of MALE-class UAVs using for the huge territories aerial survey. Izvestiya SFedU. Engineering Sciences, 2021, no. 1(218), pp. 271–281. (in Russian). https://doi.org/10.18522/2311-3103-2021-1-271-281
  14. Fei-Fei L., Fergus R., Perona P. One-shot learning of object categories. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006, vol. 28, no. 4, pp. 594–611. https://doi.org/10.1109/tpami.2006.79
  15. Fink M. Object classification from a single example utilizing class relevance metrics. Advances in Neural Information Processing Systems, 2004, vol. 17, pp. 449–456.
  16. Alajaji D., Alhichri H.S., Ammour N., Alajlan N. Few-shot learning for remote sensing scene classification. Proc. of the 2020 Mediterranean and Middle-East Geoscience and Remote Sensing Symposium (M2GARSS), pp. 81–84. https://doi.org/10.1109/M2GARSS47143.2020.9105154
  17. Sager Ch., Janiesch Ch., Zschech P. A survey of image labelling for computer vision applications. Journal of Business Analytics, 2021, vol. 4, no. 2, pp. 91–110. https://doi.org/10.1080/2573234X.2021.1908861
  18. Oprea A., Vassilev A., Fordyce A., Anderson H. Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations: Report NIST AI 100-2e2023. 107 p. https://doi.org/10.6028/NIST.AI.100-2e2023
  19. Choi J.I., Tian Q. Adversarial attack and defense of YOLO detectors in autonomous driving scenarios. Proc. of the 2022 IEEE Intelligent Vehicles Symposium (IV), 2022, pp. 1011–1017. https://doi.org/10.1109/IV51971.2022.9827222
  20. Wang C.Y., Bochkovskiy A., Liao H.Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 7464–7475. https://doi.org/10.1109/cvpr52729.2023.00721
  21. Defard T., Setkov A., Loesch A., Audigier R. PaDiM: A Patch distribution modeling framework for anomaly detection and localization. Lecture Notes in Computer Science, 2021, vol. 12664, pp. 475–489. https://doi.org/10.1007/978-3-030-68799-1_35
  22. Ilg E., Mayer N., Saikia T., Keuper M., Dosovitskiy A., Brox T. FlowNet 2.0: Evolution of optical flow estimation with deep networks. Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 1647–1655. https://doi.org/10.1109/cvpr.2017.179
  23. Guillermo M., Billones R.K., Bandala A., Vicerra R.R., Sybingco E., Dadios E.P., Fillone A. Implementation of automated annotation through Mask RCNN object detection model in CVAT using AWS EC2 instance. Proc. of the 2020 IEEE Region 10 Conference (TENCON), 2020, pp. 708–713. https://doi.org/10.1109/tencon50793.2020.9293906



This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License