doi: 10.17586/2226-1494-2021-21-1-92-101


GOODPOINT: UNSUPERVISED LEARNING OF KEY POINT DETECTION AND DESCRIPTION 

A. V. Belikov, A. S. Potapov, A. V. Yashchenko


Article in English

For citation:
Belikov A.V., Potapov A.S., Yashchenko A.V. Goodpoint: unsupervised learning of key point detection and description. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2021, vol. 21, no. 1, pp. 92–101.
doi: 10.17586/2226-1494-2021-21-1-92-101


Abstract
Subject of Research. The paper studies algorithms for key point detection and description, which are widely used in computer vision. Typically, a corner detector, including neural network detectors, acts as the key point detector. For some types of medical images, such detectors are of limited use because they find too few key points. The paper considers the problem of training a neural network key point detector on unlabeled images. Method. We propose a definition of key points that does not depend on specific visual features, and consider a method for training a neural network model that detects and describes key points on unlabeled data. The method is based on homographic image transformations: the model is trained to detect the same key points on pairs of noisy images related by a homography. Only positive examples are used for detector training, namely points correctly matched with the features produced by the descriptor part of the model. Main Results. The neural network model is trained with the unsupervised learning algorithm. For ease of comparison, the proposed model has an architecture similar to the supervised model and the same number of parameters. The model is evaluated on three different datasets: natural images, synthetic images, and retinal photographs. It performs similarly to the supervised model on natural images and better on retinal photographs, and its results improve further after additional training on images from the target domain, which is an advantage over a model trained on a labeled dataset. For comparison, the harmonic mean of the following metrics is used: descriptor matching accuracy and matching depth, key point repeatability, and image coverage. Practical Relevance. The proposed algorithm makes it possible to train a neural network key point detector together with the feature extraction model on images from the target domain without costly dataset labeling, reducing the labor cost of developing a system that uses the detector.
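The self-supervision signal described in the abstract can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function names, the mutual nearest-neighbor matching rule, and the pixel tolerance `tol` are illustrative choices. Key points detected in an image and in its homographic warp are matched by their descriptors, and only matches consistent with the known homography survive as positive training examples; a harmonic mean of quality metrics, as used for model comparison, is also shown.

```python
import numpy as np

def warp_points(points, H):
    """Apply a 3x3 homography H to an (N, 2) array of (x, y) points."""
    pts_h = np.hstack([points, np.ones((len(points), 1))])  # homogeneous coords
    warped = pts_h @ H.T
    return warped[:, :2] / warped[:, 2:3]                   # back to Cartesian

def mutual_matches(desc_a, desc_b):
    """Mutual nearest-neighbor matching of L2-normalized descriptor rows."""
    sim = desc_a @ desc_b.T                  # cosine similarity matrix
    ab = sim.argmax(axis=1)                  # best match in b for each a
    ba = sim.argmax(axis=0)                  # best match in a for each b
    idx_a = np.where(ba[ab] == np.arange(len(desc_a)))[0]   # keep mutual pairs
    return idx_a, ab[idx_a]

def positive_pairs(pts_a, pts_b, desc_a, desc_b, H, tol=3.0):
    """Keep only descriptor matches consistent with the known homography H.

    These surviving pairs are the 'positive examples' used to train
    the detector in the scheme described above.
    """
    ia, ib = mutual_matches(desc_a, desc_b)
    err = np.linalg.norm(warp_points(pts_a[ia], H) - pts_b[ib], axis=1)
    keep = err < tol                         # reprojection error in pixels
    return ia[keep], ib[keep]

def harmonic_mean(metrics):
    """Harmonic mean of strictly positive quality metrics."""
    m = np.asarray(metrics, dtype=float)
    return len(m) / np.sum(1.0 / m)
```

In the paper's setting, `desc_a` and `desc_b` would come from the description head of the network and `H` from the sampled homographic augmentation; the harmonic mean would combine matching accuracy, matching depth, repeatability, and coverage into a single comparison score.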

Keywords: unsupervised learning, deep learning, key point detection, local features

References
1. Harris C., Stephens M. A combined corner and edge detector. Proc. of the Alvey Vision Conference, UK, Manchester, 1988, pp. 23.1–23.6. doi: 10.5244/C.2.23
2. Funayama R., Yanagihara H., Van Gool L., Tuytelaars T., Bay H. Robust interest point detector and descriptor. Patent US8165401 B2, 2012.
3. Rosten E., Drummond T. Machine learning for high-speed corner detection. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2006, vol. 3951, pp. 430–443. doi: 10.1007/11744023_34
4. Sarlin P.E., DeTone D., Malisiewicz T., Rabinovich A. SuperGlue: Learning feature matching with graph neural networks. Proc. of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2020, pp. 4937–4946. doi: 10.1109/CVPR42600.2020.00499
5. Mitchell T.M. Machine Learning. McGraw Hill, 1997, 414 p.
6. DeTone D., Malisiewicz T., Rabinovich A. SuperPoint: Self-supervised interest point detection and description. Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2018, pp. 337–349. doi: 10.1109/CVPRW.2018.00060
7. Truong P., Apostolopoulos S., Mosinska A., Stucky S., Ciller C., Zanet S.D. GLAMpoints: Greedily learned accurate match points. Proc. of the IEEE International Conference on Computer Vision, Korea, Seoul, 2019, pp. 10732–10741. doi: 10.1109/ICCV.2019.01083
8. Jakab T., Gupta A., Bilen H., Vedaldi A. Unsupervised learning of object landmarks through conditional image generation. Advances in Neural Information Processing Systems, 2018, pp. 4016–4027.
9. Kulkarni T.D., Gupta A., Ionescu C., Borgeaud S., Reynolds M., Zisserman A., Mnih V. Unsupervised learning of object keypoints for perception and control. Advances in Neural Information Processing Systems, 2019, vol. 32, pp. 10723–10733.
10. Ono Y., Trulls E., Fua P., Yi K.M. LF-Net: learning local features from images. Advances in Neural Information Processing Systems, 2018, pp. 6234–6244.
11. Maas A.L., Hannun A.Y., Ng A.Y. Rectifier nonlinearities improve neural network acoustic models. Proc. 30th International Conference on Machine Learning, USA, Atlanta, 2013, pp. 3.
12. Loshchilov I., Hutter F. Decoupled weight decay regularization. Proc. 7th International Conference on Learning Representations (ICLR 2019), 2019.
13. Lin T.-Y., Maire M., Belongie S., Hays J., Perona P., Ramanan D., Dollár P., Zitnick C.L. Microsoft COCO: Common objects in context. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2014, vol. 8693, pp. 740–755. doi: 10.1007/978-3-319-10602-1_48
14. Yashchenko A.V., Belikov A.V., Peterson M.V., Potapov A.S. Distillation of neural network models for detection and description of image key points. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2020, vol. 20, no. 3, pp. 402–409. (in Russian). doi: 10.17586/2226-1494-2020-20-3-402-409
15. Irschara A., Zach C., Frahm J.M., Bischof H. From structure-from-motion point clouds to fast location recognition. Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2009, pp. 2599–2606.
16. Hernandez-Matas C., Zabulis X., Triantafyllou A., Anyfanti P., Douma S., Argyros A.A. FIRE: Fundus image registration dataset. Journal for Modeling in Ophthalmology, 2017, vol. 1, no. 4, pp. 16–28.



This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
