doi: 10.17586/2226-1494-2018-18-4-663-668


DETECTION OF SPOOFING ATTACKS ON SPEAKER VERIFICATION SYSTEMS IN TELEPHONE CHANNEL

G. M. Lavrentyeva


Article in Russian

For citation: Lavrentyeva G.M. Detection of spoofing attacks on speaker verification systems in telephone channel. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2018, vol. 18, no. 4, pp. 663–668 (in Russian). doi: 10.17586/2226-1494-2018-18-4-663-668

Abstract

Subject of Research. This paper addresses the detection of attacks on voice biometric systems (spoofing attacks) in the telephone channel. Spoofing detection currently attracts considerable interest in the field of voice-based speaker authentication. The results of the Automatic Speaker Verification Spoofing and Countermeasures Challenges of 2015 and 2017, which were dedicated to the isolated task of spoofing detection, confirmed that detecting unknown types of attacks in the microphone channel is promising. However, the same task in the telephone channel remains highly relevant, for example, in the banking sector. Method. The aim of the work was to study the applicability of the deep learning approach to this problem, in particular convolutional neural networks with the Max-Feature-Map activation function. Main Results. Experiments on real telephone attacks showed that systems trained on data with an emulated telephone channel are insufficiently effective. For this reason, a database of real spoofing attacks in the telephone channel was collected. The best system achieved an equal error rate (EER) of 1.5% on the subset of replay attacks, 1.7% on voice conversion attacks, and 2.8% on voice synthesis attacks. The experiments show the need to account for different recording conditions, because a large number of factors influence the channel. Practical Relevance. The results can be applied in the field of voice biometrics. The presented methods can be used in automatic speaker verification and identification systems to detect spoofing attacks on them.
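The two technical notions named in the abstract, the Max-Feature-Map activation and the equal error rate, can be illustrated with a minimal NumPy sketch. It is not the author's implementation: the function names, the channel-axis convention, and the assumption that higher scores mean "more likely genuine" are illustrative choices made here. Max-Feature-Map is shown in its common form, where the channel axis is split into two halves and the element-wise maximum is kept; the EER is approximated at the score threshold where the false acceptance and false rejection rates are closest.

```python
import numpy as np


def max_feature_map(x, axis=1):
    """Max-Feature-Map: split the channel axis into two halves and keep
    the element-wise maximum, halving the number of channels."""
    if x.shape[axis] % 2 != 0:
        raise ValueError("the channel count must be even")
    first, second = np.split(x, 2, axis=axis)
    return np.maximum(first, second)


def equal_error_rate(genuine_scores, spoof_scores):
    """Approximate EER, assuming higher scores mean 'more likely genuine'
    (an assumption of this sketch, not something stated in the abstract)."""
    genuine = np.asarray(genuine_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)
    # Every observed score is tried as a decision threshold.
    thresholds = np.unique(np.concatenate([genuine, spoof]))
    # False rejection rate: genuine trials scored below the threshold.
    frr = np.array([(genuine < t).mean() for t in thresholds])
    # False acceptance rate: spoof trials scored at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])
    # Take the threshold where the two error rates are closest.
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0
```

For a feature tensor of shape (batch, channels, height, width), `max_feature_map(x, axis=1)` returns a tensor with half as many channels. `equal_error_rate` takes two score lists, one for genuine trials and one for spoofed trials, and returns a fraction; multiplying by 100 gives a percentage comparable in form to the figures reported above.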


Keywords: spoofing detection, channel variation, CNN

Acknowledgements. The study was carried out within the applied research and experimental development project "Development of technology for automatic bimodal face and voice verification with protection against the use of false biometric samples". This work was financially supported by the Ministry of Education and Science of the Russian Federation, Contract 14.578.21.0189 dated 3/10/2016 (ID RFMEFI57816X0189).




This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
