DOI: 10.17586/2226-1494-2018-18-3-428-436


G. M. Lavrentyeva, S. A. Novoselov, A. V. Kozlov, O. Y. Kudashev, V. L. Shchemelinin, Y. N. Matveev, M. De Marsico

Article in Russian

For citation: Lavrentyeva G.M., Novoselov S.A., Kozlov A.V., Kudashev O.Yu., Shchemelinin V.L., Matveev Yu.N., De Marsico M. Audio-replay attacks spoofing detection for speaker recognition systems. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2018, vol. 18, no. 3, pp. 428–436 (in Russian). doi: 10.17586/2226-1494-2018-18-3-428-436


Subject of Research. This work considers the problem of detecting replay attacks on voice biometric systems. Because replay attacks are simple to mount, impostors are more likely to use them, which makes them a particular risk. The paper describes the replay attack detection system that was presented at the Automatic Speaker Verification Spoofing and Countermeasures (ASVspoof) Challenge 2017, which focused on this problem. Method. We study the efficiency of the deep learning approach for this task, in particular convolutional neural networks with the Max-Feature-Map activation function. Main Results. Experimental results obtained on the Challenge corpora demonstrate the high performance of this approach compared with current state-of-the-art baseline systems. Our primary system achieved an equal error rate (EER) of 6.73% on the evaluation part of the corpora, which is a 72% relative improvement over the ASVspoof 2017 baseline system. Practical Relevance. The results of the work can be applied in the field of voice biometrics. The presented methods can be used in automatic speaker verification and identification systems to detect spoofing attacks against them.
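The Max-Feature-Map activation named in the Method part can be illustrated with a minimal sketch. It assumes the usual Light CNN formulation, in which the channels of a feature map are split into two halves and the element-wise maximum of the halves is kept, so the number of channels is halved. The function name `max_feature_map`, the channel-first layout and the toy input are assumptions made for illustration; this is not the authors' implementation.

```python
import numpy as np


def max_feature_map(x: np.ndarray, axis: int = 1) -> np.ndarray:
    """Max-Feature-Map sketch: halve the channel axis by an element-wise max.

    Assumption: `axis` is the channel axis (1 for a batch, channels, height,
    width layout) and holds an even number of channels.
    """
    if x.shape[axis] % 2 != 0:
        raise ValueError("Max-Feature-Map needs an even number of channels")
    # Split the channels into two equal groups and keep the larger response.
    first_half, second_half = np.split(x, 2, axis=axis)
    return np.maximum(first_half, second_half)


# Toy input: batch of 2 feature maps with 4 channels of size 3x3.
rng = np.random.default_rng(0)
features = rng.standard_normal((2, 4, 3, 3)).astype(np.float32)
activated = max_feature_map(features)
print(activated.shape)  # channel count drops from 4 to 2
```

Unlike ReLU, which thresholds each channel independently, this operation lets pairs of channels compete, which acts as a form of feature selection while reducing the width of the following layer.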

Keywords: spoofing, replay attack detection, CNN, RNN, ASVspoof

Acknowledgements. This work was financially supported by the Ministry of Education and Science of the Russian Federation, Contract 14.578.21.0189 from 3.10.2016 (ID RFMEFI57816X0189).


This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License