Keywords: speech synthesis, voice restoration, hidden Markov models, Unit Selection, speech modification