
Evaluation of speaker de-identification based on voice gender and age conversion



[1] S. Ribaric, A. Ariyaeeinia and N. Pavesic, “De-identification for privacy protection in multimedia content: A survey”, Signal Processing: Image Communication, 2016, 47, 131–151, doi: 10.1016/j.image.2016.05.020.

[2] A. Sayadian and F. Mozaffari, “A novel method for voice conversion based on non-parallel corpus”, International Journal of Speech Technology, 2017, 20, (3), 587–592, doi: 10.1007/s10772-017-9430-4.

[3] H. Valbret, E. Moulines and J. P. Tubach, “Voice transformation using PSOLA technique”, Speech Communication, 1992, 11, (2-3), 175–187, doi: 10.1016/0167-6393(92)90012-V.

[4] Q. Jin, A. R. Toth, T. Schultz et al, “Voice convergin: Speaker de-identification by voice transformation”, Proc. 2009 IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP 2009), Taipei, Taiwan, April 2009, pp. 3909–3912, doi: 10.1109/ICASSP.2009.4960482.

[5] T. Justin, V. Štruc, S. Dobrišek et al, “Speaker de-identification using diphone recognition and speech synthesis”, Proc. 11th IEEE Int. Conf. and Workshops Automatic Face and Gesture Recognition (FG 2015), Ljubljana, Slovenia, May 2015, pp. 1–7, doi: 10.1109/FG.2015.7285021.

[6] M. Faundez-Zanuy, E. Sesa-Nogueras and S. Marinozzi, “Speaker identification experiments under gender de-identification”, Proc. 49th Annual IEEE Int. Carnahan Conf. Security Technology (ICCST 2015), Taipei, Taiwan, September 2015, pp. 309–314, doi: 10.1109/CCST.2015.7389702.

[7] C. Magarinos, P. Lopez-Otero, L. Docio-Fernandez et al, “Reversible speaker de-identification using pre-trained transformation functions”, Computer Speech and Language, 2017, 46, 36–52, doi: 10.1016/j.csl.2017.05.001.

[8] M. Abou-Zleikha, Z.-H. Tan, M. G. Christensen et al, “A discriminative approach for speaker selection in speaker de-identification systems”, Proc. 23rd European Signal Processing Conf. (EUSIPCO 2015), Nice, France, August 2015, pp. 2102–2106, doi: 10.1109/EUSIPCO.2015.7362755.

[9] R. Vích, J. Přibil and Z. Smékal, “New cepstral zero-pole vocal tract models for TTS synthesis”, Proc. IEEE Region 8 EUROCON’2001, vol. 2, Section S22 - Speech Compression and DSP, Bratislava, Slovakia, July 2001, pp. 458–462.

[10] D. A. Reynolds and R. C. Rose, “Robust text-independent speaker identification using Gaussian mixture speaker models”, IEEE Transactions on Speech and Audio Processing, 1995, 3, 72–83, doi: 10.1109/89.365379.

[11] F. Burkhardt, A. Paeschke, M. Rolfes et al, “A database of German emotional speech”, Proc. 9th European Conf. Speech Communication and Technology (INTERSPEECH 2005), Lisbon, Portugal, September 2005, pp. 1517–1520, doi: 10.21437/Interspeech.2005-446.

[12] P. Klosowski, A. Dustor and J. Izydorczyk, “Speaker verification performance evaluation based on open source speech processing software and TIMIT speech corpus”, in P. Gaj et al (Eds.), Communications in Computer and Information Science, vol. 522, Springer International Publishing, Switzerland, 2015, pp. 400–409, doi: 10.1007/978-3-319-19419-6_38.

[13] M. Fleischer, S. Pinkert, W. Mattheus et al, “Formant frequencies and bandwidths of the vocal tract transfer function are affected by the mechanical impedance of the vocal tract wall”, Biomechanics and Modeling in Mechanobiology, 2015, 14, (4), 719–733, doi: 10.1007/s10237-014-0632-2.

[14] M. P. Gelfer and Q. E. Bennett, “Speaking fundamental frequency and vowel formant frequencies: Effects on perception of gender”, Journal of Voice, 2013, 27, (5), 556–566, doi: 10.1016/j.jvoice.2012.11.008.

[15] K. Pisanski, B. C. Jones, B. Fink et al, “Voice parameters predict sex-specific body morphology in men and women”, Animal Behaviour, 2016, 112, 13–32, doi: 10.1016/j.anbehav.2015.11.008.

[16] U. Reubold, J. Harrington and F. Kleber, “Vocal aging effects on F0 and the first formant: A longitudinal analysis in adult speakers”, Speech Communication, 2010, 52, (7-8), 638–651, doi: 10.1016/j.specom.2010.02.012.

[17] C. M. Bishop, “Pattern Recognition and Machine Learning”, Springer, 2006.

[18] G. Muhammad and K. Alghathbar, “Environment recognition for digital audio forensics using MPEG-7 and mel cepstral features”, Journal of Electrical Engineering, 2011, 62, (4), 199–205, doi: 10.2478/v10187-011-0032-0.

[19] J. Přibil and A. Přibilová, “GMM-based evaluation of emotional style transformation in Czech and Slovak”, Cognitive Computation, 2014, 6, (4), 928–939, doi: 10.1007/s12559-014-9283-y.

[20] J. Přibil and A. Přibilová, “Comparison of text-independent original speaker recognition from emotionally converted speech”, in A. Esposito et al (Eds.), Smart Innovation, Systems and Technologies, vol. 48, Springer, 2016, pp. 137–149, doi: 10.1007/978-3-319-28109-4_14.

[21] J. Přibil, A. Přibilová and J. Matoušek, “GMM-based speaker age and gender classification in Czech and Slovak”, Journal of Electrical Engineering, 2017, 68, (1), 3–12, doi: 10.1515/jee-2017-0001.

[22] B. Božilovic, B. M. Todorovic and M. Obradovic, “Text independent speaker recognition using two-dimensional information entropy”, Journal of Electrical Engineering, 2015, 66, (3), 169–173, doi: 10.2478/jee-2015-0027.

[23] I. T. Nabney, “Netlab Pattern Analysis Toolbox, Release 3”, http://www.aston.ac.uk/eas/research/groups/ncrg/resources/netlab/downloads, accessed 2 October 2015.
