TEDxSK and JumpSK: A New Slovak Speech Recognition Dedicated Corpus

[1] Koctúr, T., Juhár, J., Viszlay, P., Staš, J., and Lojka, M. (2016). Unsupervised speech transcription and alignment based on two complementary ASR systems. In Proceedings of RADIOELEKTRONIKA 2016, pages 358–362, Košice, Slovakia. DOI: 10.1109/RADIOELEK.2016.7477435

[2] Rousseau, A., Deléglise, P., and Estève, Y. (2012). TED-LIUM: An automatic speech recognition dedicated corpus. In Proceedings of LREC 2012, pages 125–129, Istanbul, Turkey.

[3] Deléglise, P., Estève, Y., Meignier, S., and Merlin, T. (2009). Improvements to the LIUM French ASR system based on CMU Sphinx: What helps to significantly reduce the word error rate? In Proceedings of INTERSPEECH 2009, pages 2123–2126, Brighton, UK. DOI: 10.21437/Interspeech.2009-607

[4] Žgank, A., Maučec, M. S., and Verdonik, D. (2016). The SI TEDx-UM speech database: A new Slovenian spoken language resource. In Proceedings of LREC 2016, pages 4670–4673, Portorož, Slovenia.

[5] Rousseau, A., Deléglise, P., and Estève, Y. (2014). Enhancing the TED-LIUM corpus with selected data for language modeling and more TED talks. In Proceedings of LREC 2014, pages 3935–3939, Reykjavik, Iceland.

[6] Leeuwis, E., Federico, M., and Cettolo, M. (2003). Language modeling and transcription of the TED corpus lectures. In Proceedings of ICASSP 2003, pages 232–235, Hong Kong, China. DOI: 10.1109/ICASSP.2003.1198760

[7] Cettolo, M., Brugnara, F., and Federico, M. (2004). Advances in the automatic transcription of lectures. In Proceedings of ICASSP 2004, pages 769–772, Montreal, Canada. DOI: 10.1109/ICASSP.2004.1326099

[8] Niesler, T. and Willett, D. (2002). Unsupervised language model adaptation for lecture speech transcription. In Proceedings of ICSLP 2002, pages 1413–1416, Denver, Colorado, USA. DOI: 10.21437/ICSLP.2002-63

[9] Wölfel, M. and Berger, S. (2005). The ISL baseline lecture transcription system for the TED corpus. Tech. Rep., Karlsruhe University, Germany.

[10] Naptali, W. and Kawahara, T. (2012). Automatic transcription of TED talks. In Proceedings of the 6th Spoken Document Processing Workshop, SDPWS 2012, Toyohashi, Japan.

[11] Bell, P., Yamamoto, H., Swietojanski, P., Wu, Y., McInnes, F., Hori, Ch., and Renals, S. (2013). A lecture transcription system combining neural network acoustic and language models. In Proceedings of INTERSPEECH 2013, pages 3081–3091, Lyon, France. DOI: 10.21437/Interspeech.2013-673

[12] Nanjo, H., Shitaoka, K., and Kawahara, T. (2003). Automatic transformation of lecture transcription into document style using statistical framework. In Proceedings of ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition, SSPR 2003, Tokyo, Japan.

[13] Hsu, B.-J. and Glass, J. (2009). Language model parameter estimation using user transcriptions. In Proceedings of ICASSP 2009, pages 4805–4808, Taipei, Taiwan. DOI: 10.1109/ICASSP.2009.4960706

[14] Akita, Y., Watanabe, M., and Kawahara, T. (2012). Automatic transcription of lecture speech using language model based on speaking-style transformation of proceedings texts. In Proceedings of INTERSPEECH 2012, pages 2326–2329, Portland, Oregon, USA. DOI: 10.21437/Interspeech.2012-610

[15] Viszlay, P., Staš, J., Koctúr, T., Lojka, M., and Juhár, J. (2016). An extension of the Slovak broadcast news corpus based on semi-automatic annotation. In Proceedings of LREC 2016, pages 4684–4687, Portorož, Slovenia.

[16] Vavrek, J., Viszlay, P., Kiktová, E., Lojka, M., Juhár, J., and Čižmár, A. (2014). Query-by-example retrieval via fast sequential dynamic time warping algorithm. In Proceedings of the 37th International Conference on Telecommunications and Signal Processing, TSP 2014, pages 453–457, Berlin, Germany.

[17] Staš, J., Viszlay, P., Lojka, M., Koctúr, T., Hládek, D., Kiktová, E., Pleva, M., and Juhár, J. (2015). Automatic subtitling system for transcription, archiving and indexing of Slovak audiovisual recordings. In Proceedings of the 7th Language & Technology Conference, LTC 2015, pages 186–191, Poznań, Poland.

[18] Lee, A., Kawahara, T., and Shikano, K. (2001). Julius – An open source real-time large vocabulary recognition engine. In Proceedings of EUROSPEECH 2001, pages 1691–1694, Aalborg, Denmark. DOI: 10.21437/Eurospeech.2001-396

[19] Lojka, M., Ondáš, S., Pleva, M., and Juhár, J. (2014). Multi-threaded parallel speech recognition for mobile applications. Journal of Electrical and Electronics Engineering, 7(1):81–86.

[20] Rusko, M., Juhár, J., Trnka, M., Staš, J., Darjaa, S., Hládek, D., Sabo, R., Pleva, M., Ritomský, M., and Ondáš, S. (2016). Advances in the Slovak judicial domain dictation system. In Vetulani, Z., Uszkoreit, H., and Kubis, M., editors, Human Language Technology: Challenges for Computer Science and Linguistics, LNAI 9561, pages 55–67, Springer International Publishing Switzerland. DOI: 10.1007/978-3-319-43808-5_5

[21] Koctúr, T., Staš, J., and Juhár, J. (2016). Unsupervised acoustic corpora building based on variable confidence measure thresholding. In Proceedings of the 58th International Symposium ELMAR 2016, pages 31–34, Zadar, Croatia. DOI: 10.1109/ELMAR.2016.7731748

[22] Darjaa, S., Cerňak, M., Trnka, M., and Rusko, M. (2011). Effective triphone mapping for acoustic modeling in speech recognition. In Proceedings of INTERSPEECH 2011, pages 1717–1720, Florence, Italy. DOI: 10.21437/Interspeech.2011-190

[23] Stolcke, A. (2002). SRILM – An extensible language modeling toolkit. In Proceedings of ICSLP 2002, pages 901–904, Denver, Colorado, USA. DOI: 10.21437/ICSLP.2002-303

[24] Staš, J. and Juhár, J. (2015). Modeling of the Slovak language for broadcast news transcription. Journal of Electrical and Electronics Engineering, 8(2):43–46.

[25] Hládek, D., Ondáš, S., and Staš, J. (2014). Online natural language processing of the Slovak language. In Proceedings of the 5th IEEE International Conference on Cognitive InfoCommunications, CogInfoCom 2014, pages 315–316, Vietri sul Mare, Italy. DOI: 10.1109/CogInfoCom.2014.7020469

[26] Fiscus, J. G. (1997). A post-processing system to yield reduced word error rates: Recognizer output voting error reduction (ROVER). In Proceedings of ASRU 1997, pages 347–352, Santa Barbara, CA, USA. DOI: 10.1109/ASRU.1997.659110

[27] Lojka, M. and Juhár, J. (2014). Hypothesis combination for Slovak dictation speech recognition. In Proceedings of the 56th International Symposium ELMAR 2014, pages 43–46, Zadar, Croatia. DOI: 10.1109/ELMAR.2014.6923311

[28] Staš, J., Hládek, D., and Juhár, J. (2016). Adding filled pauses and disfluent events into language models for speech recognition. In Proceedings of the 7th IEEE International Conference on Cognitive InfoCommunications, CogInfoCom 2016, Wroclaw, Poland. DOI: 10.1109/CogInfoCom.2016.7804538

eISSN: 1338-4287
ISSN: 0021-5597
Language: English

Publication timeframe: 2 times per year
Journal Subjects: Linguistics and Semiotics, Theoretical Frameworks and Disciplines, Linguistics, other


Published Online: Jan 24, 2018

Page range: 346–354

DOI: https://doi.org/10.1515/jazcas-2017-0044

Keywords: automatic annotation, speech recognition, speech corpus

© 2017 Ján Staš et al., published by De Gruyter Open

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.