TED-LIUM 3: Twice as Much Data and Corpus Repartition for Experiments on Speaker Adaptation

Abstract

In this paper, we present the TED-LIUM release 3 corpus (available at https://lium.univ-lemans.fr/ted-lium3/), dedicated to English speech recognition, which more than doubles the data available for training acoustic models compared with TED-LIUM 2. We present recent developments in Automatic Speech Recognition (ASR) systems in comparison with the two previous releases of the TED-LIUM corpus, from 2012 and 2014. We demonstrate that increasing the transcribed speech training data from 207 to 452 h benefits end-to-end ASR systems more than state-of-the-art HMM-based ones. This holds even though the HMM-based ASR system still outperforms the end-to-end ASR system when trained on 452 h of audio, with Word Error Rates (WER) of 6.7% and 13.7%, respectively. Finally, we propose two repartitions of the TED-LIUM release 3 corpus: the legacy repartition, identical to the one in release 2, and a new repartition, calibrated and designed for experiments on speaker adaptation. Like the first two releases, the TED-LIUM 3 corpus will be freely available to the research community.
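
The abstract's headline figures are a data-growth factor and two Word Error Rates. As a rough illustration only, the Python sketch below computes WER in the usual way, as the word-level edit distance divided by the number of reference words, and checks the growth factor from the hours quoted above. The function name and the toy sentences are hypothetical; this is not the authors' scoring tool, and real evaluations normally apply text normalization that is omitted here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Hypothetical helper for illustration; whitespace tokenization only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Training hours quoted in the abstract (TED-LIUM 2 vs. TED-LIUM 3).
growth_factor = 452 / 207

# Toy example: one deleted word out of six reference words.
toy_wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")

if __name__ == "__main__":
    print(f"growth factor: {growth_factor:.2f}")
    print(f"toy WER: {toy_wer:.3f}")
```

The growth factor comes out slightly above two, consistent with the "more than two" claim in the abstract.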

Bibliographic details

Source
International Conference on Speech and Computer | 2018 | pp. 198-208 | 11 pages
Authors
Francois Hernandez; Vincent Nguyen; Sahar Ghannay; Natalia Tomashenko; Yannick Esteve
Original format
PDF
Keywords
Speech recognition; Open-source corpus; Deep learning; Speaker adaptation; TED-LIUM
