International Conference on Speech and Computer

Recurrent DNNs and Its Ensembles on the TIMIT Phone Recognition Task



Abstract

In this paper, we investigate recurrent deep neural networks (DNNs) in combination with regularization techniques such as dropout, zoneout, and a regularization post-layer. As a benchmark, we chose the TIMIT phone recognition task due to its popularity and broad availability in the community. It also simulates a low-resource scenario, which makes it relevant for minor languages. We further prefer the phone recognition task because it is much more sensitive to acoustic model quality than a large-vocabulary continuous speech recognition task. In recent years, recurrent DNNs have pushed down the error rates in automatic speech recognition, but there has been no clear winner among the proposed architectures. Dropout was used as the regularization technique in most cases, while its combination with other regularization techniques and with model ensembles was omitted. In our experiments, an ensemble of recurrent DNNs performed best, achieving an average phone error rate of 14.84% over 10 experiments (minimum 14.69%) on the core test set, which is slightly lower than the best published PER to date, to our knowledge. Finally, in contrast to most papers, we publish open-source scripts to easily replicate the results and to help continue the development.
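The abstract mentions zoneout regularization and an ensemble of recurrent DNNs as the key ingredients. The following is a minimal NumPy sketch of both ideas, not the authors' published code: the function names are illustrative, and posterior averaging over models is assumed as one simple ensemble combination rule. Zoneout stochastically keeps each hidden unit from the previous timestep instead of updating it, and replaces the random mask with its expectation at inference time.

```python
import numpy as np

def zoneout(h_prev, h_new, rate, rng, training=True):
    """Zoneout: with probability `rate`, keep each hidden unit from the
    previous timestep instead of taking the new value. At inference time,
    interpolate with the expected value of the stochastic mask."""
    if not training:
        return rate * h_prev + (1.0 - rate) * h_new
    mask = (rng.random(h_prev.shape) < rate).astype(h_prev.dtype)
    return mask * h_prev + (1.0 - mask) * h_new

def ensemble_posteriors(per_model_posteriors):
    """Average per-frame phone posteriors over ensemble members before
    decoding (an assumed, simple way to combine the models)."""
    return np.mean(np.stack(per_model_posteriors, axis=0), axis=0)
```

With `rate = 0` zoneout reduces to an ordinary recurrent update, and with `rate = 1` the hidden state is frozen; intermediate rates trade off the two, which is what regularizes the recurrent dynamics.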


