Conference: National Conference on Communications

Investigating Target Set Reduction for End-to-End Speech Recognition of Hindi-English Code-Switching Data


Abstract

End-to-end (E2E) systems are fast replacing conventional systems in automatic speech recognition. Because the target labels are learned directly from speech data, E2E systems need a larger corpus for effective training. In the code-switching task, E2E systems face two challenges: (i) the expansion of the target set due to the multiple languages involved, and (ii) the lack of a sufficiently large domain-specific corpus. To address these challenges, we propose an approach for reducing the number of target labels so that E2E systems can be trained reliably on limited data. The efficacy of the proposed approach is demonstrated on two prominent architectures, namely CTC-based and attention-based E2E networks. The experimental validation is performed on a recently created Hindi-English code-switching corpus. For comparison, results for an E2E system based on the full target set and for a hybrid DNN-HMM system are also reported.
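As a rough illustration of challenge (i), the sketch below counts character-level targets for a monolingual and a code-switched transcript, then applies a many-to-one merge. The abstract does not describe the actual reduction scheme, so the example sentences, the `merge` table, and the helper names are all hypothetical and are not the authors' method.

```python
# Toy illustration of target-set growth in code-switched transcripts.
# All sentences and the merge table are made up for this sketch;
# they are NOT taken from the paper or its corpus.

hindi_only = ["मैं घर जा रहा हूँ"]
english_only = ["i am going home"]
code_switched = ["मैं home जा रहा हूँ"]


def target_set(transcripts):
    """Return the set of non-space characters used as output labels."""
    return {ch for line in transcripts for ch in line if not ch.isspace()}


hi = target_set(hindi_only)
en = target_set(english_only)
cs = target_set(code_switched)

# Hypothetical many-to-one table: a Latin letter is mapped onto a
# Devanagari label with a roughly similar sound, so both share one target.
merge = {"h": "ह", "m": "म"}


def reduce_targets(labels, table):
    """Replace each label by its merged representative, if one exists."""
    return {table.get(ch, ch) for ch in labels}


reduced = reduce_targets(cs, merge)

print("Hindi-only targets:   ", len(hi))
print("English-only targets: ", len(en))
print("Code-switched targets:", len(cs))
print("After toy merge:      ", len(reduced))
```

Because the two scripts share no characters, a code-switched transcript draws labels from both inventories. Any merge that maps labels from one script onto existing labels of the other shrinks the output layer, which is the general idea behind reducing the target set for training on limited data.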
