International Symposium on Chinese Spoken Language Processing

CTC regularized model adaptation for improving LSTM RNN based multi-accent Mandarin speech recognition


Abstract

This paper proposes a novel regularized adaptation method for long short-term memory (LSTM) recurrent neural network (RNN) acoustic models trained with the connectionist temporal classification (CTC) loss function (LSTM-RNN-CTC), with the aim of improving multi-accent Mandarin speech recognition. Directly adjusting the network parameters on a small adaptation set can lead to over-fitting. To avoid this, we add a regularization term to the original training criterion. The term forces the conditional probability distribution over initial and final (I/F) sequences estimated by the adapted model to stay close to that of the accent-independent (AI) model. In addition, the hidden layers of the LSTM RNN are kept fixed; only the accent-specific output layer is fine-tuned with this adaptation method. Experiments on the RASC863 and CASIA regional accent speech corpora show that the proposed method yields a clear improvement over the LSTM-RNN-CTC baseline model.
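
The abstract does not give the exact form of the regularization term, so the following is only a minimal sketch under stated assumptions. It uses a frame-level KL divergence between the AI and adapted output distributions as a stand-in for the sequence-level constraint the abstract describes, and it combines that penalty with a CTC loss value through a hypothetical interpolation weight `rho`. The function names, array shapes, and the placeholder CTC loss value are all illustrative and do not come from the paper.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_to_ai(ai_logits, adapted_logits):
    """Mean per-frame KL(p_AI || p_adapted) for (T, K) logit matrices.

    Assumption: a frame-level KL stands in for the sequence-level
    constraint over I/F sequences mentioned in the abstract.
    """
    eps = 1e-12
    p_ai = softmax(ai_logits)
    p_ad = softmax(adapted_logits)
    per_frame = np.sum(p_ai * (np.log(p_ai + eps) - np.log(p_ad + eps)), axis=-1)
    return float(np.mean(per_frame))

def regularized_loss(ctc_loss, ai_logits, adapted_logits, rho=0.5):
    """Interpolate a CTC loss value with the KL penalty.

    rho is a hypothetical weight: rho=0 is plain fine-tuning,
    rho=1 only pulls the adapted model back toward the AI model.
    """
    return (1.0 - rho) * ctc_loss + rho * kl_to_ai(ai_logits, adapted_logits)

rng = np.random.default_rng(0)
hidden = rng.normal(size=(20, 8))   # frozen hidden-layer outputs, T=20 frames
w_ai = rng.normal(size=(8, 5))      # AI output layer, K=5 labels (illustrative)
# Accent-specific output layer: the only parameters that would be updated.
w_adapted = w_ai + 0.1 * rng.normal(size=(8, 5))

ai_logits = hidden @ w_ai
adapted_logits = hidden @ w_adapted

# ctc_loss=3.2 is a placeholder, not a measured value.
loss = regularized_loss(ctc_loss=3.2, ai_logits=ai_logits,
                        adapted_logits=adapted_logits, rho=0.5)
```

In this sketch the hidden activations are shared by both models, which mirrors the idea that only the output layer is adapted; the penalty is zero when the adapted output layer equals the AI one and grows as the two distributions drift apart.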
