CTC regularized model adaptation for improving LSTM RNN based multi-accent Mandarin speech recognition



Abstract

This paper proposes a novel regularized adaptation method for long short-term memory (LSTM) recurrent neural network (RNN) acoustic models trained with the connectionist temporal classification (CTC) loss function (LSTM-RNN-CTC), with the aim of improving multi-accent Mandarin speech recognition. In general, directly adjusting the network parameters with a small adaptation set may lead to over-fitting. To avoid this problem, we add a regularization term to the original training criterion. The term forces the conditional probability distribution over initial and final (I/F) sequences estimated from the adapted model to stay close to that of the accent-independent (AI) model. Meanwhile, the hidden layers of the LSTM RNN are not adjusted; only the accent-specific output layer is fine-tuned with this adaptation method. Experiments on the RASC863 and CASIA regional accent speech corpora show that the proposed method yields a clear improvement over the LSTM-RNN-CTC baseline model.
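The abstract does not state the exact form of the regularization term, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes the penalty is a KL divergence between the AI model's and the adapted model's label distributions, computed per frame rather than over whole I/F sequences, and interpolated with a precomputed CTC loss. The names `kl_divergence`, `regularized_loss`, and the weight `rho` are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def regularized_loss(ctc_loss, ai_posteriors, adapted_posteriors, rho=0.5):
    """Interpolate the task loss with a penalty that keeps the adapted model
    close to the accent-independent (AI) model.

    ctc_loss: scalar CTC loss of the adapted model (assumed precomputed).
    ai_posteriors, adapted_posteriors: per-frame label distributions.
    rho: hypothetical weight in [0, 1]; rho = 0 is plain fine-tuning.
    """
    # Average the per-frame divergence so the penalty does not grow with length.
    penalty = sum(
        kl_divergence(p, q) for p, q in zip(ai_posteriors, adapted_posteriors)
    ) / len(ai_posteriors)
    return (1.0 - rho) * ctc_loss + rho * penalty


# Toy per-frame distributions over three labels (illustrative values only).
ai = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
drifted = [[0.4, 0.3, 0.3], [0.3, 0.4, 0.3]]

unchanged = regularized_loss(2.0, ai, ai, rho=0.5)    # penalty is zero
moved = regularized_loss(2.0, ai, drifted, rho=0.5)   # penalty is positive
```

In this sketch an adapted model that reproduces the AI model's outputs pays no penalty, while one that drifts away pays more as `rho` grows. In a real system only the accent-specific output layer's parameters would receive gradient updates from this loss, with the hidden LSTM layers left frozen, as the abstract describes.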


Bibliographic details

Source
International Symposium on Chinese Spoken Language Processing, 2016, pp. 1-5 (5 pages)
Authors
Jiangyan Yi; Hao Ni; Zhengqi Wen; Bin Liu; Jianhua Tao;
Original format: PDF
Keywords
Adaptation models; Hidden Markov models; Speech; Speech recognition; Training; Artificial intelligence; Mathematical model;


Similar literature


1. CTC Regularized Model Adaptation for Improving LSTM RNN Based Multi-Accent Mandarin Speech Recognition [J] . Jiangyan Yi, Zhengqi Wen, Jianhua Tao, Journal of signal processing systems for signal, image, and video technology . 2018, No. 7

2. RNN-based prosodic modeling for mandarin speech and its application to speech-to-text conversion [J] . Wern-Jun Wang, Yuan-Fu Liao, Sin-Horng Chen, Speech Communication . 2002, No. 3-4

3. Using Highway Connections to Enable Deep Small-footprint LSTM-RNNs for Speech Recognition [J] . Cheng Gaofeng, Li Xin, Yan Yonghong, Chinese Journal of Electronics . 2019, No. 1

4. CTC regularized model adaptation for improving LSTM RNN based multi-accent Mandarin speech recognition [C] . Jiangyan Yi, Hao Ni, Zhengqi Wen, International Symposium on Chinese Spoken Language Processing . 2016

5. RNN/LSTM Data Assimilation for the Lorenz Chaotic Models [D] . Vashistha, Harsh Vardhan. 2018

6. Regularized Speaker Adaptation of KL-HMM for Dysarthric Speech Recognition [O] . Myungjong Kim, Younggwan Kim, Joohong Yoo, -1

7. LSTM RNN-based Korean Speech Recognition System Using CTC [O] . Donghyun Lee, Minkyu Lim, Hosung Park, 2017

8. LSTM, GRU, Highway and a Bit of Attention: An Empirical Overview for Language Modeling in Speech Recognition. [R] . Irie, K., Tuske, Z., Alkhouli, T., 2016

