IEICE Transactions on Information and Systems

Latent Words Recurrent Neural Network Language Models for Automatic Speech Recognition

Abstract

This paper demonstrates latent words recurrent neural network language models (LW-RNN-LMs) for enhancing automatic speech recognition (ASR). LW-RNN-LMs are constructed so as to combine the advantages of both recurrent neural network language models (RNN-LMs) and latent words language models (LW-LMs). RNN-LMs can capture long-range context information and offer strong performance, while LW-LMs are robust on out-of-domain tasks thanks to their latent word space modeling. However, RNN-LMs cannot explicitly capture hidden relationships behind observed words since they have no notion of a latent variable space. In addition, LW-LMs cannot take long-range relationships between latent words into account. Our idea is to combine RNN-LM and LW-LM so as to compensate for each other's disadvantages. LW-RNN-LMs simultaneously support latent variable space modeling, as LW-LMs do, and long-range relationship modeling, as RNN-LMs do. From the viewpoint of RNN-LMs, an LW-RNN-LM can be considered a soft-class RNN-LM with a vast latent variable space. Conversely, from the viewpoint of LW-LMs, an LW-RNN-LM can be considered an LW-LM that uses an RNN structure for latent variable modeling instead of an n-gram structure. This paper also details a parameter inference method and two kinds of implementation methods, an n-gram approximation and a Viterbi approximation, for introducing the LW-RNN-LMs to ASR. Our experiments show the effectiveness of LW-RNN-LMs in a perplexity evaluation on the Penn Treebank corpus and an ASR evaluation on Japanese spontaneous speech tasks.
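To make the soft-class view concrete, the following is a minimal toy sketch in Python/NumPy of the structure the abstract describes: an RNN runs over the latent word sequence, and each latent word emits observed words, so the next-word distribution is a mixture over latent "soft classes". All names, the vanilla RNN cell, and the toy dimensions are illustrative assumptions, not the authors' implementation; note also that the latent history is itself unobserved, so exact prediction would sum over all latent sequences, which is what motivates the approximations mentioned in the abstract.

import numpy as np

rng = np.random.default_rng(0)
V, L, H = 50, 20, 16   # observed vocab, latent vocab, hidden size (toy values)

# Hypothetical parameters: latent word embeddings, a vanilla RNN cell,
# a softmax over latent words, and an emission matrix P(w | latent word).
E    = rng.normal(0.0, 0.1, (L, H))        # latent word embeddings
W_h  = rng.normal(0.0, 0.1, (H, H))        # recurrent weights
W_o  = rng.normal(0.0, 0.1, (H, L))        # hidden -> latent-word logits
emit = rng.dirichlet(np.ones(V), size=L)   # P(w | latent word), rows sum to 1

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def step(state, latent_prev):
    # One RNN step over the *latent* word sequence: the recurrent state is
    # driven by latent words, not by the observed words directly.
    state = np.tanh(E[latent_prev] + state @ W_h)
    p_latent = softmax(state @ W_o)        # P(h_t | latent word history)
    return state, p_latent

def next_word_dist(state, latent_prev):
    # Soft-class view: P(w_t | context) = sum_h P(w_t | h) P(h | history).
    # Each latent word acts as one "soft class" over the observed vocabulary.
    state, p_latent = step(state, latent_prev)
    return state, p_latent @ emit          # (L,) @ (L, V) -> (V,)

state = np.zeros(H)
state, p_w = next_word_dist(state, latent_prev=0)
print(p_w.shape, round(p_w.sum(), 6))      # (50,) 1.0 -- a valid distribution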
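The abstract names an n-gram approximation and a Viterbi approximation but does not define them here. One plausible reading of the Viterbi approximation, sketched below against the toy model above, is to score a word sequence using only the single best latent word at each step instead of the intractable sum over all latent sequences. The function name and the greedy simplification (true Viterbi would track all L hypotheses with backpointers) are assumptions for illustration.

def sentence_logprob_greedy(words):
    # Approximate log P(w_1..w_T): at each step, commit to the single
    # most probable latent word rather than marginalizing over all
    # latent word sequences. Reuses step() and emit from the sketch above.
    state, latent, logp = np.zeros(H), 0, 0.0
    for w in words:
        state, p_latent = step(state, latent)
        joint = p_latent * emit[:, w]      # score of each (latent, w) pair
        latent = int(np.argmax(joint))     # keep only the best latent word
        logp += np.log(joint[latent])
    return logp

print(sentence_logprob_greedy([3, 7, 1]))  # toy word ids; higher is better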
