IEICE Transactions on Information and Systems

Latent Words Recurrent Neural Network Language Models for Automatic Speech Recognition

Abstract

This paper demonstrates latent words recurrent neural network language models (LW-RNN-LMs) for enhancing automatic speech recognition (ASR). LW-RNN-LMs are constructed so as to combine the advantages of both recurrent neural network language models (RNN-LMs) and latent words language models (LW-LMs). RNN-LMs can capture long-range context information and offer strong performance, while LW-LMs are robust on out-of-domain tasks thanks to their latent word space modeling. However, RNN-LMs cannot explicitly capture hidden relationships behind observed words since they have no notion of a latent variable space. In addition, LW-LMs cannot take long-range relationships between latent words into account. Our idea is to combine RNN-LM and LW-LM so as to compensate for each other's disadvantages. LW-RNN-LMs simultaneously support latent variable space modeling, as LW-LMs do, and long-range relationship modeling, as RNN-LMs do. From the viewpoint of RNN-LMs, an LW-RNN-LM can be considered a soft-class RNN-LM with a vast latent variable space. Conversely, from the viewpoint of LW-LMs, an LW-RNN-LM can be considered an LW-LM that uses an RNN structure for latent variable modeling instead of an n-gram structure. This paper also details a parameter inference method and two kinds of implementation methods, an n-gram approximation and a Viterbi approximation, for introducing the LW-RNN-LMs to ASR. Our experiments show the effectiveness of LW-RNN-LMs in a perplexity evaluation on the Penn Treebank corpus and an ASR evaluation on Japanese spontaneous speech tasks.
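To make the soft-class view concrete, the following is a minimal toy sketch in Python/NumPy of the structure the abstract describes: an RNN runs over the latent word sequence, and each latent word emits observed words, so the next-word distribution is a mixture over latent "soft classes". All names, the vanilla RNN cell, and the toy dimensions are illustrative assumptions, not the authors' implementation; note also that the latent history is itself unobserved, so exact prediction would sum over all latent sequences, which is what motivates the approximations mentioned in the abstract.

import numpy as np

rng = np.random.default_rng(0)
V, L, H = 50, 20, 16   # observed vocab, latent vocab, hidden size (toy values)

# Hypothetical parameters: latent word embeddings, a vanilla RNN cell,
# a softmax over latent words, and an emission matrix P(w | latent word).
E    = rng.normal(0.0, 0.1, (L, H))        # latent word embeddings
W_h  = rng.normal(0.0, 0.1, (H, H))        # recurrent weights
W_o  = rng.normal(0.0, 0.1, (H, L))        # hidden -> latent-word logits
emit = rng.dirichlet(np.ones(V), size=L)   # P(w | latent word), rows sum to 1

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def step(state, latent_prev):
    # One RNN step over the *latent* word sequence: the recurrent state is
    # driven by latent words, not by the observed words directly.
    state = np.tanh(E[latent_prev] + state @ W_h)
    p_latent = softmax(state @ W_o)        # P(h_t | latent word history)
    return state, p_latent

def next_word_dist(state, latent_prev):
    # Soft-class view: P(w_t | context) = sum_h P(w_t | h) P(h | history).
    # Each latent word acts as one "soft class" over the observed vocabulary.
    state, p_latent = step(state, latent_prev)
    return state, p_latent @ emit          # (L,) @ (L, V) -> (V,)

state = np.zeros(H)
state, p_w = next_word_dist(state, latent_prev=0)
print(p_w.shape, round(p_w.sum(), 6))      # (50,) 1.0 -- a valid distribution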
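The abstract names an n-gram approximation and a Viterbi approximation but does not define them here. One plausible reading of the Viterbi approximation, sketched below against the toy model above, is to score a word sequence using only the single best latent word at each step instead of the intractable sum over all latent sequences. The function name and the greedy simplification (true Viterbi would track all L hypotheses with backpointers) are assumptions for illustration.

def sentence_logprob_greedy(words):
    # Approximate log P(w_1..w_T): at each step, commit to the single
    # most probable latent word rather than marginalizing over all
    # latent word sequences. Reuses step() and emit from the sketch above.
    state, latent, logp = np.zeros(H), 0, 0.0
    for w in words:
        state, p_latent = step(state, latent)
        joint = p_latent * emit[:, w]      # score of each (latent, w) pair
        latent = int(np.argmax(joint))     # keep only the best latent word
        logp += np.log(joint[latent])
    return logp

print(sentence_logprob_greedy([3, 7, 1]))  # toy word ids; higher is better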
