Computer Speech and Language

A fast and memory-efficient N-gram language model lookup method for large vocabulary continuous speech recognition



Abstract

Recently, minimum perfect hashing (MPH)-based language model (LM) lookup methods have been proposed for fast access of N-gram LM scores in lexical-tree-based LVCSR (large vocabulary continuous speech recognition) decoding. Methods of node-based LM caching and LM context pre-computing (LMCP) have also been proposed to combine with MPH for further reduction of LM lookup time. Although these methods are effective, LM lookup still takes a large share of overall decoding time when trigram LM lookahead (LMLA) is used to achieve a lower word error rate than unigram or bigram LMLA. Besides computation time, memory cost is also an important performance aspect of decoding systems. Most speedup methods for LM lookup obtain higher speed at the cost of increased memory demand, which makes system performance unpredictable when running on computers with smaller memory capacities. In this paper, an order-preserving LM context pre-computing (OPCP) method is proposed to achieve both fast speed and small memory cost in LM lookup. By reducing hashing operations through order-preserving access of LM scores, OPCP cuts down LM lookup time effectively. At the same time, OPCP significantly reduces memory cost because of the reduced size of hashing keys and the need to store only the last-word index of each N-gram in the LM. Experimental results are reported on two LVCSR tasks (Wall Street Journal 20K and Switchboard 33K) with three sizes of trigram LMs (small, medium, large). In comparison with the above-mentioned existing methods, OPCP reduced LM lookup time from about 30-80% of total decoding time to about 8-14%, without any increase in word error rate. Except for the small LM, the total memory cost of OPCP for LM lookup and storage was about the same as or less than the original N-gram LM storage, and much less than that of the compared methods. The time and memory savings of OPCP in LM lookup became more pronounced as LM size increased.
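To make the order-preserving idea concrete, below is a minimal Python sketch of context pre-computation for a Katz-style backoff trigram LM in log space. The names (ToyTrigramLM, precompute_context, the dict-of-sorted-lists storage) are illustrative assumptions, not the paper's data structures: the point is that a single lookup locates the (u, v) context block, after which every successor score is filled in by a sequential walk over sorted last-word indices, so per-word hashing disappears and each stored trigram carries only its last-word index and score.

    # Illustrative sketch of order-preserving LM context pre-computing (OPCP);
    # a simplified stand-in, not the paper's actual implementation.
    class ToyTrigramLM:
        def __init__(self, unigram, bigram_rows, trigram_blocks,
                     bigram_backoff, trigram_backoff):
            self.unigram = unigram                  # unigram[w] = log P(w), dense over vocab
            self.bigram_rows = bigram_rows          # v -> [(w, log P(w|v))], w ascending
            self.trigram_blocks = trigram_blocks    # (u, v) -> [(w, log P(w|u,v))], w ascending
            self.bigram_backoff = bigram_backoff    # v -> log backoff weight
            self.trigram_backoff = trigram_backoff  # (u, v) -> log backoff weight

        def precompute_context(self, u, v):
            """Fill a dense score vector for history (u, v): one hash lookup
            per context instead of one per (u, v, w) triple."""
            bo_v = self.bigram_backoff.get(v, 0.0)
            bo_uv = self.trigram_backoff.get((u, v), 0.0)
            # Base case: back off through (u, v) and v down to unigrams.
            scores = [bo_uv + bo_v + p for p in self.unigram]
            # Overwrite words with explicit bigram scores (sequential, sorted walk).
            for w, p in self.bigram_rows.get(v, []):
                scores[w] = bo_uv + p
            # Overwrite words with explicit trigram scores (sequential, sorted walk).
            for w, p in self.trigram_blocks.get((u, v), []):
                scores[w] = p
            return scores

Once the vector is built, every LMLA query for that context is a plain array read, scores[w]; the decoder only re-runs precompute_context when the search enters a new LM context. Under these assumptions, this is the mechanism behind both savings the abstract reports: hashing is amortized over whole contexts, and trigram storage needs only last-word indices rather than full (u, v, w) keys.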
