Fast Language Model Look-ahead Algorithm Based on the Extended N-gram Model

Shan Yuxiang (单煜翔); Chen Xie (陈谐); Shi Yongzhe (史永哲); Liu Jia (刘加)

Abstract

For large-vocabulary continuous speech recognizers based on a dynamic decoding network, this paper proposes a fast language model (LM) look-ahead method that uses an extended N-gram model. The extended N-gram model unifies the representation and score computation of the LM and the LM look-ahead tree, which greatly simplifies the decoder implementation and significantly speeds up LM look-ahead, making higher-order LM look-ahead feasible. The extended N-gram model is generated off-line before decoding starts. The generation procedure exploits the sparseness of back-off N-gram models to compute look-ahead scores efficiently, and uses word-end node pushing and score quantization to reduce the model's storage size. Experiments show that, at the same character error rate, the proposed method makes the overall recognition system 5 to 9 times faster than the traditional dynamic-programming method, which computes LM look-ahead scores on-line during decoding, and that higher-order LM look-ahead yields faster decoding and better accuracy than lower-order look-ahead.
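The abstract does not spell out the algorithm, so the following is only a minimal Python sketch of the general idea behind LM look-ahead on a phone prefix tree with a back-off bigram model. The lexicon, the log-probabilities, and all function names are hypothetical and are not taken from the paper. `lookahead_full` applies the definition directly (each tree node gets the best LM score of any word reachable below it), while `lookahead_sparse` illustrates one common way to exploit back-off sparseness: start from the backed-off unigram look-ahead and revisit only the nodes on the paths of words that have an explicit bigram. The sparse variant matches the definition only under the assumption that every explicit bigram score is at least as large as its backed-off estimate, which holds for the toy numbers used here.

```python
from collections import defaultdict

# Hypothetical toy lexicon: word -> phone sequence.
LEXICON = {
    "cat": ("k", "ae", "t"),
    "cab": ("k", "ae", "b"),
    "can": ("k", "ae", "n"),
    "dog": ("d", "ao", "g"),
}

# Hypothetical back-off bigram model, log10 domain.
UNIGRAM = {"cat": -0.7, "cab": -1.2, "can": -0.5, "dog": -0.6}
BIGRAM = {("the", "cat"): -0.3, ("the", "dog"): -0.4}  # explicit entries only
BACKOFF = {"the": -0.5}                                # back-off weight per history


def lm_score(history, word):
    """Back-off bigram score: explicit entry if present, else bo(h) + p(w)."""
    if (history, word) in BIGRAM:
        return BIGRAM[(history, word)]
    return BACKOFF.get(history, 0.0) + UNIGRAM[word]


def words_below():
    """Map each prefix-tree node (a phone prefix) to the words reachable from it."""
    below = defaultdict(set)
    for word, phones in LEXICON.items():
        for i in range(len(phones) + 1):
            below[phones[:i]].add(word)
    return below


def unigram_lookahead():
    """History-independent look-ahead: best unigram score below each node."""
    return {node: max(UNIGRAM[w] for w in words)
            for node, words in words_below().items()}


def lookahead_full(history):
    """Definition: best LM score among all words below each node."""
    return {node: max(lm_score(history, w) for w in words)
            for node, words in words_below().items()}


def lookahead_sparse(history, unigram_la):
    """Sparse variant: back off everywhere, then fix up explicit-bigram paths."""
    bo = BACKOFF.get(history, 0.0)
    scores = {node: bo + s for node, s in unigram_la.items()}
    for (h, word), logp in BIGRAM.items():
        if h != history:
            continue
        phones = LEXICON[word]
        for i in range(len(phones) + 1):
            node = phones[:i]
            scores[node] = max(scores[node], logp)
    return scores


full = lookahead_full("the")
sparse = lookahead_sparse("the", unigram_lookahead())
```

In this toy setup only two of the four words have an explicit bigram after "the", so the sparse variant touches the seven nodes on those two paths instead of rescoring every word under every node. A precomputed table of this kind is what an off-line generation step could store, though the paper's actual data structure may differ.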

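Bibliographic details

Source
Acta Automatica Sinica (《自动化学报》) | 2012, No. 10 | pp. 1618-1626 | 9 pages
Authors
Shan Yuxiang (单煜翔); Chen Xie (陈谐); Shi Yongzhe (史永哲); Liu Jia (刘加)
Affiliation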

Tsinghua National Laboratory for Information Science and Technology, Department of Electronic Engineering, Tsinghua University, Beijing 100084 (all four authors)
Original format: PDF
Text language: Chinese
Keywords
speech recognition; language model look-ahead; N-gram model; decoding
