Efficient Likelihood Evaluation and Dynamic Gaussian Selection for HMM-Based Speech Recognition

Jun Cai; Ghazi Bouselmi; Yves Laprie; Jean-Paul Haton

Abstract

Large-vocabulary continuous speech recognition (LVCSR) systems are usually based on continuous-density HMMs, which are typically implemented using Gaussian mixture distributions. Such statistical modeling systems tend to run slower than real time, largely because of the heavy computational overhead of likelihood evaluation. The objective of our research is to investigate approximate methods that can substantially reduce the computational cost of likelihood evaluation without noticeably degrading recognition accuracy. In this paper, the most common techniques for speeding up likelihood computation are classified into three categories: machine optimization, model optimization, and algorithm optimization. Each category is surveyed and summarized by describing and analyzing the basic ideas of the corresponding techniques. The distribution of the numerical values of the Gaussian mixture components within a GMM is evaluated and analyzed to show that the computation of some Gaussians is unnecessary and can thus be eliminated. Two commonly used techniques for likelihood approximation, VQ-based Gaussian selection and partial distance elimination, are analyzed in detail. Based on these analyses, a fast likelihood computation approach called dynamic Gaussian selection (DGS) is proposed. DGS is a one-pass search technique that generates a dynamic shortlist of Gaussians for each state during likelihood computation. In principle, DGS is an extension of both partial distance elimination and best mixture prediction, and it does not require additional memory for storing Gaussian shortlists. The DGS algorithm has been implemented by modifying the likelihood computation procedure in the HTK 3.4 system. Experimental results on the TIMIT and WSJ0 corpora indicate that this approach can speed up likelihood computation significantly without introducing noticeable additional recognition errors.
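
To make the idea of partial distance elimination concrete, the following is a minimal Python sketch. It is not the authors' DGS algorithm and not HTK code. It assumes diagonal covariance matrices and the common max approximation, in which a state's log-likelihood is taken to be the log score of its best-scoring Gaussian. The function names (log_const, max_loglik_full, max_loglik_pde) are hypothetical and chosen only for this illustration. Because every per-dimension distance term is non-negative, a Gaussian's running score can only decrease, so accumulation can stop as soon as it falls to or below the best score found so far.

```python
import math


def log_const(w, var):
    """Log of the mixture weight plus the Gaussian normalization term."""
    return math.log(w) - 0.5 * sum(math.log(2.0 * math.pi * v) for v in var)


def max_loglik_full(x, weights, means, variances):
    """Reference version: evaluate every dimension of every Gaussian."""
    best, best_idx = -math.inf, -1
    for m, (w, mu, var) in enumerate(zip(weights, means, variances)):
        dist = sum((xd - md) ** 2 / vd for xd, md, vd in zip(x, mu, var))
        score = log_const(w, var) - 0.5 * dist
        if score > best:
            best, best_idx = score, m
    return best, best_idx


def max_loglik_pde(x, weights, means, variances):
    """Same result, but abandon a Gaussian once it can no longer win.

    Returns (best log score, index of best Gaussian, number of
    per-dimension distance terms actually computed).
    """
    best, best_idx, terms = -math.inf, -1, 0
    for m, (w, mu, var) in enumerate(zip(weights, means, variances)):
        const = log_const(w, var)
        dist = 0.0
        pruned = False
        for xd, md, vd in zip(x, mu, var):
            dist += (xd - md) ** 2 / vd
            terms += 1
            # The remaining terms can only lower the score further.
            if const - 0.5 * dist <= best:
                pruned = True
                break
        if not pruned:
            best, best_idx = const - 0.5 * dist, m
    return best, best_idx, terms
```

Under the max approximation this pruning is exact: max_loglik_pde returns the same score and index as max_loglik_full and only skips arithmetic for Gaussians that cannot be the maximum. The saving depends on evaluation order, since a strong candidate evaluated early lets later Gaussians be abandoned after fewer dimensions.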

Bibliographic details

Source: Computer Speech and Language, 2009, Issue 2, pp. 147-164 (18 pages)
Authors: Jun Cai; Ghazi Bouselmi; Yves Laprie; Jean-Paul Haton
Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords: Gaussian selection; fast likelihood computation; hidden Markov models; speech recognition
