Decision-tree probability modeling for HMM speech recognition.



Abstract

Hidden Markov models (HMMs) are widely regarded as the most robust technique for speaker-independent, connected-word recognition. The performance of a given HMM depends largely on the fidelity of the underlying acoustic model, which must estimate the probability of an acoustic observation given an HMM state. Conventional acoustic models include discrete models, where the feature space is discretized, typically by nearest-neighbor vector quantization, and continuous models, where the acoustic probabilities are estimated by Gaussian mixtures or neural-net techniques.

A new type of acoustic model is presented here, in which the underlying feature space is partitioned by a decision tree. Acoustic probabilities may then be estimated either by Baum-Welch methods or Viterbi training, exactly as in conventional discrete methods. Though the tree is used as a vector quantizer, it has significant advantages over conventional discrete models:

(1) Trees are extremely fast for classification. Not only is this practical for real-time systems, but the feature space may be quantized with much finer resolution, giving a high-resolution non-parametric model of the underlying pdfs with all the computational advantages of a discrete model.
(2) Trees handle both high-dimensional and discrete spaces gracefully. This allows context-dependent acoustic models by concatenating time-adjacent input vectors, and even a "recurrent tree" model by using the delayed output of the tree as an input feature.
(3) The relative importance of the individual feature dimensions may be discerned from the tree structure. This allows discrimination between feature types to find those that best represent the underlying speech information.
(4) Given a decision tree, new probability estimates may be easily found from Viterbi-labeled data in linear time, rather than by the iterative training required by other models. This allows a practical method of speaker adaptation by re-estimating probabilities as new data becomes available.

Preliminary results of a tree-based HMM system show recognition performance approaching that of continuous models at a computational cost comparable to that of discrete models. In addition, near-real-time talker-adaptation experiments show promising results.
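
As a rough illustration of advantage (4), the sketch below uses a decision tree as a vector quantizer and re-estimates discrete emission probabilities in a single counting pass over Viterbi-labeled frames. The abstract does not describe the dissertation's actual procedure, so everything here is an assumption made for illustration: the nested-tuple tree encoding, the names `TREE`, `quantize`, and `reestimate`, and the optional additive `smoothing` term are all hypothetical.

```python
from collections import defaultdict

# Hypothetical tree encoding (not from the dissertation):
# internal node = (feature_dim, threshold, left_subtree, right_subtree); leaf = int index.
TREE = (0, 0.5, 0, (1, 0.0, 1, 2))  # three leaves: 0, 1, 2


def quantize(tree, x):
    """Walk from root to leaf; the cost is one scalar comparison per tree level."""
    node = tree
    while not isinstance(node, int):
        dim, threshold, left, right = node
        node = left if x[dim] <= threshold else right
    return node


def reestimate(tree, labeled_frames, n_leaves, smoothing=1.0):
    """One pass over (state, feature_vector) pairs: count leaf hits per state, then normalize.

    The additive smoothing term is an assumption to avoid zero probabilities.
    """
    counts = defaultdict(lambda: [smoothing] * n_leaves)
    for state, x in labeled_frames:
        counts[state][quantize(tree, x)] += 1.0
    return {s: [c / sum(row) for c in row] for s, row in counts.items()}


# Toy Viterbi-labeled data: each frame is already aligned to an HMM state.
frames = [
    ("s1", (0.2, 0.9)),
    ("s1", (0.3, -1.0)),
    ("s1", (0.9, -0.5)),
    ("s2", (0.8, 0.4)),
]
probs = reestimate(TREE, frames, n_leaves=3, smoothing=0.0)
```

Because the tree is held fixed, adapting to a new speaker under this sketch only requires rerunning `reestimate` on newly labeled frames; no iterative optimization is involved, which is the property the abstract attributes to the approach.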


Bibliographic record

Author: Foote, Jonathan Trumbull
Author affiliation: Brown University
Degree-granting institution: Brown University
Subjects: Engineering, Electronics and Electrical; Computer Science
Degree: Ph.D.
Year: 1994
Pages: 102 p.
Original format: PDF
Language: English

