Decision-tree probability modeling for HMM speech recognition.

Abstract

Hidden Markov models (HMMs) are widely regarded as the most robust technique for speaker-independent, connected-word recognition. The performance of a given HMM depends largely on the fidelity of the underlying acoustic model, which must estimate the probability of an acoustic observation given an HMM state. Conventional acoustic models include discrete models, where the feature space is discretized, typically by nearest-neighbor vector quantization, and continuous models, where the acoustic probabilities are estimated by Gaussian mixtures or neural-net techniques.

A new type of acoustic model is presented here, where the underlying feature space is partitioned by a decision tree. Acoustic probabilities may then be estimated by either Baum-Welch or Viterbi training, exactly as in conventional discrete methods. Though the tree is used as a vector quantizer, it has significant advantages over conventional discrete models:

(1) Trees are extremely fast for classification. Not only is this practical for real-time systems, but the feature space may be quantized with much finer resolution, giving a high-resolution non-parametric model of the underlying pdfs with all the computational advantages of a discrete model.
(2) Trees handle both high-dimensional and discrete spaces gracefully. This allows context-dependent acoustic models by concatenating time-adjacent input vectors, and even a "recurrent tree" model by using the delayed output of the tree as an input feature.
(3) The relative importance of the individual feature dimensions may be discerned from the tree structure. This allows discrimination between feature types to find those that best represent the underlying speech information.
(4) Given a decision tree, new probability estimates may be easily found from Viterbi-labeled data in linear time, rather than by the iterative training required by other models. This allows a practical method of speaker adaptation by re-estimating probabilities as new data becomes available.

Preliminary results of a tree-based HMM system show recognition performance approaching that of continuous models at a computational cost comparable to that of discrete models. In addition, near-real-time talker-adaptation experiments show promising results.
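
The idea of using a tree as a vector quantizer and re-estimating discrete probabilities from Viterbi-labeled frames in a single pass can be illustrated with a minimal sketch. Everything in it is a hypothetical illustration rather than the thesis's implementation: the toy tree `TOY_TREE`, the function names `quantize` and `reestimate`, and the small count floor used for smoothing are all assumptions.

```python
from collections import defaultdict

# Hypothetical toy tree (an assumption, not the tree from the thesis).
# Internal node: (feature_index, threshold, left_subtree, right_subtree).
# Leaf: an integer codebook index.
TOY_TREE = (0, 0.5,
            (1, 0.0, 0, 1),
            (1, 1.0, 2, 3))
NUM_LEAVES = 4


def quantize(tree, x):
    """Map feature vector x to a leaf index with one threshold test per level."""
    node = tree
    while isinstance(node, tuple):
        feature, threshold, left, right = node
        node = left if x[feature] <= threshold else right
    return node


def reestimate(tree, frames, states, num_leaves, floor=1e-3):
    """Estimate P(leaf | state) from Viterbi-labeled frames in one pass.

    frames: feature vectors; states: the HMM state label for each frame.
    floor is an assumed smoothing constant that keeps unseen leaves nonzero.
    """
    counts = defaultdict(lambda: [floor] * num_leaves)
    for x, state in zip(frames, states):
        counts[state][quantize(tree, x)] += 1.0
    # Normalize each state's counts into a discrete distribution over leaves.
    return {s: [c / sum(row) for c in row] for s, row in counts.items()}
```

Because each frame is visited once and each lookup costs only the depth of the tree, the cost of the update grows linearly with the number of labeled frames. Under these assumptions, adaptation amounts to calling `reestimate` again as new labeled frames from a talker arrive.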

Bibliographic record

  • Author: Foote, Jonathan Trumbull
  • Author affiliation: Brown University
  • Degree-granting institution: Brown University
  • Subject: Engineering, Electronics and Electrical; Computer Science
  • Degree: Ph.D.
  • Year: 1994
  • Pages: 102
  • Original format: PDF
  • Language: English
