Source: European Conference on Speech Communication and Technology, v.3; September 3-7, 2001; Aalborg, Denmark

Discriminative Disfluency Modeling for Spontaneous Speech Recognition

Abstract

Most automatic speech recognizers (ASRs) have concentrated on read speech, which differs from speech that contains disfluencies. These ASRs cannot handle the high rate of disfluencies found in actual spontaneous speech or dialogue, such as filled pauses, repetitions, repairs, false starts, and silent pauses. In this paper, we focus on modeling the filled pauses "uh" and "um." Filled pauses are characterized by nasalization and lengthening, and the acoustic parameters for these characteristics are analyzed and adopted for disfluency modeling. A Gaussian mixture model (GMM), trained by a discriminative training algorithm that minimizes the recognition error, is proposed. A transition probability density function is defined from the GMM and used to weight the transition probability at the boundaries between fluency and disfluency models in the one-stage algorithm. Experimental results show that the proposed method yields an improvement rate of 27.3% for disfluency compared to the baseline system.
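
The abstract does not give the form of the transition probability density function, so the following is only a minimal sketch of the general idea: a GMM scores an acoustic feature vector, and that score adjusts the transition probability at a fluency/disfluency boundary. The diagonal covariance, the log-domain addition, the `scale` parameter, and the function names are all assumptions made for illustration, not the paper's method.

```python
import math


def diag_gmm_log_density(x, weights, means, variances):
    """Log density of feature vector x under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        # Log of the Gaussian normalizing constant for this component.
        log_norm = -0.5 * sum(math.log(2.0 * math.pi * v) for v in var)
        # Log of the exponential term (scaled squared distance to the mean).
        log_exp = -0.5 * sum((xi - mi) ** 2 / v for xi, mi, v in zip(x, mu, var))
        component_logs.append(math.log(w) + log_norm + log_exp)
    # Log-sum-exp over components for numerical stability.
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))


def weighted_log_transition(log_trans, x, gmm, scale=1.0):
    """Hypothetical weighting rule: add the scaled GMM log density
    to the log transition probability into the disfluency model."""
    weights, means, variances = gmm
    return log_trans + scale * diag_gmm_log_density(x, weights, means, variances)


# Toy two-component GMM over a 2-D feature (values are made up).
toy_gmm = (
    [0.6, 0.4],                 # mixture weights
    [[0.0, 0.0], [3.0, 3.0]],   # component means
    [[1.0, 1.0], [1.0, 1.0]],   # diagonal variances
)
base = math.log(0.1)  # assumed baseline transition probability
near = weighted_log_transition(base, [0.1, -0.1], toy_gmm)
far = weighted_log_transition(base, [10.0, 10.0], toy_gmm)
```

In this sketch a frame whose features lie close to the GMM (`near`) receives a higher weighted transition score than a distant one (`far`), so a decoder would enter the disfluency model more readily when the acoustics resemble a filled pause.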