Discriminative Disfluency Modeling for Spontaneous Speech Recognition

Abstract

Most automatic speech recognizers (ASRs) have concentrated on read speech, which differs from speech that contains disfluencies. These ASRs cannot handle speech with a high rate of disfluencies such as filled pauses, repetitions, repairs, false starts, and silent pauses, as found in actual spontaneous speech or dialogues. In this paper, we focus on modeling the filled pauses "uh" and "um." Filled pauses are characterized by nasality and lengthening, and the acoustic parameters for these characteristics are analyzed and adopted for disfluency modeling. A Gaussian mixture model (GMM), trained by a discriminative training algorithm that minimizes the recognition error, is proposed. A transition probability density function is defined from the GMM and used to weight the transition probability between the boundaries of fluency and disfluency models in the one-stage algorithm. Experimental results show that the proposed method yields an improvement rate of 27.3% for disfluency compared to the baseline system.
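
The abstract does not give the form of the transition probability density function, so the following is only a minimal sketch of the general idea: a GMM scores boundary features, and that score adjusts the transition from a fluency model into a disfluency model. The additive log-domain combination, the `scale` parameter, the two toy features (a nasality measure and a duration measure), and all numeric values are assumptions made for illustration, not the authors' method or data.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_density(x, weights, means, variances):
    """Log density of a GMM, computed with the log-sum-exp trick."""
    terms = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def weighted_transition_log_prob(log_trans, x, gmm, scale=1.0):
    """Assumed combination: base log transition probability plus a
    scaled GMM log density of the boundary features (hypothetical)."""
    weights, means, variances = gmm
    return log_trans + scale * gmm_log_density(x, weights, means, variances)


# Toy two-component GMM over made-up (nasality, duration) features.
toy_gmm = (
    [0.6, 0.4],                      # mixture weights
    [[0.80, 0.30], [0.60, 0.45]],    # component means
    [[0.02, 0.01], [0.03, 0.02]],    # diagonal variances
)

base = math.log(0.1)                 # made-up base transition probability
filled_pause_like = [0.75, 0.35]     # features close to the toy GMM
fluent_like = [0.10, 0.05]           # features far from the toy GMM

score_fp = weighted_transition_log_prob(base, filled_pause_like, toy_gmm)
score_fl = weighted_transition_log_prob(base, fluent_like, toy_gmm)
print(score_fp > score_fl)
```

In this sketch, features that resemble the toy filled-pause GMM raise the score of entering the disfluency model, while dissimilar features lower it. The discriminative training step mentioned in the abstract is not shown here.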

Bibliographic details

Source
European Conference on Speech Communication and Technology, v.3; 2001-09-03 to 2001-09-07; Aalborg, DK | 2001 | pp. 1955-1958 | 4 pages
Conference location: Aalborg (DK)
Authors
Chung-Hsien Wu; Gwo-Lang Yan
Author affiliation

Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan, R.O.C.;

Original format: PDF
Text language: English
Chinese Library Classification: Communication theory

Similar literature

1. Speech act modeling and verification of spontaneous speech with disfluency in a spoken dialogue system [J]. Chung-Hsien Wu, Gwo-Lang Yan. IEEE Transactions on Speech and Audio Processing. 2005, No. 3
2. Acoustic Feature Analysis and Discriminative Modeling of Filled Pauses for Spontaneous Speech Recognition [J]. Chung-Hsien Wu, Gwo-Lang Yan. Journal of VLSI Signal Processing. 2004, No. 2-3
3. Coping with disfluencies in spontaneous speech recognition: Acoustic detection and linguistic context manipulation [J]. Frederik Stouten, Jacques Duchateau, Jean-Pierre Martens. Speech Communication. 2006, No. 11
4. Discriminative Disfluency Modeling for Spontaneous Speech Recognition [C]. Chung-Hsien Wu, Gwo-Lang Yan. European Conference on Speech Communication and Technology. 2001
5. Discriminative training of language models for speech recognition [D]. Magdin, Vladimir. 2010
6. Words from spontaneous conversational speech can be recognized with human-like accuracy by an error-driven learning algorithm that discriminates between meanings straight from smart acoustic features bypassing the phoneme as recognition unit [O]. Denis Arnold, Fabian Tomaschek, Konstantin Sering
7. Speech Act Modeling and Verification of Spontaneous Speech with Disfluency in a Spoken Dialogue System [O]. Chung-Hsien Wu, Gwo-Lang Yan. 2015
