Journal: Procedia - Social and Behavioral Sciences

Automatic Recognition of Repetitions in Stuttered Speech: Using End-Point Detection and Dynamic Time Warping



Abstract

This study proposes a methodology for recognizing repetitions in stuttered speech. First, the recorded speech is parameterized by extracting six acoustic features: volume, zero crossing rate, spectral entropy, high-order derivatives, VH curve, and VE curve. Second, the speech is segmented by end-point detection (EPD), using a threshold on the VH curve. Third, the features of the segmented speech are processed by dynamic time warping (DTW) to identify similar patterns in neighbouring segments. The proposed method was verified using artificial stuttering samples of Mandarin Chinese. Ten male subjects were asked to imitate stuttering by speaking 39 predefined repetition settings. These settings were designed around three Mandarin Phonetic Symbols ([t], [k], [t‘]) and three kinds of repetition (part-word repetition, whole-word repetition, and multi-syllable word repetition). The experimental results indicate that EPD using the VH curve is capable of segmenting the repetitions in artificial stuttered speech. The DTW threshold showed no significant difference between the recognition of phonemes and the recognition of single-syllable words. DTW recognized repetitions with an accuracy of 83%. The proposed method combining EPD and DTW is therefore feasible for automatic recognition of repetitions in stuttered speech. However, more samples of real stuttered speech are still needed to verify and improve the proposed method.
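The three-step pipeline in the abstract (parameterize, segment by threshold, compare neighbouring segments with DTW) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not define the VH curve, so the sketch substitutes a plain per-frame volume curve (sum of absolute amplitudes) as a stand-in. The function names, the absolute-difference local cost, the length normalisation, and both thresholds are assumptions made for illustration only.

```python
from math import inf

def frame_volume(signal, frame_size):
    """Per-frame volume: sum of absolute sample values over non-overlapping frames.
    Assumption: used here as a stand-in for the paper's VH curve."""
    return [
        sum(abs(s) for s in signal[i:i + frame_size])
        for i in range(0, len(signal) - frame_size + 1, frame_size)
    ]

def detect_segments(curve, threshold):
    """Threshold-based end-point detection.
    Returns (start, end) frame indices of runs above threshold; end is exclusive."""
    segments, start = [], None
    for i, value in enumerate(curve):
        if value > threshold and start is None:
            start = i                      # segment begins
        elif value <= threshold and start is not None:
            segments.append((start, i))    # segment ends
            start = None
    if start is not None:                  # curve ended while still above threshold
        segments.append((start, len(curve)))
    return segments

def dtw_distance(a, b):
    """Classic DTW cost between two 1-D sequences (absolute-difference local cost)."""
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def find_repetitions(curve, segments, dtw_threshold):
    """Flag neighbouring segment pairs whose length-normalised DTW cost
    falls below dtw_threshold (a hypothetical tuning parameter)."""
    hits = []
    for (s1, e1), (s2, e2) in zip(segments, segments[1:]):
        a, b = curve[s1:e1], curve[s2:e2]
        score = dtw_distance(a, b) / (len(a) + len(b))
        if score < dtw_threshold:
            hits.append(((s1, e1), (s2, e2)))
    return hits

# Toy feature curve: two similar bursts followed by a different one.
curve = [0, 5, 6, 0, 0, 5, 6, 0, 0, 1, 9, 9, 9, 0]
segments = detect_segments(curve, threshold=0.5)
print(find_repetitions(curve, segments, dtw_threshold=0.5))
```

In this toy curve the first two bursts are flagged as a repetition, while the third burst is not matched to the second. A realistic version would run DTW over multi-dimensional feature vectors per frame rather than a single scalar curve, and would tune both thresholds on recorded speech.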
