Journal: Computer Speech and Language

Automatic detection of pharyngeal fricatives in cleft palate speech using acoustic features based on the vocal tract area spectrum



Abstract

The pharyngeal fricative is a typical compensatory articulation disorder in cleft palate speech. It is produced by retracting the root of the tongue toward the posterior pharyngeal wall, substituting for fricatives and affricates that are normally produced in the oral cavity. People who use the pharyngeal fricative have difficulties in daily communication. Automatic pharyngeal fricative detection can aid diagnosis by speech-language pathologists and clinicians. This work proposes a vocal tract area spectrum (VTAS) that represents the vocal tract as a model of time-varying cascaded pipes. Four acoustic features based on the VTAS are proposed to evaluate the differences between pharyngeal fricatives and normal speech: the centroid and spread (CS), the peak linear deviation (PLD), the relative-normal entropy (RNE), and the mean of the ratios' statistics (MRS). The CS feature evaluates the overall shape of the vocal tract to detect abnormal gestures or movements of the articulators during speech production. The PLD and RNE features focus on the variation and complexity of each vocal tube's area over the whole pronunciation process. The MRS feature describes the continuity of the vocal tract. To evaluate the effectiveness of these four features, pharyngeal fricative detection experiments are conducted on a pharyngeal fricative dataset. The dataset contains 1246 speech samples spoken by 50 cleft palate patients and 50 normal speakers, covering all types of initial consonants in which the pharyngeal fricative usually occurs. Detection accuracy using the individual CS, PLD, RNE and MRS features ranges from 80.66% to 90.21%. With the combined CS+PLD+RNE+MRS feature, an accuracy of 95.18% is achieved on the pharyngeal fricative dataset.
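
The abstract names a centroid and spread (CS) feature but does not give its formula. As a rough illustration only, the sketch below assumes the VTAS is available as a matrix of non-negative tube areas (one row per frame, one column per tube) and computes the generic area-weighted centroid and spread along the tube axis. The function name `centroid_and_spread`, the input layout, and these moment definitions are assumptions for illustration, not the paper's method.

```python
import numpy as np

def centroid_and_spread(areas):
    """Area-weighted centroid and spread along the tube index axis.

    areas: array-like of shape (n_frames, n_tubes) holding non-negative
           cross-sectional areas (hypothetical layout, not from the paper).
    Returns (centroid, spread), each of shape (n_frames,).
    """
    areas = np.asarray(areas, dtype=float)
    idx = np.arange(1, areas.shape[1] + 1)   # tube positions 1..N
    total = areas.sum(axis=1)                # total area per frame
    # First moment: where along the tract the area is concentrated.
    centroid = (areas * idx).sum(axis=1) / total
    # Square root of the second central moment: how widely the area is distributed.
    deviation_sq = (idx - centroid[:, None]) ** 2
    spread = np.sqrt((areas * deviation_sq).sum(axis=1) / total)
    return centroid, spread

# Two toy frames: a uniform tube and a tube with all area in the last section.
c, s = centroid_and_spread([[1, 1, 1, 1], [0, 0, 0, 1]])
print(c, s)
```

In this toy input the uniform frame has its centroid in the middle of the tube indices, while the second frame has its centroid at the last tube with zero spread. A frame whose areas sum to zero would cause a division by zero and needs separate handling.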
