Published in: Artificial intelligence: Theories, models and applications (conference proceedings)

Audio Features Selection for Automatic Height Estimation from Speech


Abstract

We investigate which subsets of speech features are suitable for automatically estimating a person's height from speech. The subsets were formed by ranking a large number of audio features by their relevance and individual quality. Specifically, based on the relevance ranking of the large set of openSMILE audio descriptors, we selected subsets of different sizes and evaluated them on the height estimation task. During speech parameterization, every input utterance is converted to a single feature vector of 6552 parameters. A subset of this feature vector is then fed to a support vector machine (SVM)-based regression model, which directly estimates the height of an unknown speaker. The experimental evaluation on the TIMIT database showed that: (ⅰ) the feature vector composed of the top-50 ranked parameters provides a good trade-off between computational demands and accuracy, and (ⅱ) the best accuracy, in terms of mean absolute error and root mean square error, is observed for the top-200 subset.
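
The pipeline described in the abstract (rank features, keep the top-k, fit an SVM regressor, report mean absolute error and root mean square error) can be illustrated with a minimal Python sketch using scikit-learn. Several parts are assumptions and not taken from the paper: the abstract does not name the ranking criterion, so a univariate F-score (`f_regression`) stands in for it; the SVR kernel and hyperparameters are arbitrary; and the data are synthetic random numbers, not openSMILE features extracted from TIMIT. The helper names `rank_features`, `evaluate_top_k`, and `make_synthetic` are hypothetical.

```python
import numpy as np
from sklearn.feature_selection import f_regression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def rank_features(X, y):
    """Return feature indices ordered from most to least relevant.

    Assumption: relevance is a univariate F-score; the paper's actual
    ranking method is not stated in the abstract.
    """
    scores, _ = f_regression(X, y)
    scores = np.nan_to_num(scores)  # guard against NaN from constant features
    return np.argsort(scores)[::-1]


def evaluate_top_k(X_train, y_train, X_test, y_test, ranking, k):
    """Fit an SVR on the k top-ranked features and return (MAE, RMSE)."""
    cols = ranking[:k]
    # Kernel and hyperparameters are illustrative, not the paper's settings.
    model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.5))
    model.fit(X_train[:, cols], y_train)
    pred = model.predict(X_test[:, cols])
    mae = mean_absolute_error(y_test, pred)
    rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
    return mae, rmse


def make_synthetic(n_samples=300, n_features=500, n_informative=20, seed=0):
    """Generate made-up data: only the first n_informative columns carry signal."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_features))
    w = np.zeros(n_features)
    w[:n_informative] = rng.normal(size=n_informative)
    y = 175.0 + X @ w + rng.normal(scale=2.0, size=n_samples)  # fake heights in cm
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    X_train, X_test = X[:200], X[200:]
    y_train, y_test = y[:200], y[200:]

    # Rank on the training split only, so the test split stays unseen.
    ranking = rank_features(X_train, y_train)

    for k in (10, 50, 200):
        mae, rmse = evaluate_top_k(X_train, y_train, X_test, y_test, ranking, k)
        print(f"top-{k}: MAE={mae:.2f} cm, RMSE={rmse:.2f} cm")
```

Sweeping `k` in this way mirrors the comparison of subset sizes reported in the abstract; the numbers printed for the synthetic data say nothing about the paper's results.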