A Brazilian Speech Database

Abstract

This work introduces the Brazilian Speech Database (BrSD), a novel, freely available dataset created to support the development of speech-based recognition tasks. As far as we know, this is the first Portuguese-language database with these characteristics to be created and made available to the research community. We also describe experiments conducted on BrSD that explore the classification tasks it supports, i.e., age group and gender classification. We use four well-known acoustic features extracted directly from the audio signal and one texture-based feature extracted from the spectrogram, a visual representation of the audio signal. We considered three classification scenarios: each feature individually, early fusion of the features, and late fusion of the features. Experiments were conducted using Support Vector Machine (SVM) and Multi-layer Perceptron (MLP) classifiers. The results showed that the SVM classifier achieved the best recognition rates in both the early and late fusion scenarios. The best recognition rates achieved were 91.25%, 88.75%, and 80.25% for the gender, age group, and age-gender classification tasks, respectively.
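To make the difference between the two fusion scenarios concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not say which features, fusion rule, or class encoding were used, so the feature values, the per-class scores, the sum rule for late fusion, and the function names `early_fusion` and `late_fusion_sum` are all illustrative assumptions.

```python
# Toy illustration of early vs. late fusion.
# All data, names, and the sum rule are hypothetical; the abstract does not
# specify how the fusion was actually performed.


def early_fusion(feature_vectors):
    """Concatenate per-feature vectors into one vector for a single classifier."""
    fused = []
    for vec in feature_vectors:
        fused.extend(vec)
    return fused


def late_fusion_sum(score_lists):
    """Add up per-class scores from several classifiers (sum rule) and
    return the index of the class with the highest total."""
    n_classes = len(score_lists[0])
    totals = [sum(scores[c] for scores in score_lists) for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: totals[c])


# Two hypothetical descriptors of the same utterance.
acoustic = [0.2, 0.7, 0.1]   # stands in for an acoustic feature vector
texture = [0.5, 0.4]         # stands in for a spectrogram texture vector

# Early fusion: one longer vector, which a single classifier would consume.
fused_vector = early_fusion([acoustic, texture])

# Late fusion: each classifier sees one feature type and outputs class scores
# (hypothetical two-class scores here); the scores are then combined.
scores_acoustic = [0.60, 0.40]
scores_texture = [0.30, 0.70]
predicted_class = late_fusion_sum([scores_acoustic, scores_texture])
```

In this sketch, early fusion changes the classifier's input (one concatenated vector), while late fusion leaves each classifier's input alone and combines their outputs. Other late fusion rules, such as product or majority vote, would fit the same structure.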
