Comparison of Several Acoustic Modeling Techniques for Speech Emotion Recognition

Imen Trabelsi; Med Salim Bouhlel


Abstract

Automatic Speech Emotion Recognition (SER) is an active research topic in Human-Computer Interaction (HCI) with a wide range of applications. The purpose of a speech emotion recognition system is to automatically classify a speaker's utterances into emotional states such as disgust, boredom, sadness, neutral, and happiness. The speech samples in this paper come from the Berlin emotional database. Mel Frequency Cepstrum Coefficients (MFCC), Linear Prediction Coefficients (LPC), Linear Prediction Cepstrum Coefficients (LPCC), Perceptual Linear Prediction (PLP), and Relative Spectral Perceptual Linear Prediction (Rasta-PLP) features are used to characterize the emotional utterances, using a combination of Gaussian Mixture Models (GMM) and Support Vector Machines (SVM) based on the Kullback-Leibler divergence kernel. The study compares the effect of feature type and feature dimension. The best results are obtained with 12-coefficient MFCC. With these features, a recognition rate of 84% is achieved, which is close to human performance on this database.
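To make the kernel idea in the abstract more concrete, the following is a minimal sketch of a Kullback-Leibler divergence kernel between two utterance models. It rests on several assumptions that do not come from the paper: each utterance is modelled by a single diagonal-covariance Gaussian (a one-component stand-in for a GMM), the kernel is the exponentiated symmetric KL divergence (one common formulation, not necessarily the authors'), and the function names, the `scale` parameter, and the toy feature frames are all hypothetical.

```python
import math


def fit_diag_gaussian(frames, var_floor=1e-6):
    """Estimate mean and per-dimension variance from a list of feature frames.

    Assumption: one diagonal Gaussian per utterance, not a full GMM.
    """
    n = len(frames)
    dim = len(frames[0])
    mu = [sum(f[d] for f in frames) / n for d in range(dim)]
    var = [
        max(sum((f[d] - mu[d]) ** 2 for f in frames) / n, var_floor)
        for d in range(dim)
    ]
    return mu, var


def kl_diag_gaussian(mu_p, var_p, mu_q, var_q):
    """Closed-form KL(p || q) for two diagonal-covariance Gaussians."""
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total


def kl_kernel(model_a, model_b, scale=1.0):
    """Similarity exp(-scale * (KL(a||b) + KL(b||a))); `scale` is hypothetical."""
    mu_a, var_a = model_a
    mu_b, var_b = model_b
    sym_kl = (kl_diag_gaussian(mu_a, var_a, mu_b, var_b)
              + kl_diag_gaussian(mu_b, var_b, mu_a, var_a))
    return math.exp(-scale * sym_kl)


# Toy 2-dimensional "feature frames" standing in for MFCC vectors (made up).
frames_a = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
frames_b = [[5.0, 1.0], [6.0, 0.0], [7.0, 2.0]]

model_a = fit_diag_gaussian(frames_a)
model_b = fit_diag_gaussian(frames_b)

k_aa = kl_kernel(model_a, model_a, scale=0.1)
k_ab = kl_kernel(model_a, model_b, scale=0.1)
k_ba = kl_kernel(model_b, model_a, scale=0.1)
print(k_aa, k_ab, k_ba)
```

In a pipeline of this kind, the kernel values for all utterance pairs would typically be collected into a matrix and handed to an SVM that accepts a precomputed kernel. The kernel equals 1 for identical models and decays toward 0 as the models diverge.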


Bibliographic details

Source: International Journal of Synthetic Emotions, 2016, Issue 1, pp. 58-68 (11 pages)
Authors: Imen Trabelsi; Med Salim Bouhlel
Affiliation: Sciences and Technologies of Image and Telecommunications (SETIT), University of Sfax, Tunisia (both authors)
Original format: PDF
Language: English
Keywords: Emotion; Formants; Kullback-Leibler; Linear Prediction Cepstrum Coefficients; Linear Prediction Coefficients; Mel Frequency Cepstrum Coefficients; Perceptual Linear Prediction; Relative Spectral Perceptual Linear Prediction


