Domain-Adversarial Autoencoder with Attention Based Feature Level Fusion for Speech Emotion Recognition



Abstract

Although speech emotion recognition (SER) has garnered considerable attention over the past two decades, the problem of insufficient training data remains unresolved. A potential solution is to pre-train a model and transfer knowledge from large amounts of audio data. However, the data used for pre-training and testing originate from different domains, which causes the latent representations to contain non-affective information. In this paper, we propose a domain-adversarial autoencoder to extract discriminative representations for SER. Through domain-adversarial learning, we can reduce the mismatch between domains while retaining discriminative information for emotion recognition. We also introduce multi-head attention to capture emotion information from different subspaces of input utterances. Experiments on IEMOCAP show that the proposed model outperforms state-of-the-art systems, improving the unweighted accuracy by 4.15%, which demonstrates its effectiveness.
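
The abstract does not say how the domain-adversarial objective is implemented. One common way to realize domain-adversarial learning is a gradient reversal layer, which passes features through unchanged in the forward pass and flips the sign of the domain classifier's gradient in the backward pass. The sketch below illustrates only that idea on plain Python lists; the function names, the weighting factor `lam`, and the example gradient values are all hypothetical and are not taken from the paper.

```python
def grl_forward(features):
    # Forward pass is the identity: the domain classifier sees the
    # encoder features unchanged.
    return list(features)


def grl_backward(upstream_grad, lam):
    # Backward pass flips the sign and scales by lam (a hypothetical
    # weighting factor), so the encoder is pushed to increase the
    # domain classifier's loss.
    return [-lam * g for g in upstream_grad]


def encoder_gradient(emotion_grad, domain_grad, lam):
    # Gradient reaching the encoder output: the emotion-task gradient
    # plus the reversed domain gradient.
    reversed_grad = grl_backward(domain_grad, lam)
    return [e + r for e, r in zip(emotion_grad, reversed_grad)]


# Illustrative (made-up) gradients with respect to a 2-dimensional feature.
emotion_grad = [0.5, -1.0]
domain_grad = [2.0, 4.0]
total = encoder_gradient(emotion_grad, domain_grad, lam=0.25)
print(total)  # [0.0, -2.0]
```

Under this assumption, the encoder follows the emotion gradient while moving against the domain gradient, which is one way to retain emotion-relevant information while reducing the mismatch between domains.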


Bibliographic details

Source
IEEE International Conference on Acoustics, Speech and Signal Processing | 2021 | pp. 6314-6318 | 5 pages
Authors
Yuan Gao; JiaXing Liu; Longbiao Wang; Jianwu Dang
Original format
PDF
Keywords
Emotion recognition; Conferences; Training data; Speech recognition; Signal processing; Feature extraction; Data models


Similar literature


1. Attention and Feature Selection for Automatic Speech Emotion Recognition Using Utterance and Syllable-Level Prosodic Features [J]. Ben Alex Starlet, Mary Leena, Babu Ben P. Circuits, Systems and Signal Processing. 2020, No. 11
2. A Two-Stage Attention Based Modality Fusion Framework for Multi-Modal Speech Emotion Recognition [J]. Dongni Hu, Chengxin Chen, Pengyuan Zhang. IEICE Transactions on Information and Systems. 2021, No. 8
3. A novel dual attention-based BLSTM with hybrid features in speech emotion recognition [J]. Qiupu Chen, Guimin Huang. Engineering Applications of Artificial Intelligence. 2021, No. Juna
4. Sparse Autoencoder with Attention Mechanism for Speech Emotion Recognition [C]. Ting-Wei Sun, An-Yeu Andy Wu. 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems. 2019
5. Novel Frameworks for Attribute-Based Speech Emotion Recognition using Time-continuous Traces and Sentence-Level Annotations [D]. Parthasarathy, Srinivas. 2019
6. Synthetic Aperture Radar Target Recognition with Feature Fusion Based on a Stacked Autoencoder [O]. Miao Kang, Kefeng Ji, Xiangguang Leng, 2017
7. Multimodal Approach of Speech Emotion Recognition Using Multi-Level Multi-Head Fusion Attention-Based Recurrent Neural Network [O]. Ngoc-Huynh Ho, Hyung-Jeong Yang, Soo-Hyung Kim, 2020
