Published in: IEEE International Conference on Acoustics, Speech and Signal Processing

Domain-Adversarial Autoencoder with Attention Based Feature Level Fusion for Speech Emotion Recognition

Abstract

Although speech emotion recognition (SER) has attracted considerable attention over the past two decades, the problem of insufficient training data remains unresolved. A potential solution is to pre-train a model and transfer knowledge from large amounts of audio data. However, the data used for pre-training and testing originate from different domains, which causes the latent representations to contain non-affective information. In this paper, we propose a domain-adversarial autoencoder to extract discriminative representations for SER. Through domain-adversarial learning, we reduce the mismatch between domains while retaining discriminative information for emotion recognition. We also introduce multi-head attention to capture emotion information from different subspaces of input utterances. Experiments on IEMOCAP show that the proposed model outperforms state-of-the-art systems, improving the unweighted accuracy by 4.15%.
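The abstract does not give the model's equations, so the following is only an illustrative sketch of the general idea behind multi-head attention over subspaces, not the authors' implementation. It assumes one common form of attention pooling: each frame vector is split into equal subspaces, each head scores the frames in its own subspace against a query vector, and the weighted averages are concatenated into one utterance-level vector. The function names and the `queries` parameter (standing in for learned weights) are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multi_head_attention_pool(frames, queries):
    """Pool T frame vectors (each of dimension D) into one D-dimensional vector.

    frames:  list of T lists of floats, one per frame of the utterance.
    queries: list of H query vectors, each of dimension D // H. These stand
             in for learned parameters and are an assumption of this sketch.
    """
    num_heads = len(queries)
    dim = len(frames[0])
    if dim % num_heads != 0:
        raise ValueError("feature dimension must be divisible by the number of heads")
    head_dim = dim // num_heads

    pooled = []
    for head, query in enumerate(queries):
        start, stop = head * head_dim, (head + 1) * head_dim
        # Restrict every frame to this head's subspace.
        sub_frames = [frame[start:stop] for frame in frames]
        # Scaled dot-product score of each frame against the head's query.
        scores = [
            sum(q * x for q, x in zip(query, sub)) / math.sqrt(head_dim)
            for sub in sub_frames
        ]
        weights = softmax(scores)
        # Attention-weighted average of the frames within this subspace.
        for d in range(head_dim):
            pooled.append(sum(w * sub[d] for w, sub in zip(weights, sub_frames)))
    return pooled


# Usage: two frames, four features, two heads. Zero queries give every
# frame the same weight, so the result is the plain mean over frames.
frames = [[1.0, 0.0, 2.0, 2.0], [3.0, 0.0, 4.0, 6.0]]
utterance_vector = multi_head_attention_pool(frames, [[0.0, 0.0], [0.0, 0.0]])
```

With non-zero queries, each head can emphasize different frames, which is how separate heads can attend to different parts of the same utterance.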
