Conference: Signal Processing and Communications Applications Conference
Spoken Language Identification with Deep Convolutional Neural Network and Data Augmentation

Abstract

This paper presents a spoken language detection system based on a deep convolutional neural network. The model is trained and tested on a speech dataset containing five languages. Speech signals are first converted into mel-spectrogram features, which are fed into the deep convolutional neural network. The flattened outputs of the convolutional network are then fed into a recurrent layer, and a dense layer with a softmax activation function serves as the output layer, predicting the probability of each language. This network achieves an F1-score of 0.89 on our test data. Applying a data augmentation method, SpecAugment, increased the F1-score to 0.94.
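
The abstract names SpecAugment but does not say which masking policy or parameters were used. As a rough illustration only, the sketch below applies the frequency-masking and time-masking steps of SpecAugment to a mel-spectrogram held in a NumPy array. The function name `spec_augment`, its parameters, the default mask widths, and the zero fill value are all assumptions made for this example, and the time-warping step is omitted.

```python
import numpy as np


def spec_augment(mel, num_freq_masks=1, num_time_masks=1,
                 max_freq_width=8, max_time_width=20,
                 fill_value=0.0, rng=None):
    """Return a copy of `mel` (n_mels x n_frames) with random bands masked.

    Hypothetical helper: only frequency and time masking are sketched,
    and every default value here is an illustrative assumption.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = mel.copy()  # leave the caller's array untouched
    n_mels, n_frames = out.shape

    # Frequency masking: overwrite a random band of consecutive mel channels.
    for _ in range(num_freq_masks):
        width = int(rng.integers(0, min(max_freq_width, n_mels) + 1))
        start = int(rng.integers(0, n_mels - width + 1))
        out[start:start + width, :] = fill_value

    # Time masking: overwrite a random span of consecutive frames.
    for _ in range(num_time_masks):
        width = int(rng.integers(0, min(max_time_width, n_frames) + 1))
        start = int(rng.integers(0, n_frames - width + 1))
        out[:, start:start + width] = fill_value

    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mel = rng.random((64, 300)) + 1.0  # stand-in for a real mel-spectrogram
    augmented = spec_augment(mel, rng=rng)
    print(augmented.shape)
```

In a training loop, a function like this would typically be applied to each training spectrogram on the fly, so the network sees a differently masked copy every epoch while the test data stays unaugmented.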