
Environmental sound classification with dilated convolutions


Abstract

In sound information retrieval (SIR), environmental sound classification (ESC) has emerged as a new problem that aims to classify environments by analysing complex features extracted from varied sound data. As one of the most efficient feature extraction methods, the convolutional neural network (CNN) has succeeded in speech and music signal processing; in particular, CNNs with pooling have worked effectively in classifying environmental and urban sound sources. However, pooling causes information loss. In this paper, a dilated CNN is introduced to the ESC problem and achieves better results than a CNN with max-pooling and other state-of-the-art approaches. We also explore how the dilation rate and the number of dilated convolution layers affect the experimental results, and find that expanding the number of covered frames or enlarging the dilation rate reduces accuracy. Possible explanations are that the sound signal has only short-term stability, so the size of the covered frame span strongly affects feature extraction from the signal, and that the dilated model has an inherent "gridding" defect. (C) 2018 Elsevier Ltd. All rights reserved.
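The relationship between dilation rate, number of layers, covered frames, and "gridding" can be illustrated with a minimal sketch. This is not the authors' network: the function names (`dilated_conv1d`, `receptive_field`, `touched_offsets`), the kernel size of 3, and the dilation schedules are assumptions chosen only for illustration, and the convolution is the plain stride-1, no-padding form.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """1-D dilated convolution, no padding, stride 1 (cross-correlation form)."""
    # Number of input frames spanned by one application of the kernel.
    span = (len(kernel) - 1) * dilation + 1
    out = []
    for start in range(len(signal) - span + 1):
        # Kernel taps are spaced `dilation` frames apart.
        out.append(sum(w * signal[start + k * dilation]
                       for k, w in enumerate(kernel)))
    return out


def receptive_field(kernel_size, dilations):
    """Frames covered by one output of a stack of stride-1 dilated layers."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


def touched_offsets(kernel_size, dilations):
    """Input offsets that actually contribute to one output of the stack."""
    offsets = {0}
    for d in dilations:
        offsets = {o + k * d for o in offsets for k in range(kernel_size)}
    return sorted(offsets)


if __name__ == "__main__":
    # Hypothetical schedules: increasing rates vs. a constant rate of 2.
    for rates in ([1, 2, 4], [2, 2, 2]):
        covered = receptive_field(3, rates)
        used = touched_offsets(3, rates)
        print(rates, "covers", covered, "frames, uses", len(used), "of them")
```

With a constant dilation rate of 2, the stack spans 13 frames but reads only the even offsets, which is the "gridding" pattern: neighbouring frames never contribute to the same output. Increasing either the rate or the number of layers widens the covered span, which is the quantity the abstract reports as reducing accuracy when it grows too large.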

Bibliographic details

  • Source
    Applied Acoustics | 2019, Issue 5 | pp. 123-132 | 10 pages
  • Author affiliations

    Shanxi Univ, Inst Big Data Sci & Ind, Taiyuan 030006, Shanxi, Peoples R China|Shanxi Univ, Sch Comp & Informat Technol, Taiyuan 030006, Shanxi, Peoples R China;

    Shanxi Univ, Inst Big Data Sci & Ind, Taiyuan 030006, Shanxi, Peoples R China|Shanxi Univ, Minist Educ, Key Lab Computat Intelligence & Chinese Informat, Taiyuan 030006, Shanxi, Peoples R China|Shanxi Univ, Sch Comp & Informat Technol, Taiyuan 030006, Shanxi, Peoples R China;

    Shanxi Univ, Inst Big Data Sci & Ind, Taiyuan 030006, Shanxi, Peoples R China|Shanxi Univ, Minist Educ, Key Lab Computat Intelligence & Chinese Informat, Taiyuan 030006, Shanxi, Peoples R China|Shanxi Univ, Sch Comp & Informat Technol, Taiyuan 030006, Shanxi, Peoples R China;

    Shanxi Univ, Inst Big Data Sci & Ind, Taiyuan 030006, Shanxi, Peoples R China|Shanxi Univ, Sch Comp & Informat Technol, Taiyuan 030006, Shanxi, Peoples R China;

    Shanxi Univ, Inst Big Data Sci & Ind, Taiyuan 030006, Shanxi, Peoples R China|Shanxi Univ, Minist Educ, Key Lab Computat Intelligence & Chinese Informat, Taiyuan 030006, Shanxi, Peoples R China|Shanxi Univ, Sch Comp & Informat Technol, Taiyuan 030006, Shanxi, Peoples R China;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Text language: eng
  • Chinese Library Classification
  • Keywords

    Sound information retrieval; Environmental sound classification; Dilated convolutions;

