Source: 電子情報通信学会技術研究報告. 音声. Speech (IEICE Technical Report: Speech)

A spectrogram-patch-input DNN model for detection and classification of acoustic events in speech overlapping scenarios

Abstract

This paper presents a method for acoustic event detection (AED) and classification that learns features from spectrogram patches (i.e., concatenations of a fixed number of consecutive spectrum frames) in an unsupervised manner and integrates effortlessly into the deep neural network (DNN) framework. Most AED use cases arise in scenarios where speech overlaps with acoustic events. Derived features (e.g., MFCCs, Mel filter banks) have traditionally characterized the speech spectrum well, but they are too dense and too centered on specific frequencies to suit non-speech tasks. Results show that the proposed spectrogram-patch model outperforms models based on derived features, as well as previous AED work.
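The abstract defines a spectrogram patch as a concatenation of consecutive spectrum frames. The following is a minimal NumPy sketch of that construction only, not the authors' implementation: the function name `spectrogram_patches`, the patch length `patch_frames`, and the hop size `hop` are illustrative assumptions, since the abstract gives no concrete values.

```python
import numpy as np


def spectrogram_patches(spec, patch_frames, hop=1):
    """Flatten runs of consecutive frames into patch vectors.

    spec: array of shape (n_frames, n_bins), one spectrum per row.
    patch_frames: frames per patch (assumed; not stated in the abstract).
    hop: frame shift between successive patches (assumed).
    Returns an array of shape (n_patches, patch_frames * n_bins).
    """
    n_frames, n_bins = spec.shape
    if n_frames < patch_frames:
        # Too few frames to form even one patch.
        return np.empty((0, patch_frames * n_bins), dtype=spec.dtype)
    starts = range(0, n_frames - patch_frames + 1, hop)
    # Each patch is patch_frames consecutive rows, concatenated end to end.
    return np.stack([spec[s:s + patch_frames].reshape(-1) for s in starts])


# Toy spectrogram: 5 frames, 3 frequency bins.
spec = np.arange(5 * 3, dtype=float).reshape(5, 3)
patches = spectrogram_patches(spec, patch_frames=2)
print(patches.shape)
```

Each row of `patches` could then serve as one input vector to a network; how the paper learns features from these patches without supervision is not described in the abstract and is not sketched here.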