Lightweight and Interpretable Neural Modeling of an Audio Distortion Effect Using Hyperconditioned Differentiable Biquads

Abstract

In this work, we propose using differentiable cascaded biquads to model an audio distortion effect. We extend trainable infinite impulse response (IIR) filters to the hyperconditioned case, in which a transformation is learned to directly map external parameters of the distortion effect to its internal filter and gain parameters, along with activations necessary to ensure filter stability. We propose a novel, efficient training scheme of IIR filters by means of a Fourier transform. Our models have significantly fewer parameters and reduced complexity relative to more traditional black-box neural audio effect modeling methodologies using finite impulse response filters. Our smallest, best-performing model adequately models a BOSS MT-2 pedal at 44.1 kHz, using a total of 40 biquads and only 210 parameters. Its model parameters are interpretable, can be related back to the original analog audio circuit, and can even be intuitively altered by machine learning non-specialists after model training. Quantitative and qualitative results illustrate the effectiveness of the proposed method.
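The filter-related ideas in the abstract (a learned map from external effect parameters to biquad coefficients, an activation that keeps each biquad stable, and evaluation of the IIR cascade through a Fourier transform) can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration, not the authors' implementation: the names `hypercondition`, `biquad_coefficients`, `cascade_response`, and `filter_via_fft` are hypothetical, the affine map stands in for whatever transformation the paper learns, the tanh-based constraint is one known way to keep poles inside the unit circle, and the frequency-sampling step only approximates true recursive filtering. The nonlinear (distortion) stages and the training loop are omitted.

```python
import numpy as np

def hypercondition(knobs, W, bias, num_biquads):
    """Hypothetical affine map from knob settings to raw per-biquad parameters."""
    raw = W @ knobs + bias              # shape (num_biquads * 5,)
    return raw.reshape(num_biquads, 5)  # columns: b0, b1, b2, x1, x2

def biquad_coefficients(raw):
    """Turn unconstrained values into biquad coefficients with stable poles.

    Assumed parameterization: a1 lies in (-2, 2) and a2 in (|a1| - 1, 1),
    which is the stability triangle of a second-order section.
    """
    b = raw[:, :3]                                   # numerator is unconstrained
    a1 = 2.0 * np.tanh(raw[:, 3])
    a2 = 0.5 * ((2.0 - np.abs(a1)) * np.tanh(raw[:, 4]) + np.abs(a1))
    a = np.stack([np.ones_like(a1), a1, a2], axis=-1)
    return b, a

def cascade_response(b, a, n_fft):
    """Sample the cascade's transfer function at the rfft bin frequencies.

    b, a: arrays of shape (num_biquads, 3); n_fft must be at least 3.
    """
    B = np.fft.rfft(b, n=n_fft, axis=-1)  # b0 + b1 e^{-jw} + b2 e^{-2jw}
    A = np.fft.rfft(a, n=n_fft, axis=-1)  # 1  + a1 e^{-jw} + a2 e^{-2jw}
    return np.prod(B / A, axis=0)         # series connection multiplies responses

def filter_via_fft(x, b, a):
    """Approximate IIR filtering by multiplying spectra (circular convolution)."""
    n_fft = len(x)
    H = cascade_response(b, a, n_fft)
    return np.fft.irfft(np.fft.rfft(x) * H, n=n_fft)
```

Because the spectral product corresponds to circular convolution, the result matches recursive filtering only when the impulse response has decayed well within the signal length; zero-padding the input reduces the wrap-around error. Every operation above is differentiable, so the same structure could be ported to an automatic-differentiation framework for gradient-based training.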

Bibliographic details

Source
IEEE International Conference on Acoustics, Speech and Signal Processing | 2021 | pp. 890-894 | 5 pages
Authors
Shahan Nercessian; Andy Sarroff; Kurt James Werner
Original format
PDF
Keywords
Training; Acoustic distortion; Fourier transforms; Finite impulse response filters; Filtering; IIR filters; Machine learning
