Singing voice separation with deep U-Net convolutional networks

Abstract

A system, method and computer product for estimating a component of a provided audio signal. The method comprises converting the provided audio signal to an image, processing the image with a neural network trained to estimate one of vocal content and instrumental content, and storing a spectral mask output from the neural network as a result of the image being processed by the neural network. The neural network is a U-Net. The method also comprises providing the spectral mask to a client media playback device, which applies the spectral mask to a spectrogram of the provided audio signal, to provide a masked spectrogram. The media playback device also transforms the masked spectrogram to an audio signal, and plays back that audio signal via an output user interface.
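The masking and resynthesis steps attributed above to the playback device can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the U-Net itself is not modeled, so the mask here is a hand-made placeholder standing in for the network output; the function names (`stft`, `apply_mask`, `istft`), the frame size of 8 and the Hann window with 50% overlap are all choices made for this sketch. Multiplying the complex spectrogram by a real-valued mask is assumed to be equivalent to masking the magnitudes and reusing the original phase.

```python
import cmath
import math

def hann(n_fft):
    # Periodic Hann window; with hop = n_fft // 2 the shifted windows sum to 1.
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_fft) for n in range(n_fft)]

def dft(frame):
    # Naive O(n^2) discrete Fourier transform (fine for tiny frames).
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(bins):
    # Inverse DFT; only the real part is kept because the output is audio.
    n = len(bins)
    return [(sum(bins[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]

def stft(signal, n_fft=8):
    # Spectrogram as a list of frames, each a list of complex frequency bins.
    hop = n_fft // 2
    window = hann(n_fft)
    return [dft([signal[start + i] * window[i] for i in range(n_fft)])
            for start in range(0, len(signal) - n_fft + 1, hop)]

def apply_mask(spectrogram, mask):
    # Element-wise product of the mask and the spectrogram.
    return [[m * s for m, s in zip(mask_row, spec_row)]
            for mask_row, spec_row in zip(mask, spectrogram)]

def istft(spectrogram, n_fft=8):
    # Overlap-add the inverse-transformed frames back into a waveform.
    hop = n_fft // 2
    out = [0.0] * (hop * (len(spectrogram) - 1) + n_fft)
    for j, bins in enumerate(spectrogram):
        for i, sample in enumerate(idft(bins)):
            out[j * hop + i] += sample
    return out

# Toy input: a short sine wave standing in for the provided audio signal.
signal = [math.sin(2 * math.pi * 0.125 * n) for n in range(32)]
spec = stft(signal)

# Placeholder mask of all ones (hypothetical; a trained network would supply
# values between 0 and 1 per time-frequency bin).
ones_mask = [[1.0] * len(row) for row in spec]
restored = istft(apply_mask(spec, ones_mask))
```

With the all-ones placeholder, the interior of `restored` matches `signal`, which is a quick check that the transform pair is consistent before a real mask is substituted. The first and last half-frames are attenuated because only one window covers them.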

Bibliographic details

Publication number: US10991385B2

Patent type:
Publication date: 2021-04-27

Original format: PDF
Applicant/Assignee: SPOTIFY AB

Application number: US201816165498
Inventors: ANDREAS SIMON THORE JANSSON; ANGUS WILLIAM SACKFIELD; CHING CHUAN SUNG; DAVID RUBINSTEIN

Filing date: 2018-10-19
Classification: G10L25/81; G10L21/06; G10L25/18; G10L15/16; G06N3/08
Country: US
Database entry time: 2022-08-24 18:23:17
