Canadian Acoustics
FORCED-ALIGNMENT OF THE SUNG ACOUSTIC SIGNAL USING DEEP NEURAL NETS

Abstract

Sung speech shows significant acoustic differences from spoken speech. One challenge in analyzing both spoken and sung speech is identifying the individual speech sounds. Forced-alignment systems such as P2FA [1] and the Montreal Forced Aligner [2] have been designed to accomplish this task for spoken speech; however, there is no such tool for sung speech. Previous work used a combination of hidden Markov models and convolutional neural networks on log-Mel filterbanks to segment phones in sung Mandarin opera [3]. We, in turn, trained a deep neural network to extract phone-level information from a sung acoustic signal. The primary objective was to create a model that can take a WAV file containing a target song as input and automatically produce time-aligned phonemic labels as output. To measure the performance of our model on this task, we primarily measured the accuracy of identifying the correct phone label at a given time-step. We also compared the accuracy of our model to that of state-of-the-art systems trained on spoken speech performing the same task on sung speech.
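Editorial note: the evaluation described above amounts to frame-level phone classification over log-Mel filterbank features. Below is a minimal sketch of that measurement, not taken from the paper; the feature settings (16 kHz audio, 40 Mel bands, 10 ms hop) and the names `model` and `ref_phones` are assumptions for illustration only.

```python
# Illustrative sketch (not the paper's code): log-Mel features from a WAV file
# and frame-level phone accuracy against a reference alignment.
import numpy as np
import librosa

def log_mel_frames(wav_path, sr=16000, n_mels=40, hop_length=160):
    """Return log-Mel filterbank features, one row per ~10 ms frame (assumed settings)."""
    y, _ = librosa.load(wav_path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels, hop_length=hop_length)
    return librosa.power_to_db(mel).T            # shape: (n_frames, n_mels)

def frame_accuracy(predicted_phones, reference_phones):
    """Fraction of frames whose predicted phone label matches the reference alignment."""
    n = min(len(predicted_phones), len(reference_phones))
    return float(np.mean(np.asarray(predicted_phones[:n]) == np.asarray(reference_phones[:n])))

# Usage with a hypothetical trained frame classifier:
# feats = log_mel_frames("target_song.wav")
# predicted = model.predict(feats)               # one phone label per frame
# print(frame_accuracy(predicted, ref_phones))   # accuracy at each time-step
```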
