Automatic Measurement of Voice Onset Time and Prevoicing using Recurrent Neural Networks

Abstract

Voice onset time (VOT) is defined as the time difference between the onset of the burst and the onset of voicing. When voicing begins before the burst, the stop is called prevoiced and the VOT is negative. When voicing begins after the burst, the VOT is positive. Most work on automatic measurement of VOT has focused on positive VOT, which is most evident in American English, but in many languages the VOT can be negative. We propose an algorithm that estimates whether the stop is prevoiced and measures negative or positive VOT accordingly. More specifically, the input to the algorithm is a speech segment of arbitrary length containing a single stop consonant, and the output is the time of the burst onset, the duration of the burst, and the time of the prevoicing onset together with a confidence. Manually labeled data is used to train a recurrent neural network that can model the dynamic temporal behavior of the input signal and that outputs the events' onsets and durations. Results suggest that the proposed algorithm is superior to the current state of the art both in VOT measurement and in prevoicing detection.
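
The sign convention in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `StopMeasurement` container, the `signed_vot` function, and the 0.5 confidence threshold are hypothetical names and values chosen for illustration. It also assumes that, for a stop that is not prevoiced, the reported burst duration runs up to the onset of voicing, which the abstract does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class StopMeasurement:
    """Hypothetical container for the outputs listed in the abstract (seconds)."""
    burst_onset: float
    burst_duration: float
    prevoicing_onset: float
    prevoicing_confidence: float


def signed_vot(m: StopMeasurement, threshold: float = 0.5) -> float:
    """Return a signed VOT in seconds.

    Assumptions (not taken from the paper): the stop is treated as prevoiced
    when the confidence reaches `threshold`, and for other stops the burst
    duration is taken to end at the onset of voicing.
    """
    if m.prevoicing_confidence >= threshold:
        # Voicing starts before the burst, so the difference is negative.
        return m.prevoicing_onset - m.burst_onset
    # Voicing starts after the burst, so the VOT is positive.
    return m.burst_duration


# Illustrative values only, not measurements from the paper.
prevoiced = StopMeasurement(0.120, 0.010, 0.040, 0.9)
aspirated = StopMeasurement(0.120, 0.055, 0.0, 0.1)
print(signed_vot(prevoiced), signed_vot(aspirated))
```

Keeping the decision (prevoiced or not) separate from the timing values mirrors the abstract's description of a prevoicing estimate followed by a positive or negative VOT measurement.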

Bibliographic details

Source: Annual Conference of the International Speech Communication Association | 2016 | pp. 3106-3887 | 4 pages
Authors: Yossi Adi; Joseph Keshet; Olga Dmitrieva; Matt Goldrick
Original format: PDF
Chinese Library Classification: TB95-53

Similar literature

1. Time-resolved reconstruction of flow field around a circular cylinder by recurrent neural networks based on non-time-resolved particle image velocimetry measurements [J]. Experiments in Fluids: Experimental Methods and Their Applications to Fluid Flow. 2020, No. 4
2. Automatic measurement of voice onset time using discriminative structured prediction [J]. Sonderegger M., Keshet J. The Journal of the Acoustical Society of America. 2012, No. 6
3. Nonlinear system identification for predictive control using continuous time recurrent neural networks and automatic differentiation [J]. Al Seyab RK, Cao Y. Journal of Process Control. 2008, No. 6
4. Automatic Measurement of Voice Onset Time and Prevoicing using Recurrent Neural Networks [C]. Yossi Adi, Joseph Keshet, Olga Dmitrieva, Matt Goldrick. Annual Conference of the International Speech Communication Association. 2016
5. Gene expression temporal patterns classification with hierarchical Bayesian neural networks and time lagged recurrent neural networks [D]. Liang, Yulan. 2003
6. A state space approach for piecewise-linear recurrent neural networks for identifying computational dynamics from neural measurements [O]. Daniel Durstewitz. 2018
7. A Low-Latency, Real-Time-Capable Singing Voice Detection Method with LSTM Recurrent Neural Networks [O]. Böck Sebastian, Lehner Bernhard, Widmer Gerhard. 2015
