IEEE Transactions on Audio, Speech and Language Processing

Tracking vocal tract resonances using a quantized nonlinear function embedded in a temporal constraint


Abstract

This paper presents a new technique for high-accuracy tracking of vocal-tract resonances (which coincide with formants for nonnasalized vowels) in natural speech. The technique is based on a discretized nonlinear prediction function, which is embedded in a temporal constraint on the quantized input values over adjacent time frames as the prior knowledge for their temporal behavior. The nonlinear prediction is constructed, based on its analytical form derived in detail in this paper, as a parameter-free, discrete mapping function that approximates the "forward" relationship from the resonance frequencies and bandwidths to the Linear Predictive Coding (LPC) cepstra of real speech. Discretization of the function permits the "inversion" of the function via a search operation. We further introduce the nonlinear-prediction residual, characterized by a multivariate Gaussian vector with trainable mean vectors and covariance matrices, to account for the errors due to the functional approximation. We develop and describe an expectation-maximization (EM)-based algorithm for training the parameters of the residual, and a dynamic programming-based algorithm for resonance tracking. Details of the algorithm implementation for computation speedup are provided. Experimental results are presented which demonstrate the effectiveness of our new paradigm for tracking vocal-tract resonances. In particular, we show the effectiveness of training the prediction-residual parameters in obtaining high-accuracy resonance estimates, especially during consonantal closure.
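The pipeline in the abstract (forward mapping, quantization, inversion by search, temporal constraint) can be illustrated with a minimal sketch. It rests on the standard all-pole result that resonances with frequencies f_k and bandwidths b_k at sampling rate f_s give LPC cepstra c_n = (2/n) Σ_k exp(-π n b_k / f_s) cos(2π n f_k / f_s). Everything else is an assumption made for illustration and is not taken from the paper: the function names, the coarse frequency grid, the fixed bandwidths, the squared-error costs, and the `smooth_weight` value. In particular, the trained Gaussian residual is replaced here by a zero-mean, identity-covariance stand-in, so no EM training is shown.

```python
import math
from itertools import product


def vtr_to_cepstrum(freqs_hz, bws_hz, fs_hz, order):
    """Forward mapping: resonance frequencies/bandwidths -> LPC cepstra c_1..c_order."""
    cep = []
    for n in range(1, order + 1):
        total = 0.0
        for f, b in zip(freqs_hz, bws_hz):
            decay = math.exp(-math.pi * n * b / fs_hz)
            total += decay * math.cos(2.0 * math.pi * n * f / fs_hz)
        cep.append(2.0 / n * total)
    return cep


def build_codebook(freq_grids, bws_hz, fs_hz, order):
    """Quantize the input space: one (freqs, cepstrum) entry per grid combination.

    Bandwidths are held fixed here (an illustrative simplification).
    """
    entries = []
    for freqs in product(*freq_grids):
        # Keep only ordered resonances (F1 < F2 < ...).
        if all(a < b for a, b in zip(freqs, freqs[1:])):
            entries.append((freqs, vtr_to_cepstrum(freqs, bws_hz, fs_hz, order)))
    return entries


def track(observed, codebook, smooth_weight=1e-8):
    """Dynamic-programming search over codebook entries, one per frame.

    Cost = squared cepstral prediction error per frame
         + smooth_weight * squared frequency jump between adjacent frames.
    Both cost terms and the default weight are hypothetical choices.
    """
    def emit(cep, obs):
        return sum((c - o) ** 2 for c, o in zip(cep, obs))

    def jump(fa, fb):
        return smooth_weight * sum((a - b) ** 2 for a, b in zip(fa, fb))

    states = range(len(codebook))
    cost = [emit(cep, observed[0]) for _, cep in codebook]
    back = []  # back[t][j] = best predecessor of state j at frame t + 1
    for obs in observed[1:]:
        new_cost, pointers = [], []
        for freqs_j, cep_j in codebook:
            best_i = min(states, key=lambda i: cost[i] + jump(codebook[i][0], freqs_j))
            new_cost.append(cost[best_i] + jump(codebook[best_i][0], freqs_j)
                            + emit(cep_j, obs))
            pointers.append(best_i)
        cost = new_cost
        back.append(pointers)

    # Backtrace from the cheapest final state.
    state = min(states, key=lambda i: cost[i])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [codebook[s][0] for s in path]


if __name__ == "__main__":
    fs, order, bws = 8000.0, 12, (80.0, 120.0)
    grids = [(300.0, 500.0, 700.0), (1000.0, 1500.0, 2000.0)]
    codebook = build_codebook(grids, bws, fs, order)
    truth = [(300.0, 1000.0), (500.0, 1500.0), (700.0, 2000.0)]
    frames = [vtr_to_cepstrum(f, bws, fs, order) for f in truth]
    print(track(frames, codebook))
```

The exhaustive search is quadratic in the codebook size per frame, which is workable only for a toy grid like this one; the abstract notes that the paper gives its own implementation details for computation speedup, and those are not reproduced here.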
