Statistical conversion of silent articulation into audible speech using full-covariance HMM

Thomas Hueber; Gerard Bailly

Abstract

This article investigates the use of statistical mapping techniques for the conversion of articulatory movements into audible speech with no restriction on the vocabulary, in the context of a silent speech interface driven by ultrasound and video imaging. As a baseline, we first evaluated the GMM-based mapping considering dynamic features, proposed by Toda et al. (2007) for voice conversion. Then, we proposed a 'phonetically-informed' version of this technique, based on full-covariance HMM. This approach aims (1) to model explicitly the articulatory timing for each phonetic class, and (2) to exploit linguistic knowledge to regularize the problem of silent speech conversion. Both techniques were compared on continuous speech, for two French speakers (one male, one female). For modal speech, the HMM-based technique showed a lower spectral distortion (objective evaluation). However, perceptual tests (transcription and XAB discrimination tests) showed better intelligibility for the GMM-based technique, probably related to its less fluctuating quality. For silent speech, a perceptual identification test revealed better segmental intelligibility on consonants for the HMM-based technique.


Bibliographic details

Source
Computer speech and language | 2016, Issue 3 | pp. 274-293 | 20 pages
Authors
Thomas Hueber; Gerard Bailly
Author affiliations

Univ. Grenoble Alpes, GIPSA-Lab, F-38000 Grenoble, France; CNRS, GIPSA-Lab, F-38000 Grenoble, France; GIPSA-lab, 11 rue des Mathematiques, 38402 Saint Martin d'Heres, France

Univ. Grenoble Alpes, GIPSA-Lab, F-38000 Grenoble, France; CNRS, GIPSA-Lab, F-38000 Grenoble, France

Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format: PDF
Language of text: English
Keywords
Silent speech interface; GMM; HMM; Ultrasound; Articulatory-acoustic mapping;


