
Speech driven photo realistic facial animation based on an articulatory DBN model and AAM features


Abstract

This paper presents a photo-realistic facial animation synthesis approach based on an audio-visual articulatory dynamic Bayesian network model (AF_AVDBN), in which the maximum asynchronies between articulatory features, such as the lips, tongue, and glottis/velum, can be controlled. Perceptual Linear Prediction (PLP) features from audio speech, together with Active Appearance Model (AAM) features from the face images of an audio-visual continuous speech database, are used to train the AF_AVDBN model parameters. Given an input audio speech signal, the trained model estimates the optimal AAM visual features via a maximum likelihood estimation (MLE) criterion, and these features are then used to construct the face images of the animation. In the experiments, facial animations are synthesized for 20 continuous audio speech sentences using the proposed AF_AVDBN model as well as two state-of-the-art methods: the audio-visual state-synchronous DBN model (SS_DBN), which implements a multi-stream Hidden Markov Model, and the state-asynchronous DBN model (SA_DBN). Objective evaluations on the learned AAM features show that considerably more accurate visual features are obtained with the AF_AVDBN model. Subjective evaluations show that the facial animations synthesized with AF_AVDBN surpass those of the state-based SA_DBN and SS_DBN models in overall naturalness and in how accurately the mouth movements match the speech content.
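To make the MLE-based visual feature estimation step concrete, the following is a highly simplified, hypothetical sketch rather than the paper's actual AF_AVDBN inference: it assumes each hidden state has a diagonal-Gaussian emission over audio (PLP-like) features and an associated mean AAM vector, and for each audio frame it selects the maximum-likelihood state and emits that state's AAM mean. All function and variable names here are illustrative inventions, and the toy data stands in for real PLP and AAM features.

```python
import numpy as np

def log_gauss(x, means, variances):
    """Diagonal-covariance Gaussian log-density of frame x under each state."""
    return -0.5 * np.sum(
        np.log(2 * np.pi * variances) + (x - means) ** 2 / variances, axis=-1
    )

def estimate_aam_features(plp_frames, state_means, state_vars, state_aam_means):
    """For each audio frame, pick the maximum-likelihood state and return
    that state's associated mean AAM feature vector (a toy stand-in for
    MLE decoding in a DBN)."""
    out = []
    for frame in plp_frames:
        log_likelihoods = log_gauss(frame, state_means, state_vars)  # (n_states,)
        best_state = int(np.argmax(log_likelihoods))
        out.append(state_aam_means[best_state])
    return np.stack(out)

# Toy setup: 2 states, 3-dim audio features, 4-dim AAM features.
state_means = np.array([[0.0, 0.0, 0.0], [5.0, 5.0, 5.0]])
state_vars = np.ones((2, 3))
state_aam_means = np.array([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]])
plp = np.array([[0.1, -0.2, 0.0], [5.1, 4.9, 5.2]])

aam_seq = estimate_aam_features(plp, state_means, state_vars, state_aam_means)
print(aam_seq.shape)  # one AAM vector per audio frame: (2, 4)
```

In the actual model, the state sequence would be decoded jointly over the whole utterance under the DBN's asynchrony constraints rather than frame by frame, but the per-frame lookup above captures the audio-to-AAM mapping idea.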

Bibliographic Information

  • Source
    Multimedia Tools and Applications | 2014, No. 1 | pp. 397-415 | 19 pages
  • Author Affiliations

    School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, People's Republic of China; Shaanxi Provincial Key Laboratory of Speech and Image Information Processing, Shaanxi, China;

    School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, People's Republic of China; Shaanxi Provincial Key Laboratory of Speech and Image Information Processing, Shaanxi, China;

    Electronics & Informatics Department (ETRO), Vrije Universiteit Brussel, Brussels, Belgium; Interuniversity Microelectronics Center (IMEC), Leuven, Belgium;

    School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, People's Republic of China; Shaanxi Provincial Key Laboratory of Speech and Image Information Processing, Shaanxi, China;

  • Indexing Information
  • Format: PDF
  • Language: English
  • CLC Classification
  • Keywords

    Facial animation; AF_AVDBN; Asynchrony; AAM;


