
Speech driven photo realistic facial animation based on an articulatory DBN model and AAM features


Abstract

This paper presents a photo-realistic facial animation synthesis approach based on an audio-visual articulatory dynamic Bayesian network model (AF_AVDBN), in which the maximum asynchronies between articulatory features, such as the lips, tongue, and glottis/velum, can be controlled. Perceptual Linear Prediction (PLP) features from audio speech, together with Active Appearance Model (AAM) features from the face images of an audio-visual continuous speech database, are used to train the AF_AVDBN model parameters. Given an input audio speech signal, the trained model estimates the optimal AAM visual features via a maximum likelihood estimation (MLE) criterion, and these features are then used to construct the face images of the animation. In the experiments, facial animations are synthesized for 20 continuous audio speech sentences using the proposed AF_AVDBN model as well as two state-of-the-art methods: the audio-visual state-synchronous DBN model (SS_DBN), which implements a multi-stream Hidden Markov Model, and the state-asynchronous DBN model (SA_DBN). Objective evaluations on the learned AAM features show that considerably more accurate visual features are obtained with the AF_AVDBN model. Subjective evaluations show that the facial animations synthesized with AF_AVDBN surpass those of the state-based SA_DBN and SS_DBN models in overall naturalness and in how accurately the mouth movements match the speech content.
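To make the MLE-based visual feature estimation step concrete, the following is a highly simplified, hypothetical sketch rather than the paper's actual AF_AVDBN inference: it assumes each hidden state has a diagonal-Gaussian emission over audio (PLP-like) features and an associated mean AAM vector, and for each audio frame it selects the maximum-likelihood state and emits that state's AAM mean. All function and variable names here are illustrative inventions, and the toy data stands in for real PLP and AAM features.

```python
import numpy as np

def log_gauss(x, means, variances):
    """Diagonal-covariance Gaussian log-density of frame x under each state."""
    return -0.5 * np.sum(
        np.log(2 * np.pi * variances) + (x - means) ** 2 / variances, axis=-1
    )

def estimate_aam_features(plp_frames, state_means, state_vars, state_aam_means):
    """For each audio frame, pick the maximum-likelihood state and return
    that state's associated mean AAM feature vector (a toy stand-in for
    MLE decoding in a DBN)."""
    out = []
    for frame in plp_frames:
        log_likelihoods = log_gauss(frame, state_means, state_vars)  # (n_states,)
        best_state = int(np.argmax(log_likelihoods))
        out.append(state_aam_means[best_state])
    return np.stack(out)

# Toy setup: 2 states, 3-dim audio features, 4-dim AAM features.
state_means = np.array([[0.0, 0.0, 0.0], [5.0, 5.0, 5.0]])
state_vars = np.ones((2, 3))
state_aam_means = np.array([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]])
plp = np.array([[0.1, -0.2, 0.0], [5.1, 4.9, 5.2]])

aam_seq = estimate_aam_features(plp, state_means, state_vars, state_aam_means)
print(aam_seq.shape)  # one AAM vector per audio frame: (2, 4)
```

In the actual model, the state sequence would be decoded jointly over the whole utterance under the DBN's asynchrony constraints rather than frame by frame, but the per-frame lookup above captures the audio-to-AAM mapping idea.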

Bibliographic Information

  • Source
    Multimedia Tools and Applications | 2014, No. 1 | pp. 397-415 | 19 pages
  • Author Affiliations

    School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, People's Republic of China; Shaanxi Provincial Key Laboratory of Speech and Image Information Processing, Shaanxi, China;

    School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, People's Republic of China; Shaanxi Provincial Key Laboratory of Speech and Image Information Processing, Shaanxi, China;

    Electronics & Informatics Department (ETRO), Vrije Universiteit Brussel, Brussels, Belgium; Interuniversity Microelectronics Center (IMEC), Leuven, Belgium;

    School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, People's Republic of China; Shaanxi Provincial Key Laboratory of Speech and Image Information Processing, Shaanxi, China;

  • Indexing Information
  • Format: PDF
  • Language: English
  • CLC Classification
  • Keywords

    Facial animation; AF_AVDBN; Asynchrony; AAM;


