A New Manifold Representation for Visual Speech Recognition


Abstract

In this paper, we propose a new manifold representation that can be applied to visual speech recognition. The real-time input video data is compressed using Principal Component Analysis (PCA), and the low-dimensional points calculated for each frame define the manifolds. Since the number of frames that form the video sequence depends on the word complexity, using these manifolds for visual speech classification requires re-sampling them into a fixed number of keypoints, which serve as input for classification. Two classification schemes are evaluated: the k-Nearest Neighbour (kNN) algorithm used in conjunction with a two-stage PCA, and a Hidden Markov Model (HMM) classifier. The classification results for a group of English words indicate that the proposed approach produces accurate classification results.
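The pipeline described in the abstract (per-frame PCA projection, re-sampling of the resulting trajectory to a fixed number of keypoints, then nearest-neighbour classification) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of NumPy, and the flattening of keypoints into a single feature vector are assumptions made for illustration. The keywords of the paper mention spline interpolation; the sketch substitutes plain linear interpolation to stay dependency-free, and it omits the second PCA stage and the HMM classifier.

```python
import numpy as np

def fit_pca(frames, n_components):
    """Fit PCA on flattened frames of shape (n_frames, n_pixels)."""
    mean = frames.mean(axis=0)
    # Rows of vt are the principal directions of the centred data.
    _, _, vt = np.linalg.svd(frames - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(frames, mean, components):
    """Map each frame to a low-dimensional point on the manifold."""
    return (frames - mean) @ components.T

def resample_manifold(points, n_keypoints):
    """Resample a trajectory of shape (n_frames, d) to n_keypoints rows.

    Assumption: linear interpolation over a normalised frame index,
    standing in for the spline interpolation named in the keywords.
    """
    src = np.linspace(0.0, 1.0, points.shape[0])
    dst = np.linspace(0.0, 1.0, n_keypoints)
    columns = [np.interp(dst, src, points[:, j]) for j in range(points.shape[1])]
    return np.column_stack(columns)

def video_to_feature(frames, mean, components, n_keypoints):
    """Turn a variable-length video into a fixed-length feature vector."""
    low_dim = project(frames, mean, components)
    return resample_manifold(low_dim, n_keypoints).ravel()

def knn_predict(train_x, train_y, query, k=1):
    """Return the majority label among the k nearest training vectors."""
    dists = np.linalg.norm(train_x - query, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]
```

With this layout, two utterances of different lengths both map to vectors of length n_keypoints times n_components, so they can be compared directly by a distance-based classifier such as kNN.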


Bibliographic record

Source
International Conference on Computer Analysis of Images and Patterns (CAIP 2007); 27-29 August 2007; Vienna (AT) | 2007 | pp. 374-382 | 9 pages
Conference location: Vienna (AT)
Authors
Dahai Yu; Ovidiu Ghita; Alistair Sutherland; Paul F. Whelan
Author affiliation

School of Computing Electronic Engineering, Vision Systems Group Dublin City University, Dublin 9, Ireland;

Original format: PDF
Text language: English
Chinese Library Classification: Information processing
Keywords
visual speech recognition; PCA manifolds; spline interpolation; k-nearest neighbour; hidden Markov model


