
Neural speech recognition: Continuous phoneme decoding using spatiotemporal representations of human cortical activity



Abstract

The superior temporal gyrus (STG) and neighboring brain regions play a key role in human language processing. Previous studies have attempted to reconstruct speech information from brain activity in the STG, but few of them incorporate the probabilistic framework and engineering methodology used in modern speech recognition systems. In this work, we describe the initial efforts toward the design of a neural speech recognition (NSR) system that performs continuous phoneme recognition on English stimuli with arbitrary vocabulary sizes using the high gamma band power of local field potentials in the STG and neighboring cortical areas obtained via electrocorticography. The system implements a Viterbi decoder that incorporates phoneme likelihood estimates from a linear discriminant analysis model and transition probabilities from an n-gram phonemic language model. Grid searches were used in an attempt to determine optimal parameterizations of the feature vectors and Viterbi decoder. The performance of the system was significantly improved by using spatiotemporal representations of the neural activity (as opposed to purely spatial representations) and by including language modeling and Viterbi decoding in the NSR system. These results emphasize the importance of modeling the temporal dynamics of neural responses when analyzing their variations with respect to varying stimuli and demonstrate that speech recognition techniques can be successfully leveraged when decoding speech from neural signals. Guided by the results detailed in this work, further development of the NSR system could have applications in the fields of automatic speech recognition and neural prosthetics.
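
The decoding pipeline summarized above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes per-frame phoneme log-likelihoods are already available (in the described system they would come from the linear discriminant analysis model), uses a bigram transition matrix as a stand-in for the n-gram phonemic language model, and introduces hypothetical names (`stack_time_window`, `viterbi_decode`, `n_lags`, `lm_weight`) purely for illustration. The first function shows one simple way to turn a purely spatial feature vector (one value per electrode) into a spatiotemporal one by concatenating consecutive time frames; the second is a standard Viterbi search in the log domain.

```python
import numpy as np


def stack_time_window(x, n_lags):
    """Build spatiotemporal features from spatial ones (illustrative only).

    x: array of shape (T, C), e.g. high gamma power for C electrodes at T frames.
    Returns shape (T - n_lags + 1, C * n_lags): each row concatenates
    n_lags consecutive frames. n_lags is a hypothetical parameter.
    """
    T, C = x.shape
    return np.stack([x[t:t + n_lags].ravel() for t in range(T - n_lags + 1)])


def viterbi_decode(log_lik, log_trans, log_prior, lm_weight=1.0):
    """Return the most likely phoneme index for each frame.

    log_lik:   (T, K) per-frame log-likelihoods (assumed given, e.g. from LDA).
    log_trans: (K, K) log P(next = j | previous = i), a bigram stand-in.
    log_prior: (K,)   log P(first phoneme).
    lm_weight: hypothetical scaling of the language model scores.
    """
    T, K = log_lik.shape
    score = np.empty((T, K))
    back = np.zeros((T, K), dtype=int)

    score[0] = lm_weight * log_prior + log_lik[0]
    for t in range(1, T):
        # cand[i, j]: best score ending in state i at t-1, then moving to j.
        cand = score[t - 1][:, None] + lm_weight * log_trans
        back[t] = np.argmax(cand, axis=0)
        score[t] = cand[back[t], np.arange(K)] + log_lik[t]

    # Trace the best path backwards from the best final state.
    path = np.empty(T, dtype=int)
    path[-1] = np.argmax(score[-1])
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path


if __name__ == "__main__":
    # Toy data: two phonemes, four frames; frame 2 is weakly misleading.
    lik = np.log(np.array([[0.8, 0.2], [0.4, 0.6], [0.8, 0.2], [0.8, 0.2]]))
    trans = np.log(np.array([[0.9, 0.1], [0.1, 0.9]]))
    prior = np.log(np.array([0.5, 0.5]))
    print("frame-wise argmax:", lik.argmax(axis=1))
    print("Viterbi path:     ", viterbi_decode(lik, trans, prior))
```

In the toy example the language model's preference for staying in the same phoneme overrides the single weakly misleading frame, which is the kind of benefit the abstract attributes to language modeling and Viterbi decoding. Setting `lm_weight=0` reduces the decoder to frame-by-frame classification.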
