Annual International Conference of the IEEE Engineering in Medicine and Biology Society

Reinforcement Learning based Decoding Using Internal Reward for Time Delayed Task in Brain Machine Interfaces



Abstract

Reinforcement learning (RL) algorithms interpret neural signals into movement intentions under the guidance of reward in brain-machine interfaces (BMIs). Current RL algorithms generally work for tasks with immediate reward delivery and are inefficient in delayed-reward tasks. The prefrontal cortex, including the medial prefrontal cortex (mPFC), has been demonstrated to assign credit to intermediate steps, which reinforces preceding actions more efficiently. In this paper, we propose to model the functionality of mPFC activity as an intermediate reward to train an RL-based decoder in a two-step movement task. A support vector machine (SVM) is adopted to detect from mPFC activity whether the subject expects a reward upon accomplishing a subtask. This discrimination result is then used to guide the training of the RL decoder at each step. Here, we apply Sarsa-style attention-gated reinforcement learning (SAGREL) as the decoder to interpret primary motor cortex (M1) activity into action states. We test on in vivo M1 and mPFC data collected from rats, which needed to first trigger the start of a trial and then press a lever for reward using M1 signals. SAGREL using intermediate rewards derived from mPFC activity achieves a prediction accuracy of 66.8% ± 2.0% (mean ± std), significantly better than the decoder trained only with the reward delivered at the end of the trial (45.9% ± 1.2%). This reveals the potential of modeling mPFC activity as an intermediate reward for delayed-reward tasks.
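To make the training scheme described in the abstract concrete, below is a minimal, self-contained sketch (not the authors' implementation or data): an SVM is fit on mPFC features to detect reward expectation after the first subtask, and its output serves as an intermediate reward for a Sarsa-style linear decoder reading M1 features in a two-step trial. All features, labels, dimensions, and learning rates here are synthetic and purely illustrative assumptions.

```python
# Schematic sketch of intermediate-reward-guided RL decoding (synthetic data).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_units, n_actions = 16, 2          # M1 feature size; actions: 0 = trigger start, 1 = press lever
alpha, epsilon = 0.02, 0.1          # learning rate and exploration rate (illustrative)

# --- Train an SVM to detect "reward expectation" from synthetic mPFC features.
mpfc_dim = 8
X_mpfc = rng.normal(size=(200, mpfc_dim))
y_expect = (X_mpfc[:, 0] > 0).astype(int)    # stand-in label: expects reward after a subtask
X_mpfc[y_expect == 1, 0] += 1.0              # make the two classes separable
reward_clf = SVC(kernel="linear").fit(X_mpfc, y_expect)

# --- Sarsa-style linear decoder Q(x, a) = w[a] . x over M1 features.
w = np.zeros((n_actions, n_units))

def choose_action(x):
    """Epsilon-greedy action selection over the linear Q-values."""
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))
    return int(np.argmax(w @ x))

def m1_features(intended_action):
    """Synthetic M1 activity whose mean depends on the intended action."""
    x = rng.normal(size=n_units)
    x[intended_action] += 2.0
    return x

for trial in range(500):
    # Two-step trial: first trigger the start (action 0), then press the lever (action 1).
    correct = True
    for step, intent in enumerate([0, 1]):
        x = m1_features(intent)
        a = choose_action(x)
        correct = correct and (a == intent)
        if step == 0:
            # Intermediate reward: SVM decision on synthetic mPFC activity, standing in
            # for detected reward expectation after the first subtask.
            mpfc = rng.normal(size=mpfc_dim)
            mpfc[0] += 1.5 if correct else -1.5
            r = float(reward_clf.predict(mpfc.reshape(1, -1))[0])
        else:
            # Final reward only if the whole trial was decoded correctly.
            r = 1.0 if correct else 0.0
        # One-step, Sarsa-like update toward the immediate reward
        # (no bootstrapping across steps in this simplified sketch).
        td_error = r - w[a] @ x
        w[a] += alpha * td_error * x

print("learned weight norms per action:", np.linalg.norm(w, axis=1))
```

In this sketch the intermediate reward lets the decoder receive feedback on the first step immediately, rather than waiting for the end-of-trial reward, which is the mechanism the paper attributes to mPFC credit assignment.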


