Journal: The European Journal of Neuroscience

Generalization of value in reinforcement learning by humans



Abstract

Research in decision-making has focused on the role of dopamine and its striatal targets in guiding choices via learned stimulus-reward or stimulus-response associations, behavior that is well described by reinforcement learning theories. However, basic reinforcement learning is relatively limited in scope and does not explain how learning about stimulus regularities or relations may guide decision-making. A candidate mechanism for this type of learning comes from the domain of memory, which has highlighted a role for the hippocampus in learning of stimulus-stimulus relations, typically dissociated from the role of the striatum in stimulus-response learning. Here, we used functional magnetic resonance imaging and computational model-based analyses to examine the joint contributions of these mechanisms to reinforcement learning. Humans performed a reinforcement learning task with added relational structure, modeled after tasks used to isolate hippocampal contributions to memory. On each trial participants chose one of four options, but the reward probabilities for pairs of options were correlated across trials. This (uninstructed) relationship between pairs of options potentially enabled an observer to learn about option values based on experience with the other options and to generalize across them. We observed blood oxygen level-dependent (BOLD) activity related to learning in the striatum and also in the hippocampus. By comparing a basic reinforcement learning model to one augmented to allow feedback to generalize between correlated options, we tested whether choice behavior and BOLD activity were influenced by the opportunity to generalize across correlated options. Although such generalization goes beyond standard computational accounts of reinforcement learning and striatal BOLD, both choices and striatal BOLD activity were better explained by the augmented model. 
Consistent with the hypothesized role for the hippocampus in this generalization, functional connectivity between the ventral striatum and hippocampus was modulated, across participants, by the ability of the augmented model to capture participants' choice. Our results thus point toward an interactive model in which striatal reinforcement learning systems may employ relational representations typically associated with the hippocampus.
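The contrast between the basic and the augmented model can be illustrated with a minimal sketch. This is not the authors' implementation: the delta-rule form, the learning rate `alpha`, the generalization weight `gen`, the pairing map `PARTNER`, and the assumption that paired options are positively correlated are all hypothetical choices made here for illustration. Setting `gen` to 0 recovers basic reinforcement learning, in which only the chosen option is updated.

```python
def update_values(values, chosen, reward, partner, alpha=0.3, gen=0.5):
    """Return updated option values after one trial.

    gen = 0 gives a basic delta-rule learner; gen > 0 lets the same
    feedback also update the option paired with the chosen one.
    All parameter names and defaults are illustrative assumptions.
    """
    new_values = list(values)
    # Standard delta-rule update for the chosen option.
    new_values[chosen] += alpha * (reward - values[chosen])
    # Assumed generalization step: feedback also moves the paired option
    # toward the observed reward, scaled down by gen.
    other = partner[chosen]
    new_values[other] += gen * alpha * (reward - values[other])
    return new_values


# Hypothetical pairing of the four options: 0 with 1, and 2 with 3.
PARTNER = {0: 1, 1: 0, 2: 3, 3: 2}

values = [0.5, 0.5, 0.5, 0.5]
basic = update_values(values, chosen=0, reward=1.0, partner=PARTNER, gen=0.0)
augmented = update_values(values, chosen=0, reward=1.0, partner=PARTNER, gen=0.5)
print(basic)
print(augmented)
```

In the basic call only option 0 changes, while in the augmented call option 1 also moves toward the reward by a smaller amount. Options 2 and 3 are untouched in both cases.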
