Published in: Soft computing: A fusion of foundations, methodologies and applications

Heuristic dynamic programming with internal goal representation



Abstract

In this paper, we analyze an internal goal structure based on heuristic dynamic programming, named GrHDP, to tackle the 2-D maze navigation problem. Classical reinforcement learning approaches have been applied to this problem in the literature, but they assign no intermediate reward before the agent reaches the final goal. We integrate one additional network, the goal network, into the traditional heuristic dynamic programming (HDP) design to provide an internal reward/goal representation. We present the architecture of the proposed approach, followed by a simulation of the 2-D maze navigation problem on a 10 × 10 grid. For a fair comparison, we use the same simulation environment settings for the traditional HDP approach. Simulation results show that the proposed GrHDP converges faster than traditional HDP in terms of the sum of squared errors and also reaches a lower final error.
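The abstract gives no equations or network details, so the following is only a rough sketch of how a goal network might sit between the action network and the critic. It shows a single forward pass on a 10 × 10 grid with a sparse external reward. The goal cell, the reward values, the layer sizes, the tanh units, the choice to feed the internal reward into the critic, and all names (`external_reward`, `forward`, `make_weights`) are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

GRID = 10        # 10 x 10 maze, as stated in the abstract
GOAL = (9, 9)    # hypothetical goal cell (assumption)


def external_reward(cell):
    """Sparse external reward: nothing until the goal is reached (assumed values)."""
    return 1.0 if cell == GOAL else 0.0


def init_layer_pair(n_in, n_hidden, n_out, rng):
    """Random weights for a one-hidden-layer network (sizes are assumptions)."""
    w1 = rng.uniform(-1.0, 1.0, (n_hidden, n_in))
    w2 = rng.uniform(-1.0, 1.0, (n_out, n_hidden))
    return w1, w2


def mlp(x, w1, w2):
    """One-hidden-layer network with tanh units; outputs lie in [-1, 1]."""
    return np.tanh(w2 @ np.tanh(w1 @ x))


def make_weights(seed=0, n_hidden=6):
    """Build untrained weights for the three networks of this sketch."""
    rng = np.random.default_rng(seed)
    return {
        "action": init_layer_pair(2, n_hidden, 1, rng),  # state -> action u
        "goal": init_layer_pair(3, n_hidden, 1, rng),    # (state, u) -> internal reward s
        "critic": init_layer_pair(4, n_hidden, 1, rng),  # (state, u, s) -> value estimate J
    }


def forward(cell, weights):
    """Single forward pass: action, internal reward, and value estimate."""
    state = np.array(cell, dtype=float) / (GRID - 1)  # normalized (row, col)
    u = mlp(state, *weights["action"])
    s = mlp(np.concatenate([state, u]), *weights["goal"])
    j = mlp(np.concatenate([state, u, s]), *weights["critic"])
    return u, s, j


weights = make_weights()
u, s, j = forward((3, 4), weights)
```

In this sketch the external reward is zero at every non-goal cell, whereas the goal network emits a signal `s` at every step. That is the sense in which an internal reward can stand in for the missing intermediate reward. The weights here are untrained, and the training rules for the three networks are not sketched because the abstract does not give them.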
