Conference: ASME Artificial Neural Networks in Engineering Conference

To explore continuous action space in actor/critic architecture - a preliminary study

Abstract

A popular recent topic in Reinforcement Learning (RL) research is how to extend RL to continuous state and action spaces, so that it can be applied to more real-world problems. The ASE/ACE, one of the best-known implementations of RL, shows promise as one solution. However, research effort is still needed to organize both the state space and the action space, so that an indefinite search is reduced to a definite one. Few RL systems explore the action space by combining effective action sequences that capture the regularity of the environment and can therefore be reused. In this paper, we propose a memory-based sequence structure and, correspondingly, add an adaptive action-sequence Critic to the Actor/Critic architecture to organize the action space. Experiments on a benchmark double integrator problem and on a more complicated 2-dimensional double integrator problem are carried out to show the effectiveness of the new architecture.
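For readers unfamiliar with the benchmark, the double integrator is a point mass whose acceleration is the control input, which makes both the state (position, velocity) and the action continuous. The sketch below is only an illustration of that setting and of replaying a stored action sequence; it is not the authors' architecture. The names `DoubleIntegrator` and `rollout`, the time step `dt`, the action bound `u_max`, and the quadratic cost are all assumptions, since the abstract does not give the experimental details.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class DoubleIntegrator:
    """Point mass on a line: position x, velocity v, acceleration = action u."""
    x: float = 1.0       # assumed initial position
    v: float = 0.0       # assumed initial velocity
    dt: float = 0.05     # assumed integration time step
    u_max: float = 1.0   # assumed bound on the continuous action

    def step(self, u: float) -> float:
        # Clip the continuous action to the assumed admissible range.
        u = max(-self.u_max, min(self.u_max, u))
        # Explicit Euler integration of x' = v, v' = u.
        self.x += self.v * self.dt
        self.v += u * self.dt
        # Assumed quadratic cost on position and control, returned as a reward.
        return -(self.x ** 2 + u ** 2) * self.dt


def rollout(env: DoubleIntegrator, actions: Iterable[float]) -> float:
    """Apply a fixed action sequence and return the accumulated reward."""
    return sum(env.step(u) for u in actions)


if __name__ == "__main__":
    env = DoubleIntegrator()
    # Replay a hand-written action sequence: brake, then coast.
    total = rollout(env, [-1.0] * 10 + [0.0] * 10)
    print(f"x={env.x:.3f} v={env.v:.3f} return={total:.4f}")
```

A stored sequence that earns a high return from one start state can be replayed from similar states, which is the kind of reuse the abstract alludes to; how sequences are selected and evaluated in the paper is not specified here.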