International Journal of Robotics & Automation

SOFT ACTOR-CRITIC REINFORCEMENT LEARNING FOR ROBOTIC MANIPULATOR WITH HINDSIGHT EXPERIENCE REPLAY


Abstract

The key challenges in applying reinforcement learning (RL) to complex robotic control tasks are its fragile convergence properties, very high sample complexity, and the need to shape a reward function. In this work, we present a soft actor-critic (SAC) style algorithm, an off-policy actor-critic RL method based on the maximum entropy RL framework, in which the actor's objective is to maximize the expected reward while also maximizing the entropy of its policy. This effectively improves the stability of the algorithm's performance and its robustness to modelling and estimation errors. Moreover, we combine SAC with a transition replay scheme called hindsight experience replay (HER) so as to make policy learning from sparse rewards more efficient. Finally, the effectiveness of the proposed method is verified on a range of manipulation tasks in a simulated environment.
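The maximum entropy objective referred to above can be written, in the standard SAC formulation (a sketch of the common notation, not necessarily the paper's exact statement), as

\[
J(\pi) = \sum_{t=0}^{T} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}\!\left[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right],
\]

where \(\rho_\pi\) is the state-action distribution induced by the policy \(\pi\), \(\mathcal{H}\) denotes entropy, and the temperature \(\alpha\) trades off the reward term against the entropy bonus.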
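The hindsight relabelling idea can be illustrated with a minimal Python sketch, assuming a goal-conditioned environment with a sparse reward; the buffer layout and the compute_reward helper below are illustrative assumptions, not the paper's implementation, and the sketch uses the simple "final"-goal relabelling strategy:

import random
from collections import deque

class HindsightReplayBuffer:
    """Replay buffer with hindsight goal relabelling (the 'final' strategy)."""

    def __init__(self, capacity, compute_reward):
        # compute_reward(achieved_goal, desired_goal) -> sparse reward,
        # e.g. 0.0 if the goals match within a tolerance, else -1.0.
        self.buffer = deque(maxlen=capacity)
        self.compute_reward = compute_reward

    def store_episode(self, episode):
        """episode: list of (obs, action, achieved_goal, desired_goal, next_obs)."""
        # The goal actually achieved at the end of the episode.
        final_goal = episode[-1][2]
        for obs, action, achieved, desired, next_obs in episode:
            # Original transition with the intended goal.
            reward = self.compute_reward(achieved, desired)
            self.buffer.append((obs, desired, action, reward, next_obs))
            # Hindsight transition: pretend the final achieved goal was the
            # target, turning a failed episode into a useful success signal.
            h_reward = self.compute_reward(achieved, final_goal)
            self.buffer.append((obs, final_goal, action, h_reward, next_obs))

    def sample(self, batch_size):
        return random.sample(self.buffer, batch_size)

Because SAC is off-policy, training can then sample minibatches from this buffer exactly as from an ordinary replay buffer, with the relabelled transitions supplying non-trivial reward signal even when the original goals were never reached.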
