Journal: Adaptive Behavior

Reinforcement Learning for RoboCup Soccer Keepaway



Abstract

RoboCup simulated soccer presents many challenges to reinforcement learning methods, including a large state space, hidden and uncertain state, multiple independent agents learning simultaneously, and long and variable delays in the effects of actions. We describe our application of episodic SMDP Sarsa(λ) with linear tile-coding function approximation and variable λ to learning higher-level decisions in a keepaway subtask of RoboCup soccer. In keepaway, one team, "the keepers," tries to keep control of the ball for as long as possible despite the efforts of "the takers." The keepers learn individually when to hold the ball and when to pass to a teammate. Our agents learned policies that significantly outperform a range of benchmark policies. We demonstrate the generality of our approach by applying it to a number of task variations including different field sizes and different numbers of players on each team.
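
To make the learning rule named in the abstract concrete, here is a minimal sketch of one Sarsa(λ) update with linear function approximation over binary tile features. It is an illustration under stated assumptions, not the authors' implementation: the one-dimensional tile coder, the replacing-trace variant, the fixed λ, and every name and parameter (`active_tiles`, `sarsa_lambda_step`, `alpha`, `gamma`, `lam`) are hypothetical. The abstract's variable λ and the keepaway state variables are not modeled here.

```python
def active_tiles(x, n_tilings=4, tiles_per_tiling=8, lo=0.0, hi=1.0):
    """Return one active feature index per tiling for a scalar x in [lo, hi].

    Assumption: a toy 1-D tile coder with evenly offset tilings.
    """
    width = (hi - lo) / tiles_per_tiling
    indices = []
    for t in range(n_tilings):
        offset = t * width / n_tilings            # shift each tiling by a fraction of a tile
        tile = int((x - lo + offset) / width)     # tile index inside this tiling
        tile = min(tile, tiles_per_tiling)        # shifted tilings use one extra tile
        indices.append(t * (tiles_per_tiling + 1) + tile)
    return indices


def sarsa_lambda_step(w, z, tiles, a, r, next_tiles, next_a,
                      alpha, gamma, lam, terminal=False):
    """Apply one Sarsa(lambda) update in place and return the TD error.

    w, z: per-action weight and eligibility-trace lists (same shape).
    r: reward accumulated between two decision points (in an SMDP setting
       the decision points may be several simulator steps apart).
    Assumption: replacing traces and a fixed lambda.
    """
    q = sum(w[a][i] for i in tiles)
    q_next = 0.0 if terminal else sum(w[next_a][i] for i in next_tiles)
    delta = r + gamma * q_next - q                # TD error

    # Decay all traces, then set the traces of the active features to 1.
    for act in range(len(z)):
        for i in range(len(z[act])):
            z[act][i] *= gamma * lam
    for i in tiles:
        z[a][i] = 1.0

    # Move every weight along its trace.
    for act in range(len(w)):
        for i in range(len(w[act])):
            w[act][i] += alpha * delta * z[act][i]
    return delta


# Usage: two actions (for example "hold" and "pass"), 4 tilings of 9 tiles each.
n_features = 4 * 9
w = [[0.0] * n_features for _ in range(2)]
z = [[0.0] * n_features for _ in range(2)]
tiles = active_tiles(0.5)
delta = sarsa_lambda_step(w, z, tiles, a=0, r=1.0,
                          next_tiles=active_tiles(0.6), next_a=1,
                          alpha=0.1 / 4, gamma=1.0, lam=0.9)
```

Dividing the step size by the number of tilings, as in the usage lines, is a common convention with tile coding and is likewise an assumption here.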
