RoboCup Goalkeeper Strategy Based on Sarsa(λ) Learning with a CMAC Network

Liu Yunlong; Ji Guoli


Abstract

RoboCup simulated soccer has a large, complex, and rapidly changing state space, and most of the variables available for decision making are continuous. As a result, an agent usually cannot determine the optimal action in the current state from the information at hand. Taking the goalkeeper as a case study, this paper first uses a CMAC neural network to generalize over the continuous state space, and then applies the Sarsa(λ) learning algorithm on the generalized states to obtain the goalkeeper's optimal policy. Goalkeepers using different strategies were evaluated and compared empirically on the RoboCup simulation platform. The results show that, after a period of learning, the goalkeeper using CMAC-based Sarsa(λ) learning holds out significantly longer and defends noticeably better than goalkeepers using the other algorithms, which supports the effectiveness of the proposed approach.
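The two steps named in the abstract, generalizing a continuous state with a CMAC and learning action values with Sarsa(λ), can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the state variables, action set, reward, or parameter values, so the class names `CMAC` and `SarsaLambda`, the bounds `low`/`high`, the number of tilings, and the values of `alpha`, `gamma`, `lam`, and `epsilon` are all assumptions chosen for the example. The CMAC is written as standard tile coding with uniformly offset tilings, and the traces are replacing traces, which is one common variant.

```python
import random


class CMAC:
    """Tile coding over a box [low, high]^d with n_tilings offset grids (assumed layout)."""

    def __init__(self, low, high, n_tilings=8, tiles_per_dim=8):
        self.low, self.high = low, high
        self.n_tilings = n_tilings
        self.tiles_per_dim = tiles_per_dim
        # One extra tile per dimension absorbs the shift of the offset tilings.
        self.tiles_per_tiling = (tiles_per_dim + 1) ** len(low)
        self.size = n_tilings * self.tiles_per_tiling

    def active_tiles(self, state):
        """Return one active tile index per tiling for a continuous state."""
        tiles = []
        for t in range(self.n_tilings):
            index = 0
            for x, lo, hi in zip(state, self.low, self.high):
                scaled = (x - lo) / (hi - lo) * self.tiles_per_dim
                coord = int(scaled + t / self.n_tilings)  # shift by a fraction of a tile
                coord = min(max(coord, 0), self.tiles_per_dim)  # clamp out-of-range states
                index = index * (self.tiles_per_dim + 1) + coord
            tiles.append(t * self.tiles_per_tiling + index)
        return tiles


class SarsaLambda:
    """Linear Sarsa(lambda) on CMAC features with replacing eligibility traces."""

    def __init__(self, cmac, n_actions, alpha=0.1, gamma=0.95, lam=0.9,
                 epsilon=0.1, seed=0):
        self.cmac, self.n_actions = cmac, n_actions
        self.alpha = alpha / cmac.n_tilings  # split the step size across tilings
        self.gamma, self.lam, self.epsilon = gamma, lam, epsilon
        self.w = [[0.0] * cmac.size for _ in range(n_actions)]
        self.traces = {}  # sparse map: (action, tile) -> eligibility
        self.rng = random.Random(seed)

    def q(self, state, action):
        """Q(s, a) is the sum of the weights of the active tiles."""
        return sum(self.w[action][i] for i in self.cmac.active_tiles(state))

    def select(self, state):
        """Epsilon-greedy action selection."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        values = [self.q(state, a) for a in range(self.n_actions)]
        return values.index(max(values))

    def start_episode(self):
        self.traces.clear()

    def update(self, s, a, reward, s_next, a_next, done):
        """One on-policy TD update using the action actually chosen next."""
        target = reward if done else reward + self.gamma * self.q(s_next, a_next)
        delta = target - self.q(s, a)
        for i in self.cmac.active_tiles(s):
            self.traces[(a, i)] = 1.0  # replacing traces
        for (act, i), e in list(self.traces.items()):
            self.w[act][i] += self.alpha * delta * e
            e *= self.gamma * self.lam  # decay the trace
            if e < 1e-4:
                del self.traces[(act, i)]
            else:
                self.traces[(act, i)] = e
```

In use, an episode would call `start_episode()`, pick the first action with `select(state)`, and then call `update(...)` after every simulator step with the next action already chosen. Because neighbouring states share most of their active tiles, a single update also shifts the value estimates of nearby states, which is the generalization effect the abstract attributes to the CMAC.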

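Bibliographic details

Source
Journal of Beijing University of Technology (《北京工业大学学报》) | 2012, No. 9 | pp. 1348-1352 | 5 pages
Authors
Liu Yunlong; Ji Guoli
Affiliation
Department of Automation, Xiamen University, Xiamen 361005 (both authors)
Original format: PDF
Language: Chinese
CLC classification: Automated reasoning, machine learning
Keywords
RoboCup simulated soccer; CMAC neural network; generalization; Sarsa(λ) learning algorithm; optimal policy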

