Published in: IEEE/ASME Transactions on Mechatronics

Sequential $Q$-Learning With Kalman Filtering for Multirobot Cooperative Transportation


Abstract

This paper presents a modified, distributed $Q$-learning algorithm, termed sequential $Q$-learning with Kalman filtering (SQKF), for decision making in multirobot cooperation. The SQKF algorithm has two defining characteristics. 1) Learning is arranged sequentially: the robots do not make decisions simultaneously but in a predefined order, which promotes cooperation among the robots and reduces their $Q$-learning spaces. 2) A robot does not update its $Q$-values with the observed global reward. Instead, it uses a Kalman filter to extract its own local reward from the global reward and updates its $Q$-table with that local reward. SQKF is intended to solve two problems in multirobot $Q$-learning: credit assignment and behavior conflicts. The detailed procedure of the algorithm is presented, and its application is illustrated on a prototype multirobot experimental system. The experimental results show that the algorithm performs better than the conventional single-agent $Q$-learning algorithm or the team $Q$-learning algorithm in the multirobot domain.
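
The two characteristics described in the abstract can be illustrated with a minimal Python sketch. The abstract gives neither the filter model nor the conflict rule, so everything here is an assumption made for illustration. The filter assumes the observed global reward is the robot's local reward for its current state plus a slowly drifting term contributed by the other robots. The sequential step assumes that an action already chosen by an earlier robot counts as a conflict. Exploration is omitted. The names `LocalRewardKalmanFilter`, `q_update`, and `select_actions_in_sequence` are hypothetical and do not come from the paper.

```python
import numpy as np


class LocalRewardKalmanFilter:
    """Estimate per-state local rewards from an observed global reward.

    Assumed model (not specified in the abstract):
        g_t = r(s_t) + b_t,    b_{t+1} = b_t + z_t,    z_t ~ N(0, drift_var)
    Hidden state x = [r(0), ..., r(n_states - 1), b].
    """

    def __init__(self, n_states, drift_var=0.1, obs_var=1e-3, init_var=1.0):
        self.n = n_states
        self.x = np.zeros(n_states + 1)            # reward estimates + drift term b
        self.P = np.eye(n_states + 1) * init_var   # estimate covariance
        self.Q = np.zeros((n_states + 1, n_states + 1))
        self.Q[-1, -1] = drift_var                 # only b is assumed to drift
        self.obs_var = obs_var

    def update(self, state, global_reward):
        # Predict: the transition is the identity, so only the covariance grows.
        P = self.P + self.Q
        # Observation row selects r(state) + b.
        H = np.zeros(self.n + 1)
        H[state] = 1.0
        H[-1] = 1.0
        S = H @ P @ H + self.obs_var               # innovation variance (scalar)
        K = P @ H / S                              # Kalman gain
        self.x = self.x + K * (global_reward - H @ self.x)
        self.P = P - np.outer(K, H) @ P
        return self.x[state]                       # local reward estimate for this state


def q_update(q_table, s, a, r_local, s_next, alpha=0.1, gamma=0.9):
    """Standard Q-learning update, driven by the filtered local reward."""
    td_target = r_local + gamma * np.max(q_table[s_next])
    q_table[s, a] += alpha * (td_target - q_table[s, a])


def select_actions_in_sequence(q_tables, states):
    """Robots choose greedily one after another in a fixed, predefined order.

    Hypothetical conflict rule: an action taken by an earlier robot is masked
    out for every later robot.
    """
    chosen = []
    for q, s in zip(q_tables, states):
        values = np.array(q[s], dtype=float)       # copy so the Q-table is untouched
        for taken in chosen:
            values[taken] = -np.inf
        chosen.append(int(np.argmax(values)))
    return chosen
```

Under the assumed model, the individual reward estimates are only identified up to a common offset shared with the drift term, so the differences between them are what carry information. Those differences are sufficient for greedy action selection.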
