IEEE International Conference on Acoustics, Speech and Signal Processing

Self-Inference Of Others’ Policies For Homogeneous Agents In Cooperative Multi-Agent Reinforcement Learning



Abstract

Multi-agent reinforcement learning (MARL) has been widely applied to cooperative tasks in which multiple agents are trained to collaboratively achieve a global goal. During the training stage of MARL, inferring the policies of other agents can improve coordination efficiency. However, most existing policy inference methods require each agent to model every other agent separately, so resource consumption grows quadratically with the number of agents. In addition, inferring an agent's policy solely from its observations and actions may cause agent modeling to fail. To address these issues, we propose to let each agent infer the others' policies with its own model, given that the agents are homogeneous. This self-inference approach significantly reduces computation and storage consumption while guaranteeing the quality of agent modeling. Experimental results demonstrate the effectiveness of the proposed approach.
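The core idea can be illustrated with a toy sketch (not the paper's implementation; the linear `Policy` class, dimensions, and function names below are hypothetical). Under separate opponent modeling, each of the N agents maintains N-1 models of its teammates, so the total model count grows quadratically; under self-inference, each homogeneous agent reuses its own policy to predict a teammate's action from that teammate's observation, keeping the count linear:

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, N_ACTIONS, N_AGENTS = 8, 4, 5  # toy sizes, chosen for illustration


class Policy:
    """A toy linear softmax policy; stands in for an agent's policy network."""

    def __init__(self):
        self.w = rng.normal(size=(OBS_DIM, N_ACTIONS))

    def action_probs(self, obs):
        logits = obs @ self.w
        e = np.exp(logits - logits.max())  # stable softmax
        return e / e.sum()


# Separate modeling: each agent keeps N-1 opponent models
# -> N * (N - 1) models in total (quadratic growth).
separate_models = {(i, j): Policy()
                   for i in range(N_AGENTS)
                   for j in range(N_AGENTS) if i != j}

# Self-inference: each homogeneous agent keeps only its own policy
# -> N models in total (linear growth).
own_policies = [Policy() for _ in range(N_AGENTS)]


def self_infer(i, obs_of_j):
    """Agent i predicts teammate j's action distribution by feeding
    j's observation through agent i's own policy."""
    return own_policies[i].action_probs(obs_of_j)


print(len(separate_models))  # 20 models under separate modeling (N=5)
print(len(own_policies))     # 5 models under self-inference
probs = self_infer(0, rng.normal(size=OBS_DIM))
```

The sketch only counts models; the paper's actual contribution is training with this shared-model inference, which the toy linear policy does not attempt to reproduce.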

