Published in: The International Journal of Robotics Research

Distributed multi-robot collision avoidance via deep reinforcement learning for navigation in complex scenarios

Abstract

Developing a safe and efficient collision-avoidance policy for multiple robots is challenging in decentralized scenarios where each robot generates its paths with limited observation of other robots' states and intentions. Prior distributed multi-robot collision-avoidance systems often require frequent inter-robot communication or agent-level features to plan a local collision-free action, which is not robust and is computationally prohibitive. In addition, the performance of these methods is not comparable with their centralized counterparts in practice. In this article, we present a decentralized sensor-level collision-avoidance policy for multi-robot systems, which shows promising results in practical applications. In particular, our policy directly maps raw sensor measurements to an agent's steering commands in terms of the movement velocity. As a first step toward reducing the performance gap between decentralized and centralized methods, we present a multi-scenario multi-stage training framework to learn an optimal policy. The policy is trained over a large number of robots in rich, complex environments simultaneously using a policy-gradient-based reinforcement-learning algorithm. The learning algorithm is also integrated into a hybrid control framework to further improve the policy's robustness and effectiveness. We validate the learned sensor-level collision-avoidance policy in a variety of simulated and real-world scenarios with thorough performance evaluations for large-scale multi-robot systems. The generalization of the learned policy is verified in a set of unseen scenarios including the navigation of a group of heterogeneous robots and a large-scale scenario with 100 robots.
Although the policy is trained using simulation data only, we have successfully deployed it on physical robots with shapes and dynamics characteristics that are different from the simulated agents, in order to demonstrate the controller's robustness against the simulation-to-real modeling error. Finally, we show that the collision-avoidance policy learned from multi-robot navigation tasks provides an excellent solution for safe and effective autonomous navigation for a single robot working in a dense real human crowd. Our learned policy enables a robot to make effective progress in a crowd without getting stuck. More importantly, the policy has been successfully deployed on different types of physical robot platforms without tedious parameter tuning.
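
To make the idea of a sensor-level policy concrete, the following is a minimal sketch of what "mapping raw sensor measurements to velocity commands" could look like. It is not the authors' network: the input layout (a range scan, a relative goal, and the current velocity), the layer sizes, the velocity limits, and the names `init_params` and `policy` are all assumptions made for illustration, and the weights are random rather than learned with a policy-gradient method.

```python
import numpy as np

# Hypothetical sizes and limits, chosen only for illustration.
N_BEAMS = 64   # range readings per scan (assumption)
HIDDEN = 32    # hidden units (assumption)
V_MAX = 1.0    # maximum forward speed in m/s (assumption)
W_MAX = 1.0    # maximum turn rate in rad/s (assumption)


def init_params(rng):
    """Randomly initialise a two-layer network; a trained policy would load learned weights."""
    in_dim = N_BEAMS + 2 + 2  # scan + relative goal (x, y) + current (v, w)
    return {
        "W1": rng.normal(0.0, 0.1, size=(in_dim, HIDDEN)),
        "b1": np.zeros(HIDDEN),
        "W2": rng.normal(0.0, 0.1, size=(HIDDEN, 2)),
        "b2": np.zeros(2),
    }


def policy(params, scan, goal, velocity):
    """Map one robot's local observation to a (linear, angular) velocity command."""
    obs = np.concatenate([scan, goal, velocity])
    h = np.tanh(obs @ params["W1"] + params["b1"])
    out = h @ params["W2"] + params["b2"]
    v = V_MAX / (1.0 + np.exp(-out[0]))  # sigmoid keeps forward speed in (0, V_MAX)
    w = W_MAX * np.tanh(out[1])          # tanh keeps turn rate in (-W_MAX, W_MAX)
    return np.array([v, w])


# Each robot would run the same function on its own observation, with no
# communication between robots.
rng = np.random.default_rng(0)
params = init_params(rng)
scan = rng.uniform(0.1, 4.0, size=N_BEAMS)
cmd = policy(params, scan, goal=np.array([2.0, 0.5]), velocity=np.array([0.0, 0.0]))
```

The point of the sketch is the interface rather than the network: the function sees only quantities a single robot can measure locally, which is what distinguishes a sensor-level policy from one that needs other agents' positions or velocities as input.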