Source: Journal of Aerospace Engineering
Coordinated Control Based on Reinforcement Learning for Dual-Arm Continuum Manipulators in Space Capture Missions

Abstract

The increasing number of defunct and fragmented spacecraft poses a growing hazard to existing on-orbit assets. Highly flexible, redundant continuum manipulators give dual-arm robotic systems clear advantages in active debris removal missions in space. Existing autonomous coordinated control approaches for dual-arm continuum manipulators require a real-time inverse kinematic solution and a safety mechanism for possible collisions, which are difficult to scale to space debris capture systems with high-speed maneuverability. In this paper, we account for collision avoidance and input saturation in proposing a multiagent reinforcement learning approach, named the multiagent twin delayed deep deterministic policy gradient (MATD3), to generate a real-time inverse kinematic solution for coordinated manipulators. During training, the MATD3 algorithm exhibits less overestimation than the multiagent deep deterministic policy gradient (MADDPG) algorithm. A feedback dynamics controller is then designed for the continuum manipulators. Guided by the policy networks, each agent can plan its joint trajectory online according to information about its collaborator and the target debris. During the capture operation, a competitive mechanism for anticollision is developed through suitable reward functions to keep the two arms at a safe distance. Simulation results show that the average accuracy of the proposed approach is 42% higher than that of MADDPG in inverse kinematic trajectory planning. The designed integrated tracking controller can effectively perform capture missions in the simulation environment. Multiagent reinforcement learning shows promise for future on-orbit servicing missions.
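
The abstract's claim that MATD3 overestimates less than MADDPG follows from the standard twin delayed (TD3) target, which takes the minimum of two critics. The sketch below illustrates that target for a single transition. It is not the authors' implementation: the function name `td3_target`, the placeholder critics `q1_next` and `q2_next`, the hyperparameter values, and the symmetric `action_limit` used to model input saturation are all assumptions made for illustration. In a multiagent setting the critics would typically take the joint actions of both arms, which is omitted here.

```python
import random


def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def td3_target(reward, done, next_action, q1_next, q2_next,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_limit=1.0, rng=None):
    """Clipped double-Q target for one transition (TD3-style sketch).

    q1_next and q2_next are hypothetical stand-ins for the twin target
    critics: callables that map an action (list of floats) to a scalar.
    """
    rng = rng or random.Random(0)
    # Target policy smoothing: add clipped Gaussian noise to each action
    # component, then saturate it to the assumed actuator limits.
    smoothed = [
        clip(a + clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip),
             -action_limit, action_limit)
        for a in next_action
    ]
    # Using the smaller of the two critic estimates counters the upward
    # bias that a single critic tends to accumulate.
    q_min = min(q1_next(smoothed), q2_next(smoothed))
    # Standard bootstrapped target; no bootstrap on terminal transitions.
    return reward + gamma * (1.0 - done) * q_min
```

With constant critics returning 2.0 and 1.0, a reward of 1.0 and `gamma=0.9`, the target uses the lower estimate: 1.0 + 0.9 × 1.0 = 1.9.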
