Sensors (Basel, Switzerland)

UAV Autonomous Tracking and Landing Based on Deep Reinforcement Learning Strategy


Abstract

Unmanned aerial vehicle (UAV) autonomous tracking and landing is playing an increasingly important role in military and civil applications. In particular, machine learning has been successfully introduced to robotics-related tasks. A novel UAV autonomous tracking and landing approach based on a deep reinforcement learning strategy is presented in this paper, with the aim of dealing with the UAV motion control problem in an unpredictable and harsh environment. Instead of building a prior model and inferring the landing actions based on heuristic rules, a model-free method based on a partially observable Markov decision process (POMDP) is proposed. In the POMDP model, the UAV automatically learns the landing maneuver through an end-to-end neural network, which combines the Deep Deterministic Policy Gradient (DDPG) algorithm and heuristic rules. A Modular Open Robots Simulation Engine (MORSE)-based reinforcement learning framework is designed and validated on a continuous UAV tracking and landing task over a randomly moving platform under high sensor noise and intermittent measurements. The simulation results show that when the moving platform follows different trajectories, the average landing success rate of the proposed algorithm is about 10% higher than that of the Proportional-Integral-Derivative (PID) method. As an indirect result, a state-of-the-art deep reinforcement learning-based UAV control method is validated, in which the UAV learns an optimal strategy for continuous autonomous landing and performs properly in a simulation environment.
机译:无人驾驶飞行器(UAV)自主追踪和着陆正在军事和民用应用中发挥着越来越重要的作用。特别是,已经成功地引入了与机器人有关的任务的机器学习。本文提出了一种基于深度加强学习策略的新型无人机自主追踪和着陆方法,目的是在不可预测和严酷的环境中处理无人机运动控制问题。提出了一种基于启发式规则来推断出基于启发式规则的预测动作,而不是基于局部观察到的马尔可夫决策过程(POMDP)的模型方法而不是构建先前的模型。在POMDP模型中,UAV通过端到端神经网络自动学习着陆机构,该网络结合了深度确定性政策梯度(DDPG)算法和启发式规则。基于模块化开放机器人仿真发动机(MORSE)的加固学习框架是在高传感器噪声和间歇测量的随机移动平台上的连续无人机跟踪和着陆任务设计和验证。仿真结果表明,当移动平台在不同的轨迹中移动时,所提出的算法的平均着陆成功率比比例积分(PID)方法高约10%。作为间接结果,验证了基于最先进的深增强学习的UAV控制方法,其中UAV可以学习连续自主着陆的最佳策略,并在模拟环境中正确执行。
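The abstract's core mechanism is a DDPG-style actor-critic update: a deterministic actor maps the observed state to a continuous control action, a critic estimates Q(s, a), and slowly updated target networks stabilize the bootstrapped targets. The sketch below is a minimal, hypothetical illustration of that update loop on a toy 1-D vertical-landing task; it is not the paper's implementation, and the linear "networks", dynamics, reward, and hyperparameters are all assumptions chosen for brevity.

```python
# Minimal DDPG-style update sketch (hypothetical; not the paper's code).
import numpy as np

rng = np.random.default_rng(0)

class LinearNet:
    """Tiny linear function approximator standing in for a deep network."""
    def __init__(self, in_dim, out_dim):
        self.W = rng.normal(scale=0.1, size=(out_dim, in_dim))
    def __call__(self, x):
        return self.W @ x
    def soft_update(self, other, tau):
        # Polyak averaging toward the learned network (target-network trick).
        self.W = (1 - tau) * self.W + tau * other.W

# State: [altitude error, vertical velocity]; action: thrust adjustment.
actor, critic = LinearNet(2, 1), LinearNet(3, 1)      # critic takes (s, a)
actor_t, critic_t = LinearNet(2, 1), LinearNet(3, 1)  # target networks
buffer = []  # replay buffer of (s, a, r, s') transitions

def step(s, a):
    """Toy landing dynamics: thrust changes velocity, velocity changes error."""
    err, vel = s
    vel = vel + 0.1 * a
    err = err + 0.1 * vel
    return np.array([err, vel]), -abs(err)  # reward: stay near the pad

gamma, tau, lr = 0.99, 0.01, 1e-3
s = np.array([1.0, 0.0])
for t in range(500):
    a = float(actor(s)[0]) + rng.normal(scale=0.1)    # exploration noise
    s2, r = step(s, a)
    buffer.append((s, a, r, s2))
    s = s2
    # Sample one stored transition and perform a DDPG update.
    s_b, a_b, r_b, s2_b = buffer[rng.integers(len(buffer))]
    a2 = actor_t(s2_b)                                # target action
    q_target = r_b + gamma * float(critic_t(np.append(s2_b, a2))[0])
    x = np.append(s_b, a_b)
    td = float(critic(x)[0]) - q_target
    critic.W -= lr * td * x                           # descend 0.5*td^2
    # Deterministic policy gradient: ascend dQ/da * dmu/dW.
    dq_da = critic.W[0, 2]
    actor.W += lr * dq_da * s_b
    actor_t.soft_update(actor, tau)
    critic_t.soft_update(critic, tau)
```

In the paper's full setting, the linear maps would be deep networks, the toy dynamics would be replaced by the MORSE simulator with sensor noise and intermittent measurements, and heuristic rules would shape the landing maneuver; the actor-critic and target-network structure above is the shared skeleton.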
