Journal: Neural Networks: The Official Journal of the International Neural Network Society

Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems.

Abstract

In this paper we present, in a continuous-time framework, an online approach to direct adaptive optimal control with infinite-horizon cost for nonlinear systems. The algorithm converges online to the optimal control solution without knowledge of the internal system dynamics, and closed-loop dynamic stability is guaranteed throughout. The algorithm is based on a reinforcement learning scheme, namely policy iteration, and uses neural networks in an Actor/Critic structure to parametrically represent the control policy and the performance of the control system. The two neural networks are trained to express the optimal controller and the optimal cost function, which describes the infinite-horizon control performance. Convergence of the algorithm is proven under the realistic assumption that the two neural networks do not provide perfect representations of the nonlinear control and cost functions. The result is a hybrid control structure that combines a continuous-time controller with a supervisory adaptation structure, which operates on data sampled from the plant and from the continuous-time performance dynamics. Such a control structure is unlike any standard form of controller previously seen in the literature. Simulation results obtained for two second-order nonlinear systems are provided.
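The abstract gives no equations, so the following is only a minimal illustrative sketch of continuous-time policy iteration driven by sampled data. It is not the paper's implementation. It assumes a scalar linear plant with quadratic cost in place of the nonlinear systems, and a single-weight critic V(x) = p*x^2 in place of the neural networks. It also assumes the critic is fitted from an interval form of the Bellman equation, p*(x(t)^2 - x(t+T)^2) = cost accumulated over [t, t+T]. All names and numeric values (`A`, `B`, `Q`, `R`, `simulate_interval`, `evaluate_policy`, `policy_iteration`, the gain `k0`) are hypothetical. The learner never reads the drift term `A`; only the simulator that stands in for the plant does. The input gain `B` is assumed known.

```python
import math

# Toy plant (illustrative assumptions, not taken from the paper):
#   dx/dt = A*x + B*u,   cost = integral of (Q*x^2 + R*u^2) dt
# A is used only by the simulator below, never by the learning steps.
A, B, Q, R = 1.0, 1.0, 1.0, 1.0


def simulate_interval(x, k, T, steps=200):
    """Stand-in for the real plant: apply u = -k*x for T seconds (RK4).

    Returns the state at the end of the interval and the cost
    accumulated over that interval.
    """
    def deriv(state):
        u = -k * state
        return A * state + B * u, Q * state**2 + R * u**2

    dt = T / steps
    cost = 0.0
    for _ in range(steps):
        dx1, dc1 = deriv(x)
        dx2, dc2 = deriv(x + 0.5 * dt * dx1)
        dx3, dc3 = deriv(x + 0.5 * dt * dx2)
        dx4, dc4 = deriv(x + dt * dx3)
        x += dt * (dx1 + 2 * dx2 + 2 * dx3 + dx4) / 6
        cost += dt * (dc1 + 2 * dc2 + 2 * dc3 + dc4) / 6
    return x, cost


def evaluate_policy(k, x0=1.0, T=0.1, n_samples=5):
    """Critic step: fit V(x) = p*x^2 from sampled data only.

    Each sample satisfies p*(x(t)^2 - x(t+T)^2) = cost over [t, t+T];
    p is the scalar least-squares solution over all samples.
    """
    num = den = 0.0
    x = x0
    for _ in range(n_samples):
        x_next, cost = simulate_interval(x, k, T)
        phi = x**2 - x_next**2
        num += phi * cost
        den += phi * phi
        x = x_next
    return num / den


def policy_iteration(k0=3.0, n_iters=8):
    """Alternate critic fitting and actor improvement.

    k0 must be a stabilizing gain (here k0 > A/B).
    """
    k = k0
    p = None
    for _ in range(n_iters):
        p = evaluate_policy(k)
        # Actor step: u = -(1/2) * (1/R) * B * dV/dx = -(B*p/R) * x.
        # This needs the input gain B but not the drift term A.
        k = B * p / R
    return p, k


if __name__ == "__main__":
    p, k = policy_iteration()
    print("critic weight p:", p, "feedback gain k:", k)
```

For this toy problem the optimal critic weight is the positive root of the scalar Riccati equation 2*A*p - (B^2/R)*p^2 + Q = 0, which is 1 + sqrt(2) for the values above, so the iterates can be checked against a known answer.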