Actor–Critic-Based Optimal Tracking for Partially Unknown Nonlinear Discrete-Time Systems

Kiumarsi B.; Lewis F.L.

Abstract

This paper presents a partially model-free adaptive optimal control solution to the deterministic nonlinear discrete-time (DT) tracking control problem in the presence of input constraints. The tracking error dynamics and reference trajectory dynamics are first combined to form an augmented system. Then, a new discounted performance function based on the augmented system is presented for the optimal nonlinear tracking problem. In contrast to the standard solution, which finds the feedforward and feedback terms of the control input separately, the minimization of the proposed discounted performance function gives both the feedback and feedforward parts of the control input simultaneously. This makes it possible to encode the input constraints into the optimization problem using a nonquadratic performance function. The DT tracking Bellman equation and the tracking Hamilton–Jacobi–Bellman (HJB) equation are derived. An actor–critic-based reinforcement learning algorithm is used to learn the solution to the tracking HJB equation online without requiring knowledge of the system drift dynamics. That is, two neural networks (NNs), namely, an actor NN and a critic NN, are tuned online and simultaneously to generate the optimal bounded control policy. A simulation example is given to show the effectiveness of the proposed method.
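
To make the idea of a discounted, nonquadratic performance function concrete, the following is a minimal sketch for a scalar input. It is not taken from the paper: the abstract does not give the penalty's form, so the inverse-hyperbolic-tangent integrand used here is an assumption borrowed from the common constrained-input formulation, and the names input_penalty, discounted_cost, u_max, q, r, and gamma are hypothetical.

```python
import math


def input_penalty(u, u_max, r=1.0):
    """Nonquadratic penalty W(u) = 2*r*u_max * integral_0^u atanh(v/u_max) dv.

    Assumed form (not stated in the abstract). The closed form below follows
    from integrating atanh by parts. W grows without bound as |u| -> u_max,
    which is how the input bound enters the cost.
    """
    if abs(u) >= u_max:
        raise ValueError("input must satisfy |u| < u_max")
    x = u / u_max
    return 2.0 * r * u_max * u * math.atanh(x) + r * u_max**2 * math.log1p(-x * x)


def discounted_cost(errors, inputs, u_max, q=1.0, r=1.0, gamma=0.9):
    """Discounted sum of a quadratic tracking-error term plus the input penalty.

    errors[i] and inputs[i] are the scalar tracking error and control at step i.
    """
    return sum(
        gamma**i * (q * e * e + input_penalty(u, u_max, r))
        for i, (e, u) in enumerate(zip(errors, inputs))
    )


if __name__ == "__main__":
    # Small inputs are penalized almost quadratically; inputs near the bound
    # are penalized much more heavily.
    for u in (0.01, 0.5, 0.99):
        print(u, input_penalty(u, u_max=1.0))
    print(discounted_cost([1.0, 0.5, 0.1], [0.8, 0.3, 0.0], u_max=1.0))
```

Under this assumed penalty, the cost behaves like r*u^2 for small inputs and rises steeply near the bound, so a minimizing policy stays strictly inside the admissible input range.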

Bibliographic details

Source: Neural Networks and Learning Systems, IEEE Transactions on, 2015, no. 1, pp. 140-151 (12 pages)
Authors: Kiumarsi B.; Lewis F.L.
Affiliation: UTA Research Institute, University of Texas at Arlington, Fort Worth, TX, USA
Original format: PDF
Language: English
Keywords: Equations; Feedforward neural networks; Heuristic algorithms; Mathematical model; Nonlinear dynamical systems; Standards; Trajectory; Actor–critic algorithm; discrete-time (DT) nonlinear optimal tracking; input constraints; neural network (NN); reinforcement learning (RL)
