GrDHP: A General Utility Function Representation for Dual Heuristic Dynamic Programming

Ni Z.; He H.; Zhao D.; Xu X.; Prokhorov D.V.

Abstract

A general utility function representation is proposed to provide the derivable and adjustable utility function required by the dual heuristic dynamic programming (DHP) design. Goal representation DHP (GrDHP) places a goal network on top of the traditional DHP design. This goal network provides a general mapping between the system states and the derivatives of the utility function, so the required derivatives can be obtained directly from the goal network. In addition, instead of using a fixed, predefined utility function as in the literature, we train the goal network online so that the derivatives of the utility function are adaptively tuned over time. We compare the control performance of the proposed GrDHP and the traditional DHP approaches under the same environment and parameter settings. Statistical simulation results and snapshots of the system variables are presented to demonstrate the improved learning and control performance. We also apply both approaches to a power system example to further demonstrate the control capabilities of the GrDHP approach.
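
To illustrate the role the abstract assigns to the goal network, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical linear goal network (`goal_network`, weights `W_g`), a known closed-loop state Jacobian `A`, and the standard DHP costate target in which the critic output at the next step is propagated back through the system dynamics. All names and numeric values are illustrative assumptions; the abstract does not specify the network structures or the training rule.

```python
import numpy as np


def goal_network(W_g, x):
    """Hypothetical linear goal network: maps state x to an estimate of dU/dx."""
    return W_g @ x


def dhp_critic_target(dU_dx, A, lam_next, gamma=0.95):
    """DHP costate target: dU/dx(t) + gamma * A^T * lambda(t+1).

    A is the closed-loop state Jacobian dx(t+1)/dx(t), assumed known here.
    lam_next is the critic output at t+1, i.e. an estimate of dJ/dx(t+1).
    """
    return dU_dx + gamma * A.T @ lam_next


# Illustrative values only (assumptions, not taken from the paper).
x = np.array([1.0, -1.0])
W_g = 2.0 * np.eye(2)   # matches dU/dx for U = x^T x; GrDHP would adapt this online
A = 0.5 * np.eye(2)
lam_next = np.array([0.2, 0.4])

target = dhp_critic_target(goal_network(W_g, x), A, lam_next, gamma=0.9)
print(target)
```

In a traditional DHP design, `dU_dx` would come from differentiating a fixed, hand-chosen utility function; here it is whatever the goal network currently outputs, which is what allows it to be tuned during learning.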

Bibliographic details

Source
Neural Networks and Learning Systems, IEEE Transactions on | 2015, Issue 3 | pp. 614-627 | 14 pages
Authors
Ni Z.; He H.; Zhao D.; Xu X.; Prokhorov D.V.
Author affiliation
Department of Electrical, Computer and Biomedical Engineering, University of Rhode Island, Kingston, RI, USA
Original format: PDF
Language: English
Keywords
Approximation methods; Density estimation robust algorithm; Dynamic programming; Learning systems; Neural networks; Nickel; Power system dynamics; Adaptive control; adaptive dynamic programming (ADP); dual heuristic dynamic programming (DHP); general utility function; goal representation; reinforcement learning (RL)
