SYSTEM AND METHOD FOR ROBUST OPTIMIZATION FOR TRAJECTORY-CENTRIC MODEL-BASED REINFORCEMENT LEARNING
Abstract
A controller for optimizing a local control policy of a system for trajectory-centric reinforcement learning is provided. The controller performs the steps of: learning a stochastic predictive model of the system from a set of data collected during trial-and-error experiments performed using an initial random control policy; estimating the mean prediction and its associated uncertainty; determining, using the learned stochastic system model, a local set of deviations of the system from a nominal system state upon use of a control input at a current time-step; determining a system state with a worst-case deviation; determining a gradient of the robustness constraint; formulating and solving a robust policy optimization problem using non-linear programming to obtain the system trajectory and a stabilizing local policy simultaneously; updating the control data according to the solved optimization problem; and outputting the updated control data via an interface.
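The first steps of the abstract can be illustrated with a toy sketch. This is not the patented method. It assumes a hypothetical scalar linear system, uses an ordinary least-squares fit with the residual standard deviation standing in for the stochastic model's uncertainty, and replaces the non-linear programming step with a plain grid search over a feedback gain. It does not compute the gradient of the robustness constraint or optimize a trajectory. All function names and numeric values are illustrative assumptions.

```python
import random


def collect_data(a_true, b_true, noise_std, n_steps, rng):
    """Roll out a random control policy on a hypothetical scalar system."""
    data, x = [], 0.0
    for _ in range(n_steps):
        u = rng.uniform(-1.0, 1.0)  # initial random control policy
        x_next = a_true * x + b_true * u + rng.gauss(0.0, noise_std)
        data.append((x, u, x_next))
        x = x_next
    return data


def fit_model(data):
    """Least-squares fit of x_next ~ a*x + b*u; residual std as uncertainty."""
    sxx = sum(x * x for x, u, _ in data)
    sxu = sum(x * u for x, u, _ in data)
    suu = sum(u * u for x, u, _ in data)
    sxy = sum(x * y for x, _, y in data)
    suy = sum(u * y for _, u, y in data)
    det = sxx * suu - sxu * sxu
    a = (sxy * suu - suy * sxu) / det
    b = (suy * sxx - sxy * sxu) / det
    resid = [y - (a * x + b * u) for x, u, y in data]
    sigma = (sum(r * r for r in resid) / (len(data) - 2)) ** 0.5
    return a, b, sigma


def worst_case_deviation(a, b, sigma, k, eps):
    """Largest next-step |deviation| over the set |e| <= eps, |w| <= 2*sigma.

    With local feedback u = u_nom + k*e, the deviation from the nominal state
    evolves as e_next = (a + b*k)*e + w. The map is linear in (e, w), so the
    maximum over the box is attained at one of its corners.
    """
    corners = [(e, w) for e in (-eps, eps) for w in (-2 * sigma, 2 * sigma)]
    return max(abs((a + b * k) * e + w) for e, w in corners)


def robust_gain(a, b, sigma, eps, gains):
    """Grid search (an assumed stand-in for NLP) for the most robust gain."""
    return min(gains, key=lambda k: worst_case_deviation(a, b, sigma, k, eps))


rng = random.Random(0)
data = collect_data(a_true=0.9, b_true=0.5, noise_std=0.01, n_steps=200, rng=rng)
a_hat, b_hat, sigma_hat = fit_model(data)

gains = [g / 100.0 for g in range(-300, 301)]
k_best = robust_gain(a_hat, b_hat, sigma_hat, eps=0.1, gains=gains)
print(a_hat, b_hat, sigma_hat, k_best)
```

In this sketch the chosen gain drives the closed-loop factor `a_hat + b_hat * k_best` toward zero, so the remaining worst-case deviation is dominated by the estimated model uncertainty. A fuller treatment would optimize the nominal trajectory and the gain together under the robustness constraint.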