Computerized determination of control strategy for technical system involves using reinforcement learning to determine control strategy for each state and learn optimal actions

Abstract

The method describes the technical system using a continuous state space and an action space. The state space contains the states that the technical system can adopt, and the action space contains the actions that are carried out to produce a state change from a previous state to a subsequent state in the state space. Each state change is assessed. A model of the technical system is determined from training data describing the system by forming fuzzy association functions that describe at least the state space. The fuzzy association functions are fed to a reinforcement learning method, which determines a control strategy for each state in the state space and learns the optimal actions in the action space. Independent claims are also included for a fuzzy controller for determining a control strategy for a technical system and for a computer-readable storage medium.
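
The pipeline in the abstract (fuzzy functions over a continuous state space, an assessment of each state change, and reinforcement learning from training data) can be illustrated with a minimal sketch. It is not the patented method: it assumes the "fuzzy association functions" correspond to triangular fuzzy membership functions, substitutes plain Q-learning for the unspecified reinforcement learning method, and uses a made-up one-dimensional system. All names and values (`CENTERS`, `ACTIONS`, `step`, `train`, `policy`, the reward) are hypothetical.

```python
import random

# Hypothetical toy setup; none of these names or values come from the patent.
CENTERS = [0.0, 0.25, 0.5, 0.75, 1.0]  # peaks of the fuzzy sets over the state space [0, 1]
ACTIONS = [-0.1, 0.0, 0.1]             # small discrete action space
GAMMA, ALPHA = 0.9, 0.1                # discount factor and learning rate


def memberships(x):
    """Triangular membership degree of state x in each fuzzy set (degrees sum to 1 on [0, 1])."""
    width = CENTERS[1] - CENTERS[0]
    return [max(0.0, 1.0 - abs(x - c) / width) for c in CENTERS]


def step(x, a):
    """Toy system: move along [0, 1]; the assessment (reward) favours states near 0.5."""
    nxt = min(1.0, max(0.0, x + a))
    return nxt, -abs(nxt - 0.5)


def q_value(q, x, a_idx):
    """Q(x, a) interpolated from per-fuzzy-set values, weighted by membership."""
    return sum(m * q[i][a_idx] for i, m in enumerate(memberships(x)))


def make_training_data(n=1000, seed=0):
    """Recorded transitions (state, action index, reward, next state) from random actions."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.random()
        a_idx = rng.randrange(len(ACTIONS))
        nxt, r = step(x, ACTIONS[a_idx])
        data.append((x, a_idx, r, nxt))
    return data


def train(data, epochs=30):
    """Q-learning sweeps over the training data, updating each fuzzy set by its membership."""
    q = [[0.0] * len(ACTIONS) for _ in CENTERS]
    for _ in range(epochs):
        for x, a_idx, r, nxt in data:
            target = r + GAMMA * max(q_value(q, nxt, b) for b in range(len(ACTIONS)))
            error = target - q_value(q, x, a_idx)
            for i, m in enumerate(memberships(x)):
                q[i][a_idx] += ALPHA * m * error
    return q


def policy(q, x):
    """Greedy control strategy: the action with the highest interpolated value at state x."""
    best = max(range(len(ACTIONS)), key=lambda b: q_value(q, x, b))
    return ACTIONS[best]


q_table = train(make_training_data())
```

Calling `policy(q_table, x)` for any `x` in [0, 1] returns the learned action for that state. Because the value estimate is interpolated between neighbouring fuzzy sets, the strategy is defined for every point of the continuous state space rather than only for a fixed grid of states.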

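Bibliographic details

  • Publication number: DE10021929A1
  • Patent type: not stated
  • Publication date: 2001-11-15
  • Original format: PDF
  • Applicant/assignee: SIEMENS AG
  • Application number: DE2000121929
  • Inventor: APPL MARTIN
  • Filing date: 2000-05-05
  • Classification: G05B13/02; G05B17/00; G06N7/02
  • Country: DE
  • Added to database: 2022-08-22 00:27:47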
