
Hybrid estimation of transition probability values in Markov decision processes


Abstract

According to some embodiments of the present invention, there is provided a method for determining a control action in a control system using a Markov decision process (MDP). The method comprises receiving measured transition probability values of the MDP and receiving simulated transition probability values generated by performing a control system simulation. A measured data count is calculated from some of the sensor measurements, and a simulated data count is calculated from some of the simulated transition data. New transition probability values are then computed as a weighted average of the measured transition probability values and the simulated transition probability values, using the measured data count and the simulated data count. A new control action is determined based on the one or more new transition probability values.
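The weighted-average step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the exact weighting formula, so the count-proportional weights used here, n_measured / (n_measured + n_simulated) for the measured values and the remainder for the simulated values, are an assumption. The function names `blend_transition_probs` and `greedy_policy`, the list-based data layout, the reward table `R`, and the use of value iteration to pick the control action are likewise hypothetical choices made for illustration, not details taken from the patent.

```python
def blend_transition_probs(p_measured, p_simulated, n_measured, n_simulated):
    """Count-weighted average of two next-state distributions.

    Assumption: weights are proportional to the data counts; the
    abstract only says the counts are used in the weighted average.
    """
    total = n_measured + n_simulated
    if total == 0:
        raise ValueError("at least one data count must be positive")
    w = n_measured / total  # weight given to the measured values
    return [w * pm + (1.0 - w) * ps
            for pm, ps in zip(p_measured, p_simulated)]


def greedy_policy(P, R, gamma=0.9, sweeps=200):
    """Pick one action per state by value iteration (illustrative only).

    P[a][s] is the list of next-state probabilities for action a in
    state s; R[a][s] is the expected immediate reward (hypothetical).
    """
    n_actions, n_states = len(P), len(P[0])

    def q(a, s, V):
        # Expected return of taking action a in state s, then following V.
        return R[a][s] + gamma * sum(p * v for p, v in zip(P[a][s], V))

    V = [0.0] * n_states
    for _ in range(sweeps):
        V = [max(q(a, s, V) for a in range(n_actions))
             for s in range(n_states)]
    return [max(range(n_actions), key=lambda a: q(a, s, V))
            for s in range(n_states)]


# Example: 30 measured transitions and 10 simulated ones for one
# (state, action) pair, so the measured values get weight 0.75.
blended = blend_transition_probs([0.8, 0.2], [0.5, 0.5], 30, 10)
```

Because both inputs are probability distributions and the weights sum to one, the blended values also sum to one. With count-proportional weights, the measured values dominate as more sensor data accumulates, while the simulation supplies the estimate for transitions that have rarely or never been observed.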
