Conference: ASME Dynamic Systems and Control Conference

CONVERGENCE PROPERTIES OF A COMPUTATIONAL LEARNING MODEL FOR UNKNOWN MARKOV CHAINS

Abstract

The increasing complexity of engineering systems has motivated continuing research on computational learning methods for building autonomous intelligent systems that learn to improve their performance over time while interacting with their environment. Such systems must not only sense their environment but also integrate information from it into all decision making. The evolution of such a system is modeled as an unknown controlled Markov chain. In previous research, the predictive optimal decision-making (POD) model was developed to learn, in real time, the unknown transition probabilities and associated costs over a varying finite time horizon. In this paper, the convergence of POD to the stationary distribution of a Markov chain is proven, establishing POD as a robust model for building autonomous intelligent systems. The paper also provides the conditions under which POD is valid and an interpretation of its underlying structure.
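The two ingredients named in the abstract, learning unknown transition probabilities from observation and convergence to a stationary distribution, can be illustrated with a minimal sketch. This is not the POD model, whose details are not given here. It is a generic count-based estimator on a hypothetical three-state, uncontrolled chain; the matrix `P_TRUE` and all function names are assumptions made for illustration only.

```python
import random

def simulate(P, start, steps, rng):
    """Sample a trajectory of state indices from transition matrix P."""
    state, path = start, [start]
    for _ in range(steps):
        state = rng.choices(range(len(P)), weights=P[state])[0]
        path.append(state)
    return path

def estimate_transitions(path, n):
    """Estimate transition probabilities as normalized transition counts."""
    counts = [[0] * n for _ in range(n)]
    for a, b in zip(path, path[1:]):
        counts[a][b] += 1
    est = []
    for row in counts:
        total = sum(row)
        # Fall back to a uniform row for states that were never left.
        est.append([c / total if total else 1.0 / n for c in row])
    return est

def stationary(P, iters=1000):
    """Power iteration for pi = pi P (assumes an ergodic chain)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def empirical_frequencies(path, n):
    """Fraction of time steps the trajectory spent in each state."""
    return [path.count(s) / len(path) for s in range(n)]

# Hypothetical chain, unknown to the learner; chosen only for illustration.
P_TRUE = [[0.9, 0.1, 0.0],
          [0.2, 0.6, 0.2],
          [0.1, 0.3, 0.6]]

rng = random.Random(0)
path = simulate(P_TRUE, start=0, steps=50_000, rng=rng)
P_HAT = estimate_transitions(path, 3)      # learned transition probabilities
PI_TRUE = stationary(P_TRUE)               # stationary distribution of the chain
PI_EMP = empirical_frequencies(path, 3)    # long-run state visit frequencies
```

For an ergodic chain such as this one, the visit frequencies in `PI_EMP` approach `PI_TRUE` as the trajectory grows, and the rows of `P_HAT` approach those of `P_TRUE`. A controlled chain would additionally index the counts by the action taken.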
