Journal: Stochastic Analysis and Applications

Continuous-Time Markov Decision Processes with Unbounded Transition and Discounted-Reward Rates

Abstract

In this article, we study continuous-time Markov decision processes in Polish spaces. The criterion to be maximized is the expected discounted reward. The transition rates may be unbounded, and the reward rates may have neither upper nor lower bounds. We provide conditions on the controlled system's primitive data under which the transition functions of possibly non-homogeneous continuous-time Markov processes are regular; the proof uses Feller's construction of such transition functions. Then, under continuity and compactness conditions, we prove the existence of optimal stationary policies using extended infinitesimal operators associated with these transition functions, and we provide a recursive way to compute (or at least approximate) the optimal reward values. The conditions in this article differ from those used in the previous literature and are illustrated with an example.
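
To make the idea of recursively approximating optimal discounted reward values concrete, here is a minimal sketch in a much simpler setting than the one the abstract treats. It assumes a finite state space and bounded transition rates, so that the standard uniformization trick applies, and it is not the recursion from the article, which handles Polish spaces and unbounded rates. The function name `solve_discounted_ctmdp`, the data layout, and the two-state example are all hypothetical.

```python
def solve_discounted_ctmdp(q, r, alpha, tol=1e-10, max_iter=100_000):
    """Value iteration for a finite discounted CTMDP via uniformization.

    Assumptions (illustrative, not from the article):
      q[a][i][j] : transition rate from state i to j under action a,
                   with q[a][i][i] = -(sum of the off-diagonal rates in row i)
      r[a][i]    : reward rate in state i under action a
      alpha      : discount rate, alpha > 0
    Returns (values, policy) approximately solving
      alpha * V(i) = max_a [ r(i, a) + sum_j q(j | i, a) * V(j) ].
    """
    n_actions = len(q)
    n_states = len(q[0])
    # Uniformization constant: strictly larger than every exit rate.
    lam = max(-q[a][i][i] for a in range(n_actions) for i in range(n_states)) + 1.0
    beta = lam / (alpha + lam)  # effective discrete-time discount factor
    values = [0.0] * n_states
    policy = [0] * n_states
    for _ in range(max_iter):
        new_values = []
        new_policy = []
        for i in range(n_states):
            best_val, best_a = None, None
            for a in range(n_actions):
                # Expectation of V under the uniformized kernel P = I + Q / lam.
                expected = values[i] + sum(
                    q[a][i][j] * values[j] for j in range(n_states)
                ) / lam
                val = r[a][i] / (alpha + lam) + beta * expected
                if best_val is None or val > best_val:
                    best_val, best_a = val, a
            new_values.append(best_val)
            new_policy.append(best_a)
        diff = max(abs(x - y) for x, y in zip(new_values, values))
        values, policy = new_values, new_policy
        if diff < tol:
            break
    return values, policy


# Hypothetical two-state, two-action example.
q_rates = [
    [[-1.0, 1.0], [2.0, -2.0]],   # action 0
    [[-3.0, 3.0], [0.5, -0.5]],   # action 1
]
reward_rates = [
    [1.0, 0.0],                   # action 0
    [0.5, 2.0],                   # action 1
]
alpha = 0.1

values, policy = solve_discounted_ctmdp(q_rates, reward_rates, alpha)
print("approximate optimal values:", values)
print("greedy stationary policy:", policy)
```

Each sweep is a contraction with modulus `lam / (alpha + lam)`, so the iterates converge geometrically in this bounded-rate toy case. With unbounded rates no finite `lam` exists, which is one reason the article needs different conditions and techniques.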