Continuous-Time Markov Decision Processes with Unbounded Transition and Discounted-Reward Rates

Hao Yan; Junyu Zhang; Xianping Guo

Abstract

In this article, we study continuous-time Markov decision processes in Polish spaces. The optimality criterion to be maximized is the expected discounted criterion. The transition rates may be unbounded, and the reward rates may have neither upper nor lower bounds. We provide conditions on the controlled system's primitive data under which we prove that the transition functions of possibly non-homogeneous continuous-time Markov processes are regular by using Feller's construction approach to such transition functions. Then, under continuity and compactness conditions we prove the existence of optimal stationary policies by using the technique of extended infinitesimal operators associated with the transition functions of possibly non-homogeneous continuous-time Markov processes, and also provide a recursive way to compute (or at least to approximate) the optimal reward values. The conditions provided in this paper are different from those used in the previous literature, and they are illustrated with an example.
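
The abstract mentions a recursive way to compute or approximate the optimal reward values but does not state the recursion. As a rough illustration only, the sketch below applies one standard successive-approximation scheme to a finite-state toy model, which is far simpler than the Polish-space setting of the paper and should not be read as the authors' construction. It iterates toward a solution of the discounted optimality equation alpha * V(i) = max_a { r(i,a) + sum_j q(j|i,a) V(j) }. The function name, the data layout `q[i][a][j]` and `r[i][a]`, and the toy numbers are all assumptions made for this example.

```python
from math import inf


def discounted_value_iteration(q, r, alpha, tol=1e-10, max_iter=100_000):
    """Successive approximation for a finite-state discounted CTMDP (sketch).

    q[i][a][j] : transition rate from state i to j under action a
                 (each row sums to 0, with q[i][a][i] <= 0)
    r[i][a]    : reward rate in state i under action a
    alpha      : discount rate, alpha > 0
    """
    n = len(q)
    # m[i] strictly dominates the exit rate -q[i][a][i] for every action a.
    # It depends on the state i, so no uniform bound on the rates is needed.
    m = [max(-q[i][a][i] for a in range(len(q[i]))) + 1.0 for i in range(n)]
    v = [0.0] * n
    policy = [0] * n
    for _ in range(max_iter):
        new_v = [0.0] * n
        for i in range(n):
            best = -inf
            for a in range(len(q[i])):
                # m[i] * sum_j P(j|i,a) * v[j], where
                # P(j|i,a) = q[i][a][j] / m[i] + (1 if j == i else 0)
                drift = m[i] * v[i] + sum(q[i][a][j] * v[j] for j in range(n))
                value = (r[i][a] + drift) / (alpha + m[i])
                if value > best:
                    best, policy[i] = value, a
            new_v[i] = best
        gap = max(abs(x - y) for x, y in zip(new_v, v))
        v = new_v
        if gap < tol:
            break
    return v, policy


# Hypothetical toy data: 2 states; state 0 has two actions, state 1 has one.
q = [
    [[-1.0, 1.0], [-3.0, 3.0]],
    [[2.0, -2.0]],
]
r = [
    [1.0, 2.5],
    [0.0],
]
alpha = 0.5
v, policy = discounted_value_iteration(q, r, alpha)
```

In this finite toy case the update is a contraction with factor max_i m(i) / (alpha + m(i)) < 1, so the iterates converge geometrically. With genuinely unbounded rates that factor can approach 1, which is one reason the general theory needs additional conditions of the kind the abstract refers to.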

Bibliographic details

Source
Stochastic Analysis and Applications, 2008, Issue 2, 23 pages
Authors
Hao Yan; Junyu Zhang; Xianping Guo
Original format
PDF
Language
English
CLC classification
Probability theory and mathematical statistics
Keywords
Discounted reward criterion; General state space; Optimal stationary policy; Q-process
