
Countable State Discounted Markovian Decision Processes with Unbounded Rewards

Abstract

Countable state, finite action Markov decision processes are investigated under the criterion of maximizing expected discounted reward over an infinite planning horizon. Well-known results of Maitra and Blackwell are generalized, their assumption of bounded rewards being replaced by the following weaker condition: the expected absolute reward to be received at time n+1 minus the actual absolute reward received at time n (as a function of the state of the system at time n, the action taken at time n, and the decision rule to be followed at time n+1) can be bounded above. Under this condition it is shown that the expected discounted reward (over the infinite planning horizon) from each policy is finite and that there exists a stationary policy which is optimal. Additional results are presented concerning the policy improvement and successive approximations algorithms for computing optimal policies. All of these results are extended to Markov renewal decision processes under one additional condition on the transition time distributions. As in Blackwell's work on discounted dynamic programming, a central role is played by Banach's fixed point theorem for contraction mappings. Examples are presented of inventory and queueing control problems which satisfy our assumptions but do not exhibit bounded rewards. (Author)
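To make the successive-approximations idea mentioned above concrete, here is a minimal sketch of value iteration for a discounted MDP, in which the Bellman operator is iterated until the contraction-mapping fixed point is approximated. The report itself treats countable state spaces with possibly unbounded rewards; this sketch assumes a small finite state and action set, and the transition matrix, rewards, and discount factor below are purely hypothetical illustrations, not data from the report.

```python
# Minimal sketch: successive approximations (value iteration) for a *finite*
# discounted MDP.  All concrete numbers below are hypothetical.
import numpy as np

def value_iteration(P, r, beta, tol=1e-8, max_iter=10_000):
    """Iterate the Bellman operator
        (T v)(s) = max_a [ r(s, a) + beta * sum_{s'} P(s' | s, a) v(s') ].

    P    : array of shape (A, S, S), P[a, s, s'] = transition probability
    r    : array of shape (A, S),    r[a, s]     = one-step reward
    beta : discount factor in (0, 1); T is a beta-contraction in the sup norm,
           so by Banach's fixed point theorem the iterates converge to the
           unique fixed point v*, the optimal discounted value function.
    """
    A, S, _ = P.shape
    v = np.zeros(S)
    for _ in range(max_iter):
        q = r + beta * (P @ v)        # q[a, s] = r(s, a) + beta * E[v(next state)]
        v_new = q.max(axis=0)         # apply T: maximize over actions
        if np.max(np.abs(v_new - v)) < tol:
            v = v_new
            break
        v = v_new
    policy = q.argmax(axis=0)         # a stationary policy attaining the maximum
    return v, policy

# Tiny illustrative instance: 2 actions, 3 states (hypothetical data).
P = np.array([
    [[0.7, 0.2, 0.1],
     [0.1, 0.8, 0.1],
     [0.2, 0.3, 0.5]],
    [[0.3, 0.3, 0.4],
     [0.5, 0.4, 0.1],
     [0.1, 0.1, 0.8]],
])
r = np.array([[1.0, 0.5, 2.0],
              [0.8, 1.5, 0.2]])

v_star, pi_star = value_iteration(P, r, beta=0.9)
print("approximate optimal values:", v_star)
print("stationary policy:", pi_star)
```

In the setting of the report, the same iteration is carried out on a suitable space of (possibly unbounded) value functions; the weaker reward condition quoted in the abstract is what keeps the relevant operator a contraction there.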
