
Multistage decisions and risk in Markov decision processes: Towards effective approximate dynamic programming architectures.



Abstract

The scientific domain of this thesis is optimization under uncertainty for discrete event stochastic systems. In particular, this thesis focuses on the practical implementation of the Dynamic Programming (DP) methodology for discrete event stochastic systems. Unfortunately, DP in its crude form suffers from three severe computational obstacles that make its application to such systems intractable. This thesis addresses these obstacles by developing and executing practical Approximate Dynamic Programming (ADP) techniques.

Specifically, for the purposes of this thesis we developed the following ADP techniques. The first is inspired by the Reinforcement Learning (RL) literature and is termed Real Time Approximate Dynamic Programming (RTADP). The RTADP algorithm is intended for active learning while operating the stochastic system: as the agent constantly interacts with the uncertain environment, it accumulates experience that enables it to react more optimally in future similar situations. The second is an off-line ADP procedure. Both approaches are developed for discrete event stochastic systems, and their main focus is the controlled exploration of the state space, thereby circumventing one of the severe computational obstacles of DP, namely the one related to the cardinality of the state space.

These ADP techniques are demonstrated on a variety of discrete event stochastic systems, such as: (i) a three-stage queuing manufacturing network with recycle, (ii) a supply chain for the light aromatics of a typical refinery, and (iii) several stochastic shortest path instances with a single starting and terminal state.

Moreover, this work addresses, in a systematic way, the issue of multistage risk within the DP framework by exploring the use of single-period and multi-period risk-sensitive utility functions.
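The RTADP idea sketched above — performing Bellman backups only at states the agent actually visits while operating the system, with controlled exploration — can be illustrated on a small stochastic shortest path instance with a single starting and terminal state. The MDP below (states, transition probabilities, costs, and the ε-greedy exploration scheme) is a hypothetical toy example for illustration, not one of the thesis's actual test instances.

```python
import random

# Illustrative stochastic shortest path: states 0..4, state 4 terminal.
# trans[s][a] = list of (probability, next_state, cost); numbers are made up.
trans = {
    0: {"a": [(0.8, 1, 1.0), (0.2, 0, 1.0)], "b": [(1.0, 2, 2.0)]},
    1: {"a": [(0.7, 3, 1.0), (0.3, 1, 1.0)], "b": [(1.0, 2, 1.5)]},
    2: {"a": [(0.9, 4, 1.0), (0.1, 2, 1.0)]},
    3: {"a": [(1.0, 4, 0.5)]},
}
TERMINAL = 4
V = {s: 0.0 for s in list(trans) + [TERMINAL]}  # value estimates, built online

def backup(s):
    """Bellman backup at a single visited state; returns the greedy action."""
    best_a, best_q = None, float("inf")
    for a, outcomes in trans[s].items():
        q = sum(p * (c + V[ns]) for p, ns, c in outcomes)
        if q < best_q:
            best_a, best_q = a, q
    V[s] = best_q
    return best_a

def run_episode(eps=0.1, max_steps=50):
    s = 0
    for _ in range(max_steps):
        if s == TERMINAL:
            break
        a = backup(s)                 # update only the state we are in
        if random.random() < eps:     # controlled exploration
            a = random.choice(list(trans[s]))
        # sample the next state from the chosen action's outcome distribution
        r, acc = random.random(), 0.0
        for p, ns, c in trans[s][a]:
            acc += p
            if r <= acc:
                break
        s = ns                        # falls back to the last outcome on float slack

random.seed(0)
for _ in range(200):
    run_episode()
```

Only states reachable under the learned policy (plus exploratory excursions) ever receive backups, which is what sidesteps enumerating the full state space.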
In this thesis we propose a special structure for a single-period utility and compare the derived policies on several multistage instances. Finally, we briefly attempt to integrate the developed ADP procedures with the proposed utility to yield risk-sensitive ADP policies.
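A standard way to make a single-period decision risk-sensitive — shown here only as a generic illustration, since the abstract does not reproduce the thesis's actual utility structure — is the exponential utility, whose certainty equivalent charges a lottery a premium for cost variance. The two cost lotteries below are hypothetical:

```python
import math

def certainty_equivalent(outcomes, gamma):
    """Certainty equivalent of a cost lottery under exponential disutility.

    outcomes: list of (probability, cost); gamma > 0 means risk-averse.
    CE = (1/gamma) * log E[exp(gamma * cost)]; as gamma -> 0 it tends to E[cost].
    """
    return math.log(sum(p * math.exp(gamma * c) for p, c in outcomes)) / gamma

# Two hypothetical actions with the same expected cost of 5.0:
safe  = [(1.0, 5.0)]                 # deterministic cost
risky = [(0.5, 0.0), (0.5, 10.0)]    # high-variance cost

# A risk-neutral agent is indifferent; under gamma = 0.5 the risky lottery
# is penalized for its variance, so the safe action is preferred.
ce_safe  = certainty_equivalent(safe, 0.5)   # exactly 5.0
ce_risky = certainty_equivalent(risky, 0.5)  # well above 5.0
```

Using such a certainty equivalent in place of the expected cost inside each Bellman backup is one common route to risk-sensitive DP policies.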

Record

  • Author

    Pratikakis, Nikolaos E.

  • Author affiliation

    Georgia Institute of Technology.

  • Degree grantor Georgia Institute of Technology.
  • Subject Engineering, Chemical; Engineering, Industrial; Operations Research.
  • Degree Ph.D.
  • Year 2009
  • Pagination 215 p.
  • Total pages 215
  • Format PDF
  • Language eng
