Symbolic Dynamic Programming for Continuous State and Action MDPs


Abstract

Many real-world decision-theoretic planning problems are naturally modeled using both continuous state and action (CSA) spaces, yet little work has provided exact solutions for the case of continuous actions. In this work, we propose a symbolic dynamic programming (SDP) solution to obtain the optimal closed-form value function and policy for CSA-MDPs with multivariate continuous state and actions, discrete noise, piecewise linear dynamics, and piecewise linear (or restricted piecewise quadratic) reward. Our key contribution over previous SDP work is to show how the continuous action maximization step in the dynamic programming backup can be evaluated optimally and symbolically, a task which amounts to symbolic constrained optimization subject to unknown state parameters. We further integrate this technique with an efficient and compact data structure for SDP, the extended algebraic decision diagram (XADD). We demonstrate empirical results on a didactic nonlinear planning example and two domains from operations research to show the first automated exact solution to these problems.
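
As a rough illustration of the continuous action maximization step mentioned in the abstract, the Python sketch below handles only the simplest case: a single piece in which the Q-value is linear in a scalar action a and a scalar state x, and the action bounds are linear in x. The `Lin` class, the `max_over_action` function, and the example numbers are hypothetical and are not taken from the paper, whose method works on XADDs over multivariate case statements.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Lin:
    """Linear expression slope * x + intercept in the state variable x."""
    slope: Fraction
    intercept: Fraction

    def __call__(self, x):
        return self.slope * x + self.intercept

    def scale(self, k):
        return Lin(self.slope * k, self.intercept * k)

    def __add__(self, other):
        return Lin(self.slope + other.slope, self.intercept + other.intercept)


def max_over_action(coef_a, rest, lower, upper):
    """Maximise coef_a * a + rest(x) over lower(x) <= a <= upper(x).

    The state x stays symbolic: the result is a pair (value, argmax),
    both linear expressions in x. Assumes lower(x) <= upper(x) holds on
    the state region of interest (an assumption of this sketch).
    """
    # A function linear in a attains its maximum at a bound of the interval:
    # the upper bound if the coefficient of a is non-negative, else the lower.
    best = upper if coef_a >= 0 else lower
    # Substitute the chosen bound for a to obtain the value as a function of x.
    return best.scale(coef_a) + rest, best


# Hypothetical example: Q(x, a) = 2a - x + 1 with 0 <= a <= 10 - x.
value, policy = max_over_action(
    coef_a=Fraction(2),
    rest=Lin(Fraction(-1), Fraction(1)),
    lower=Lin(Fraction(0), Fraction(0)),
    upper=Lin(Fraction(-1), Fraction(10)),
)
print(value)   # value function as a linear expression in x
print(policy)  # maximising action as a linear expression in x
```

In this toy case the maximising action is the upper bound 10 - x, and substituting it gives the value 21 - 3x. With several pieces or quadratic terms, each piece would contribute further candidates (bounds and stationary points) that must be compared symbolically, which this sketch does not attempt.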


Bibliographic details

Source: IAAI-12; Innovative applications of artificial intelligence conference; AAAI conference on artificial intelligence; Symposium on educational advances in artificial intelligence; AAAI-12; EAAI-12. 2012, pp. 1839-1845 (7 pages)
Authors: Zahra Zamani; Scott Sanner; Cheng Fang
Original format: PDF
Chinese Library Classification: Artificial intelligence theory

