Planning in Factored Action Spaces with Symbolic Dynamic Programming

Abstract

We consider symbolic dynamic programming (SDP) for solving Markov Decision Processes (MDPs) with factored state and action spaces, where both states and actions are described by sets of discrete variables. Prior work on SDP has considered only the case of factored states and ignored structure in the action space, causing those algorithms to scale poorly in the number of action variables. Our main contribution is to present the first SDP-based planning algorithm that leverages both state and action space structure to compute compactly represented value functions and policies. Since our new algorithm can potentially require more space than when action structure is ignored, our second contribution is an approach for smoothly trading off space versus time via recursive conditioning. Finally, our third contribution is a novel SDP approximation that exploits action structure in weakly coupled MDPs, often reducing planning time significantly with little loss in quality. We present empirical results in three domains with factored action spaces showing that our algorithms scale much better with the number of action variables than state-of-the-art SDP algorithms.

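For context on the computation involved: SDP performs the Bellman backup of value iteration symbolically, operating on compact representations of the reward, transition, and value functions rather than enumerating states and joint actions. A minimal sketch of that backup in standard factored-MDP notation (the notation below is the generic formulation, not taken verbatim from the paper):

\[
V^{t+1}(x) \;=\; \max_{a_1,\dots,a_m} \Big[\, R(x,a) \;+\; \gamma \sum_{x'} \prod_{i=1}^{n} P\big(x_i' \mid \mathrm{pa}(x_i'),\, a\big)\, V^{t}(x') \,\Big],
\qquad x = (x_1,\dots,x_n),\quad a = (a_1,\dots,a_m).
\]

With m binary action variables, the outer max ranges over up to 2^m joint actions, which is why methods that treat the action space as flat scale poorly in the number of action variables; the abstract's first contribution is to carry out this maximization symbolically as well, exploiting structure among the action variables.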