
Representations of Decision-Theoretic Planning Tasks


Abstract

Goal-directed Markov Decision Process models (GDMDPs) are good models for many decision-theoretic planning tasks. They have been used in conjunction with two different reward structures, namely the goal-reward representation and the action-penalty representation. We apply GDMDPs to planning tasks in the presence of traps such as steep slopes for outdoor robots or staircases for indoor robots, and study the differences between the two reward structures. In these situations, achieving the goal is often the primary objective, while minimizing the travel time is only of secondary importance. We show that the action-penalty representation without discounting guarantees that the optimal plan achieves the goal for sure (if this is possible), but neither the action-penalty representation with discounting nor the goal-reward representation with discounting has this property. We then show exactly when this trapping phenomenon occurs, using a novel interpretation of discounting, namely that it models agents that use convex exponential utility functions and are thus optimistic in the face of uncertainty. Finally, we show how the trapping phenomenon can be eliminated with our Selective State-Deletion Method.
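As a point of reference, the two reward structures can be sketched in standard MDP notation; the definitions below reflect common usage of these terms and are not taken verbatim from the paper.

Goal-reward representation (with discount factor $0 < \gamma < 1$): the agent receives $r(s, a, s') = 1$ when the successor state $s'$ is a goal state and $0$ otherwise, and values satisfy
$$V(s) = \max_a \sum_{s'} P(s' \mid s, a)\,\bigl[r(s, a, s') + \gamma V(s')\bigr].$$

Action-penalty representation (undiscounted): every action executed in a non-goal state incurs reward $-1$, goal states are absorbing with value $0$, and
$$V(s) = \max_a \Bigl[-1 + \sum_{s'} P(s' \mid s, a)\,V(s')\Bigr].$$

One way to see the connection to exponential utilities, sketched here under the assumption that every action costs exactly one unit: under the action-penalty representation with discounting, an agent that needs $n$ actions to reach the goal accumulates $-(1 - \gamma^n)/(1 - \gamma)$, so maximizing expected discounted reward is equivalent to maximizing $E[\gamma^n]$, i.e. to applying the convex (risk-seeking, hence optimistic) exponential utility $u(-n) = \gamma^n$ to the undiscounted total cost.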
