We have presented a simple, principled technique for computing a valid Dec-POMDP policy to serve as an initial policy for reinforcement learning. Furthermore, we have demonstrated how to learn such policies in a model-free manner, and we have shown on two benchmark problems that these initial policies can improve the outcome of alternating Q-learning. These results are encouraging and suggest that such policies may also be useful in other algorithms, both existing and future, that require an initial policy.