Action Selection for MDPs: Anytime AO* Versus UCT


Abstract

In the presence of non-admissible heuristics, A* and other best-first algorithms can be converted into anytime optimal algorithms over OR graphs, by simply continuing the search after the first solution is found. The same trick, however, does not work for best-first algorithms over AND/OR graphs, that must be able to expand leaf nodes of the explicit graph that are not necessarily part of the best partial solution. Anytime optimal variants of AO* must thus address an exploration-exploitation tradeoff: they cannot just "exploit", they must keep exploring as well. In this work, we develop one such variant of AO* and apply it to finite-horizon MDPs. This Anytime AO* algorithm eventually delivers an optimal policy while using non-admissible random heuristics that can be sampled, as when the heuristic is the cost of a base policy that can be sampled with rollouts. We then test Anytime AO* for action selection over large infinite-horizon MDPs that cannot be solved with existing off-line heuristic search and dynamic programming algorithms, and compare it with UCT.
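The abstract mentions heuristics that are obtained by sampling the cost of a base policy with rollouts. A minimal sketch of that idea follows, under stated assumptions: the toy chain MDP, the `step` dynamics, the `base_policy`, and all function names are hypothetical illustrations and are not taken from the paper. The sketch only shows how a random, generally non-admissible value estimate can be sampled for a finite horizon; it is not the Anytime AO* algorithm itself.

```python
import random


def step(state, action, rng):
    """Hypothetical toy dynamics: a walk on the non-negative integers, goal at 0.

    The chosen move succeeds with probability 0.8; otherwise the agent stays put.
    Every step taken from a non-goal state costs 1; the goal is absorbing and free.
    """
    if state == 0:
        return 0, 0.0
    if rng.random() < 0.8:
        state = max(0, state + action)
    return state, 1.0


def base_policy(state):
    """Hypothetical base policy: always move toward the goal."""
    return -1


def rollout_cost(state, horizon, rng):
    """Accumulated cost of one sampled rollout of the base policy over `horizon` steps."""
    total = 0.0
    for _ in range(horizon):
        state, cost = step(state, base_policy(state), rng)
        total += cost
    return total


def sampled_heuristic(state, horizon, rng, n_rollouts=20):
    """Average rollout cost: a random estimate that need not be a lower bound."""
    samples = [rollout_cost(state, horizon, rng) for _ in range(n_rollouts)]
    return sum(samples) / n_rollouts


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same estimate
    print(sampled_heuristic(3, horizon=10, rng=rng))
```

Because each estimate is the cost of following some policy, it tends to overestimate the optimal cost rather than bound it from below, which is why such a heuristic is not admissible in general.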


Bibliographic details

Source
2012 | pp. 1749-1755 | 7 pages
Authors
Blai Bonet; Hector Geffner
Original format: PDF
Chinese Library Classification: Theory of artificial intelligence

Action Selection for MDPs: Anytime AO* Versus UCT [C]. Blai Bonet, Hector Geffner. Innovative applications of artificial intelligence conference. 2012

