On-Line Learning with Imperfect Monitoring


Abstract

We study on-line play of repeated matrix games in which the observations of the other player's past actions and of the obtained reward are partial and stochastic. We define the Partial Observation Bayes Envelope (POBE) as the best reward against the worst-case stationary strategy of the opponent that agrees with past observations. Our goal is to keep the (unobserved) average reward above the POBE. For the case where the observations (but not necessarily the rewards) depend on the opponent's play alone, an algorithm for attaining the POBE is derived. This algorithm is based on an application of approachability theory combined with a worst-case view over the unobserved rewards. We also suggest a simplified solution concept for general signaling structures. This concept may fall short of the POBE.
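
The verbal definition of the POBE in the abstract can be written as a max-min expression. The following is only a sketch of one possible reading, for the case where the signal depends on the opponent's play alone. None of the notation comes from the abstract and all of it is assumed for illustration: the reward matrix R, the signaling matrix H, the simplices \Delta(A) and \Delta(B), the empirical signal distribution \hat{s}_t, and the asymptotic "almost surely" form of the goal. The paper's own definitions may differ.

```latex
% Assumed notation (not from the abstract):
%   R          reward matrix of the repeated matrix game
%   p, q       mixed actions of the learner and of the opponent
%   H          signaling matrix: H q is the signal distribution induced by q
%   \hat{s}_t  empirical distribution of the signals observed up to time t
%   r_\tau     (unobserved) reward obtained at stage \tau

% Opponent stationary strategies that agree with past observations:
\[
  Q(\hat{s}_t) \;=\; \{\, q \in \Delta(B) \;:\; H q = \hat{s}_t \,\}
\]

% POBE: best reward against the worst-case consistent stationary strategy:
\[
  r^{*}_{\mathrm{PO}}(\hat{s}_t)
  \;=\; \max_{p \in \Delta(A)} \; \min_{q \in Q(\hat{s}_t)} \; p^{\top} R \, q
\]

% One way to state the goal "average reward above the POBE" asymptotically:
\[
  \liminf_{t \to \infty}
  \left( \frac{1}{t} \sum_{\tau=1}^{t} r_\tau \;-\; r^{*}_{\mathrm{PO}}(\hat{s}_t) \right)
  \;\ge\; 0
  \quad \text{almost surely}
\]
```

Under this reading, full monitoring would correspond to H being the identity, so that Q(\hat{s}_t) contains a single strategy and the inner minimum disappears.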


Bibliographic record

Source
16th Annual Conference on Learning Theory and 7th Kernel Workshop, COLT/Kernel 2003, Aug 24-27, 2003, Washington, DC, USA | 2003 | pp. 552-566 | 15 pages
Conference location: Washington, DC (US)
Authors
Shie Mannor; Nahum Shimkin
Author affiliation

Laboratory for Information and Decision Systems Massachusetts Institute of Technology, Cambridge, MA 02139;

Original format: PDF
Language: English
Chinese Library Classification: Automation technology, computer technology

Similar literature

1. Boundedly rational quasi-Bayesian learning in coordination games with imperfect monitoring [J]. Chen HC, Chow Y, Hsieh J. Journal of Applied Probability. 2006, No. 2
2. On-line monitoring of egg freshness using a portable NIR spectrometer in tandem with machine learning [J]. Cruz-Tirado J. P., da Silva Medeiros Maria Lucimar, Barbin Douglas Fernandes. Journal of Food Engineering. 2021, No. Octa
3. Use of Machine Learning with On-Line Monitoring Systems for Reprocessing [J]. Benjamin B. Cipiti, Nathan T. Shoman. Transactions of the American Nuclear Society. 2019, No. Juna
4. On-Line Learning with Imperfect Monitoring [C]. Shie Mannor, Nahum Shimkin. Lecture Notes in Artificial Intelligence 2777, Annual Conference on Learning Theory. 2003
5. On-line learning and wavelet-based feature extraction methodology for process monitoring using high-dimensional functional data [D]. Omitaomu, Olufemi Abayomi. 2006
6. Development and Validation of an On-Line Water Toxicity Sensor with Immobilized Luminescent Bacteria for On-Line Surface Water Monitoring [O]. Marjolijn Woutersen, Bram van der Gaag, Afua Abrafi Boakye. 2017
7. On-line learning with imperfect monitoring [O]. Shie Mannor, Nahum Shimkin. 2003
