Low-Variance and Zero-Variance Baselines for Extensive-Form Games


Abstract

Extensive-form games (EFGs) are a common model of multi-agent interactions with imperfect information. State-of-the-art algorithms for solving these games typically perform full walks of the game tree that can prove prohibitively slow in large games. Alternatively, sampling-based methods such as Monte Carlo Counterfactual Regret Minimization walk one or more trajectories through the tree, touching only a fraction of the nodes on each iteration, at the expense of requiring more iterations to converge due to the variance of sampled values. In this paper, we extend recent work that uses baseline estimates to reduce this variance. We introduce a framework of baseline-corrected values in EFGs that generalizes the previous work. Within our framework, we propose new baseline functions that result in significantly reduced variance compared to existing techniques. We show that one particular choice of such a function, the predictive baseline, is provably optimal under certain sampling schemes. This allows for efficient computation of zero-variance value estimates even along sampled trajectories.
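The idea of a baseline-corrected value can be illustrated with a minimal sketch. It assumes a single decision node with deterministic action utilities and one sampled action per iteration. The function names and the control-variate form `b + (u - b) / q` for the sampled action are illustrative assumptions, not code or notation taken from the paper.

```python
from typing import List, Tuple


def corrected_action_values(
    utilities: List[float],
    baselines: List[float],
    sample_probs: List[float],
    sampled: int,
) -> List[float]:
    """Baseline-corrected estimate for every action, given one sampled action.

    Assumed form (hypothetical helper, not from the paper):
      sampled action:   b + (u - b) / q
      unsampled action: b
    """
    values = []
    for action, (u, b, q) in enumerate(zip(utilities, baselines, sample_probs)):
        if action == sampled:
            values.append(b + (u - b) / q)
        else:
            values.append(b)
    return values


def estimator_mean_and_variance(
    utilities: List[float],
    baselines: List[float],
    policy: List[float],
    sample_probs: List[float],
) -> Tuple[float, float]:
    """Exact mean and variance of the node-value estimate.

    Enumerates every action that could be sampled instead of drawing
    random samples, so the result is deterministic.
    """
    node_estimates = []
    for sampled in range(len(utilities)):
        values = corrected_action_values(utilities, baselines, sample_probs, sampled)
        node_estimates.append(sum(p * v for p, v in zip(policy, values)))
    mean = sum(q * est for q, est in zip(sample_probs, node_estimates))
    variance = sum(
        q * (est - mean) ** 2 for q, est in zip(sample_probs, node_estimates)
    )
    return mean, variance
```

Under these assumptions, a baseline of all zeros reduces to plain importance sampling, which is unbiased but has positive variance. A baseline equal to the true action utilities gives the same node estimate whichever action is sampled, so the variance is zero.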


Bibliographic details

Source
International Conference on Machine Learning | 2021 | pp. 2344-3125 | 10 pages
Authors
Trevor Davis; Martin Schmid; Michael Bowling
Original format: PDF
Chinese Library Classification: TP181-53

