...

Combining Multiple Strategies for Multiarmed Bandit Problems and Asymptotic Optimality



Abstract

This brief paper provides a simple algorithm that, at each time step, selects a strategy from a given set of multiple strategies for stochastic multiarmed bandit problems and then plays the arm chosen by the selected strategy. The algorithm follows the idea of probabilistic ε_t-switching in the ε_t-greedy strategy and is asymptotically optimal in the sense that the selected strategy converges to the best strategy in the set, under some conditions on the strategies in the set and on the sequence {ε_t}.
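As a rough illustration of the scheme the abstract describes, the sketch below selects one of several base strategies by ε_t-switching (with probability ε_t follow a uniformly random strategy, otherwise follow the empirically best one) and then plays the arm that strategy proposes. The select_arm/update interface, the simple ε-greedy base strategies, and the 1/t schedule are illustrative assumptions, not details taken from the paper.

# Hypothetical sketch of epsilon_t-switching over a set of bandit strategies.
# Interface names and the 1/t schedule are assumptions for illustration only.
import random

class EpsilonGreedyStrategy:
    """A simple base strategy (ordinary epsilon-greedy over arms),
    included only so the sketch is self-contained."""
    def __init__(self, n_arms, eps=0.1):
        self.eps = eps
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms

    def select_arm(self):
        if random.random() < self.eps:
            return random.randrange(len(self.counts))
        return max(range(len(self.counts)), key=lambda a: self.values[a])

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

class StrategySwitcher:
    """At each time step, pick one base strategy by probabilistic
    epsilon_t-switching, then play the arm that strategy proposes."""
    def __init__(self, strategies, epsilon_schedule=lambda t: min(1.0, 1.0 / t)):
        self.strategies = strategies
        self.epsilon_schedule = epsilon_schedule
        self.plays = [0] * len(strategies)       # times each strategy was followed
        self.rewards = [0.0] * len(strategies)   # cumulative reward per strategy
        self.t = 0

    def select(self):
        self.t += 1
        eps = self.epsilon_schedule(self.t)
        untried = [k for k, p in enumerate(self.plays) if p == 0]
        if untried or random.random() < eps:
            # explore: a uniformly random strategy (trying each at least once)
            k = random.choice(untried) if untried else random.randrange(len(self.strategies))
        else:
            # exploit: the strategy with the best empirical average reward
            k = max(range(len(self.strategies)),
                    key=lambda i: self.rewards[i] / self.plays[i])
        arm = self.strategies[k].select_arm()
        return k, arm

    def update(self, k, arm, reward):
        self.plays[k] += 1
        self.rewards[k] += reward
        # every base strategy observes the played arm and its reward
        for s in self.strategies:
            s.update(arm, reward)

# Usage on a toy two-armed Bernoulli bandit:
probs = [0.3, 0.7]
switcher = StrategySwitcher([EpsilonGreedyStrategy(2, eps=0.05),
                             EpsilonGreedyStrategy(2, eps=0.3)])
for _ in range(10_000):
    k, arm = switcher.select()
    reward = 1.0 if random.random() < probs[arm] else 0.0
    switcher.update(k, arm, reward)

With a decaying schedule such as ε_t = 1/t, exploration over the strategy set vanishes as t grows, which is in the spirit of the paper's asymptotic-optimality claim that the selected strategy converges to the best one in the set.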

Bibliographic details

  • Source
    Journal of control science and engineering | 2015, Issue 2015 | 264953.1-264953.7 | 7 pages
  • Authors

    Hyeong Soo Chang; Sanghee Choe;

  • Author affiliations

    Department of Computer Science and Engineering, Sogang University, Seoul 121-742, Republic of Korea;

    Department of Computer Science and Engineering, Sogang University, Seoul 121-742, Republic of Korea;

  • Indexing
  • Format: PDF
  • Language: eng
  • CLC classification
  • Keywords

