...

Combining Multiple Strategies for Multiarmed Bandit Problems and Asymptotic Optimality



Abstract

This brief paper provides a simple algorithm that, at each time step, selects a strategy from a given set of multiple strategies for stochastic multiarmed bandit problems and then plays the arm chosen by the selected strategy. The algorithm follows the idea of probabilistic ε_t-switching in the ε_t-greedy strategy and is asymptotically optimal in the sense that the selected strategy converges to the best strategy in the set, under some conditions on the strategies in the set and on the sequence {ε_t}.
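As a rough illustration of the scheme the abstract describes, the sketch below selects one of several base strategies by ε_t-switching (with probability ε_t follow a uniformly random strategy, otherwise follow the empirically best one) and then plays the arm that strategy proposes. The select_arm/update interface, the simple ε-greedy base strategies, and the 1/t schedule are illustrative assumptions, not details taken from the paper.

# Hypothetical sketch of epsilon_t-switching over a set of bandit strategies.
# Interface names and the 1/t schedule are assumptions for illustration only.
import random

class EpsilonGreedyStrategy:
    """A simple base strategy (ordinary epsilon-greedy over arms),
    included only so the sketch is self-contained."""
    def __init__(self, n_arms, eps=0.1):
        self.eps = eps
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms

    def select_arm(self):
        if random.random() < self.eps:
            return random.randrange(len(self.counts))
        return max(range(len(self.counts)), key=lambda a: self.values[a])

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

class StrategySwitcher:
    """At each time step, pick one base strategy by probabilistic
    epsilon_t-switching, then play the arm that strategy proposes."""
    def __init__(self, strategies, epsilon_schedule=lambda t: min(1.0, 1.0 / t)):
        self.strategies = strategies
        self.epsilon_schedule = epsilon_schedule
        self.plays = [0] * len(strategies)       # times each strategy was followed
        self.rewards = [0.0] * len(strategies)   # cumulative reward per strategy
        self.t = 0

    def select(self):
        self.t += 1
        eps = self.epsilon_schedule(self.t)
        untried = [k for k, p in enumerate(self.plays) if p == 0]
        if untried or random.random() < eps:
            # explore: a uniformly random strategy (trying each at least once)
            k = random.choice(untried) if untried else random.randrange(len(self.strategies))
        else:
            # exploit: the strategy with the best empirical average reward
            k = max(range(len(self.strategies)),
                    key=lambda i: self.rewards[i] / self.plays[i])
        arm = self.strategies[k].select_arm()
        return k, arm

    def update(self, k, arm, reward):
        self.plays[k] += 1
        self.rewards[k] += reward
        # every base strategy observes the played arm and its reward
        for s in self.strategies:
            s.update(arm, reward)

# Usage on a toy two-armed Bernoulli bandit:
probs = [0.3, 0.7]
switcher = StrategySwitcher([EpsilonGreedyStrategy(2, eps=0.05),
                             EpsilonGreedyStrategy(2, eps=0.3)])
for _ in range(10_000):
    k, arm = switcher.select()
    reward = 1.0 if random.random() < probs[arm] else 0.0
    switcher.update(k, arm, reward)

With a decaying schedule such as ε_t = 1/t, exploration over the strategy set vanishes as t grows, which is in the spirit of the paper's asymptotic-optimality claim that the selected strategy converges to the best one in the set.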

Bibliographic details

  • Source
    Journal of control science and engineering | 2015, Issue 2015 | 264953.1-264953.7 | 7 pages
  • Authors

    Hyeong Soo Chang; Sanghee Choe;

  • Author affiliations

    Department of Computer Science and Engineering, Sogang University, Seoul 121-742, Republic of Korea;

    Department of Computer Science and Engineering, Sogang University, Seoul 121-742, Republic of Korea;

  • Indexing
  • Format: PDF
  • Language: eng
  • CLC classification
  • Keywords

