Thompson Sampling for Stochastic Control: The Finite Parameter Case

Michael Jong Kim

Abstract

In this paper, we apply Thompson sampling to a class of average-reward stochastic control problems with parameter uncertainty. Specifically, we study an infinite-horizon average-reward stochastic control problem in which both the reward and state transition distributions are parameterized by an unknown parameter taking values in a finite space. The main result of this paper is a proof that Thompson sampling achieves a worst-case average per-period regret of O(T^{-1}), which is asymptotically optimal.
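
For readers unfamiliar with the method, the following is a minimal sketch of Thompson sampling over a finite parameter set. It is not the paper's algorithm or model: it assumes a toy problem with a single state and Bernoulli rewards, so the state transition dynamics from the abstract are omitted. The function name `thompson_sampling_finite` and the candidate parameters are hypothetical, chosen only for illustration.

```python
import random


def thompson_sampling_finite(thetas, true_theta, horizon, seed=0):
    """Toy Thompson sampling with a finite set of candidate parameters.

    Assumption (not from the paper): each parameter is a tuple of Bernoulli
    reward means, one per action, with every mean strictly between 0 and 1.
    """
    rng = random.Random(seed)
    n = len(thetas)
    posterior = [1.0 / n] * n  # uniform prior over the finite parameter set
    total_reward = 0

    for _ in range(horizon):
        # 1. Sample a candidate parameter from the current posterior.
        k = rng.choices(range(n), weights=posterior)[0]
        # 2. Act optimally as if the sampled parameter were the true one.
        action = max(range(len(thetas[k])), key=lambda a: thetas[k][a])
        # 3. Observe a reward generated by the true (unknown) parameter.
        reward = 1 if rng.random() < true_theta[action] else 0
        total_reward += reward
        # 4. Bayes update: reweight each candidate by the observation likelihood.
        likelihoods = [t[action] if reward else 1.0 - t[action] for t in thetas]
        unnormalized = [p * l for p, l in zip(posterior, likelihoods)]
        z = sum(unnormalized)
        posterior = [u / z for u in unnormalized]

    return posterior, total_reward


# Two hypothetical candidates; the first one generates the data.
candidates = [(0.9, 0.1), (0.1, 0.9)]
posterior, total_reward = thompson_sampling_finite(candidates, candidates[0], horizon=200)
```

In this toy setting the posterior weight tends to concentrate on the data-generating parameter as observations accumulate. The sketch makes no claim about the regret rate stated in the abstract.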

Bibliographic record

Source: IEEE Transactions on Automatic Control, 2017, No. 12, pp. 6415-6422 (8 pages)
Author: Michael Jong Kim
Original format: PDF
Language: English
Keywords: Aerospace electronics; Convergence; Markov processes; Entropy; Bayes methods; Dynamic programming; Stochastic processes
