Randomized allocation with nonparametric estimation for contextual multi-armed bandits with delayed rewards

Arya Sakshi; Yang Yuhong

Abstract

We study a multi-armed bandit problem with covariates in a setting where there is a possible delay in observing the rewards. Under some reasonable assumptions on the probability distributions for the delays and using an appropriate randomization to select the arms, the proposed strategy is shown to be strongly consistent. (C) 2020 Elsevier B.V. All rights reserved.
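To make the setting concrete, here is a minimal simulation sketch of a covariate bandit with delayed rewards. It is not the authors' strategy: the equal-width histogram bins, the decaying exploration probability `t ** -0.5`, the uniform delay distribution, and the two reward functions are all assumptions chosen for illustration, and every name in it is hypothetical.

```python
import random
from collections import defaultdict


def run_delayed_histogram_bandit(n_rounds=2000, n_arms=2, n_bins=5,
                                 max_delay=10, seed=0):
    """Simulate randomized allocation with per-bin averages and delayed rewards."""
    rng = random.Random(seed)

    def mean_reward(arm, x):
        # Assumed (hypothetical) mean reward functions of the covariate x in [0, 1].
        return x if arm % 2 == 0 else 1.0 - x

    sums = [[0.0] * n_bins for _ in range(n_arms)]    # observed reward totals
    counts = [[0] * n_bins for _ in range(n_arms)]    # observed reward counts
    pulls = [0] * n_arms                              # pulls, observed or not
    pending = defaultdict(list)                       # arrival round -> rewards

    for t in range(1, n_rounds + 1):
        # Rewards whose delay has elapsed become available to the estimator.
        for arm, b, r in pending.pop(t, []):
            sums[arm][b] += r
            counts[arm][b] += 1

        x = rng.random()                              # covariate for this round
        b = min(int(x * n_bins), n_bins - 1)          # histogram bin of x

        # Histogram estimate: average observed reward of each arm in this bin
        # (0.0 when nothing has been observed yet, an arbitrary default).
        est = [sums[a][b] / counts[a][b] if counts[a][b] else 0.0
               for a in range(n_arms)]
        greedy = max(range(n_arms), key=lambda a: est[a])

        # Randomization: explore uniformly with a decaying probability.
        eps = min(1.0, t ** -0.5)
        arm = rng.randrange(n_arms) if rng.random() < eps else greedy

        reward = mean_reward(arm, x) + rng.gauss(0.0, 0.1)
        delay = rng.randint(0, max_delay)             # assumed delay distribution
        pending[t + 1 + delay].append((arm, b, reward))
        pulls[arm] += 1

    return counts, sums, pulls


counts, sums, pulls = run_delayed_histogram_bandit()
print("pulls per arm:", pulls)
print("observed rewards:", sum(sum(row) for row in counts))
```

Rewards still pending when the simulation ends are never observed, which is why the observed count can fall slightly below the number of rounds.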

Bibliographic details

Source: Statistics & Probability Letters, 2020, Issue 1, 9 pages
Authors: Arya Sakshi; Yang Yuhong
Affiliation (both authors): School of Statistics, University of Minnesota, Ford Hall, Church St SE, Minneapolis, MN 55455, USA
Original format: PDF
Language: English
Chinese Library Classification: Probability theory
Keywords: Multi-armed bandit with covariates; Delayed rewards; Histogram method; Strong consistency

Similar literature

1. Randomized allocation with nonparametric estimation for contextual multi-armed bandits with delayed rewards [J]. Arya Sakshi, Yang Yuhong. Statistics & Probability Letters. 2020, Issue 1.
2. Randomized allocation with nonparametric estimation for a multi-armed bandit problem with covariates [J]. Yang YH, Zhu D. The Annals of Statistics. 2002, Issue 1.
3. A numerical analysis of allocation strategies for the multi-armed bandit problem under delayed rewards conditions in digital campaign management [J]. Martin Miguel, Jimenez-Martin Antonio, Mateos Alfonso. Neurocomputing. 2019, Oct. 21 issue.
4. Contextual Multi-armed Bandit Algorithm for Semiparametric Reward Model [C]. Gi-Soo Kim, Myunghee Cho Paik. International Conference on Machine Learning. 2019.
5. Contextual Bandits with Delayed Feedback Using Randomized Allocation [D]. Arya, Sakshi. 2020.
6. Smoking and the bandit: A preliminary study of smoker and non-smoker differences in exploratory behavior measured with a multi-armed bandit task [O]. Merideth A. Addicott, John M. Pearson, Jessica Wilson.
7. The Multi-Armed Bandit Problem under Delayed Rewards Conditions in Digital Campaign Management [O]. M. Martin, A. Jimenez-Martin, A. Mateos. 2019.
