Performance Evaluation Review

Adaptive Discretization for Episodic Reinforcement Learning in Metric Spaces



Abstract

We present an efficient algorithm for model-free episodic reinforcement learning on large (potentially continuous) state-action spaces. Our algorithm is based on a novel Q-learning policy with adaptive, data-driven discretization. The central idea is to maintain a finer partition of the state-action space in regions that are frequently visited in historical trajectories and have higher payoff estimates. We demonstrate how our adaptive partitions take advantage of the shape of the optimal Q-function and the joint space without sacrificing worst-case performance. In particular, we recover the regret guarantees of prior algorithms for continuous state-action spaces, which additionally require an optimal discretization as input, access to a simulation oracle, or both. Moreover, experiments demonstrate how our algorithm automatically adapts to the underlying structure of the problem, resulting in much better performance than both heuristics and Q-learning with uniform discretization.
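The abstract describes the method only at a high level. As a rough illustration of the central idea, the following minimal Python sketch runs optimistic Q-learning over a tree of dyadic cells covering a toy one-dimensional state-action space [0, 1] x [0, 1], splitting a cell once its visit count reaches the inverse of its squared radius, so frequently visited, high-payoff regions end up with a finer partition. The environment, horizon, bonus scale, step-size rule, and splitting threshold here are illustrative assumptions in the spirit of the paper, not its exact specification.

import random

# Illustrative constants (assumptions, not from the paper).
H = 5            # episode horizon
EPISODES = 2000
BONUS = 2.0      # exploration-bonus scale

class Ball:
    """A dyadic cell of the joint state-action space [0, 1] x [0, 1]."""
    def __init__(self, s_lo, s_hi, a_lo, a_hi, q_init):
        self.s_lo, self.s_hi = s_lo, s_hi
        self.a_lo, self.a_hi = a_lo, a_hi
        self.n = 0           # visit count
        self.q = q_init      # optimistic Q estimate
        self.children = []

    def radius(self):
        return (self.s_hi - self.s_lo) / 2.0

    def contains_state(self, s):
        return self.s_lo <= s <= self.s_hi

    def split(self):
        # Refine into four half-width children that inherit this cell's estimate.
        sm = (self.s_lo + self.s_hi) / 2.0
        am = (self.a_lo + self.a_hi) / 2.0
        self.children = [Ball(slo, shi, alo, ahi, self.q)
                         for slo, shi in ((self.s_lo, sm), (sm, self.s_hi))
                         for alo, ahi in ((self.a_lo, am), (am, self.a_hi))]

def relevant_leaves(ball, s, out):
    """Collect the finest cells whose state projection contains s."""
    if not ball.contains_state(s):
        return
    if not ball.children:
        out.append(ball)
    for child in ball.children:
        relevant_leaves(child, s, out)

def step_env(s, a):
    """Hypothetical toy dynamics: drift toward the action; reward peaks at s = 0.8."""
    s_next = min(1.0, max(0.0, 0.5 * s + 0.5 * a + random.uniform(-0.05, 0.05)))
    return s_next, max(0.0, 1.0 - abs(s_next - 0.8))

roots = [Ball(0.0, 1.0, 0.0, 1.0, float(H)) for _ in range(H)]  # one partition per step

for _ in range(EPISODES):
    s = random.random()
    for h in range(H):
        leaves = []
        relevant_leaves(roots[h], s, leaves)
        ball = max(leaves, key=lambda b: b.q)        # act greedily on the optimistic Q
        a = random.uniform(ball.a_lo, ball.a_hi)     # play any action inside the cell
        s_next, r = step_env(s, a)

        ball.n += 1
        alpha = (H + 1.0) / (H + ball.n)             # step size used in episodic Q-learning
        if h + 1 < H:
            nxt = []
            relevant_leaves(roots[h + 1], s_next, nxt)
            v_next = min(float(H), max(b.q for b in nxt))
        else:
            v_next = 0.0
        ball.q = (1 - alpha) * ball.q + alpha * (r + BONUS / ball.n ** 0.5 + v_next)

        # Adaptive refinement: split once visits reach the inverse-squared radius.
        if not ball.children and ball.n >= (1.0 / ball.radius()) ** 2:
            ball.split()
        s = s_next

Running the sketch, early episodes act on the coarse root cells, and the partition deepens only along the trajectories the greedy policy keeps revisiting, which is the adaptive behavior the abstract contrasts with uniform discretization.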
