Performance Evaluation Review

Adaptive Discretization for Episodic Reinforcement Learning in Metric Spaces



Abstract

We present an efficient algorithm for model-free episodic reinforcement learning on large (potentially continuous) state-action spaces. Our algorithm is based on a novel Q-learning policy with adaptive, data-driven discretization. The central idea is to maintain a finer partition of the state-action space in regions that are frequently visited in historical trajectories and have higher payoff estimates. We demonstrate how our adaptive partitions take advantage of the shape of the optimal Q-function and the joint space without sacrificing worst-case performance. In particular, we recover the regret guarantees of prior algorithms for continuous state-action spaces, which additionally require an optimal discretization as input, access to a simulation oracle, or both. Moreover, experiments demonstrate how our algorithm automatically adapts to the underlying structure of the problem, resulting in much better performance than both heuristics and Q-learning with uniform discretization.
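The abstract describes the method only at a high level. As a rough illustration of the central idea, the following minimal Python sketch runs optimistic Q-learning over a tree of dyadic cells covering a toy one-dimensional state-action space [0, 1] x [0, 1], splitting a cell once its visit count reaches the inverse of its squared radius, so frequently visited, high-payoff regions end up with a finer partition. The environment, horizon, bonus scale, step-size rule, and splitting threshold here are illustrative assumptions in the spirit of the paper, not its exact specification.

import random

# Illustrative constants (assumptions, not from the paper).
H = 5            # episode horizon
EPISODES = 2000
BONUS = 2.0      # exploration-bonus scale

class Ball:
    """A dyadic cell of the joint state-action space [0, 1] x [0, 1]."""
    def __init__(self, s_lo, s_hi, a_lo, a_hi, q_init):
        self.s_lo, self.s_hi = s_lo, s_hi
        self.a_lo, self.a_hi = a_lo, a_hi
        self.n = 0           # visit count
        self.q = q_init      # optimistic Q estimate
        self.children = []

    def radius(self):
        return (self.s_hi - self.s_lo) / 2.0

    def contains_state(self, s):
        return self.s_lo <= s <= self.s_hi

    def split(self):
        # Refine into four half-width children that inherit this cell's estimate.
        sm = (self.s_lo + self.s_hi) / 2.0
        am = (self.a_lo + self.a_hi) / 2.0
        self.children = [Ball(slo, shi, alo, ahi, self.q)
                         for slo, shi in ((self.s_lo, sm), (sm, self.s_hi))
                         for alo, ahi in ((self.a_lo, am), (am, self.a_hi))]

def relevant_leaves(ball, s, out):
    """Collect the finest cells whose state projection contains s."""
    if not ball.contains_state(s):
        return
    if not ball.children:
        out.append(ball)
    for child in ball.children:
        relevant_leaves(child, s, out)

def step_env(s, a):
    """Hypothetical toy dynamics: drift toward the action; reward peaks at s = 0.8."""
    s_next = min(1.0, max(0.0, 0.5 * s + 0.5 * a + random.uniform(-0.05, 0.05)))
    return s_next, max(0.0, 1.0 - abs(s_next - 0.8))

roots = [Ball(0.0, 1.0, 0.0, 1.0, float(H)) for _ in range(H)]  # one partition per step

for _ in range(EPISODES):
    s = random.random()
    for h in range(H):
        leaves = []
        relevant_leaves(roots[h], s, leaves)
        ball = max(leaves, key=lambda b: b.q)        # act greedily on the optimistic Q
        a = random.uniform(ball.a_lo, ball.a_hi)     # play any action inside the cell
        s_next, r = step_env(s, a)

        ball.n += 1
        alpha = (H + 1.0) / (H + ball.n)             # step size used in episodic Q-learning
        if h + 1 < H:
            nxt = []
            relevant_leaves(roots[h + 1], s_next, nxt)
            v_next = min(float(H), max(b.q for b in nxt))
        else:
            v_next = 0.0
        ball.q = (1 - alpha) * ball.q + alpha * (r + BONUS / ball.n ** 0.5 + v_next)

        # Adaptive refinement: split once visits reach the inverse-squared radius.
        if not ball.children and ball.n >= (1.0 / ball.radius()) ** 2:
            ball.split()
        s = s_next

Running the sketch, early episodes act on the coarse root cells, and the partition deepens only along the trajectories the greedy policy keeps revisiting, which is the adaptive behavior the abstract contrasts with uniform discretization.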
