2013 IEEE Conference on Computational Intelligence in Games

Evolutionary feature evaluation for online Reinforcement Learning



Abstract

Most successful examples of Reinforcement Learning (RL) report the use of carefully designed features, that is, a representation of the problem state that facilitates effective learning. The best features cannot always be known in advance, creating the need to evaluate more features than will ultimately be chosen. This paper presents Temporal Difference Feature Evaluation (TDFE), a novel approach to the problem of feature evaluation in an online RL agent. TDFE combines value function learning by temporal difference methods with an evolutionary algorithm that searches the space of feature subsets, and outputs a ranking over all individual features. TDFE dynamically adjusts its ranking, avoids the sample complexity multiplier of many population-based approaches, and works with arbitrary feature representations. Online learning experiments are performed in the game of Connect Four, establishing (i) that the choice of features is critical, (ii) that TDFE can evaluate and rank all the available features online, and (iii) that the ranking can be used effectively as the basis of dynamic online feature selection.
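
The abstract describes TDFE only at a high level: temporal difference value learning combined with an evolutionary search over feature subsets, producing a ranking of individual features. The sketch below is a minimal illustration of that idea under stated assumptions, not the authors' algorithm: it evolves binary subset masks, scores each mask by the TD error of a linear value function restricted to that subset, and ranks features by how often they appear in the fitter half of the population. All names and constants (NUM_FEATURES, td_fitness, feature_ranking, toy_transitions, the learning rate and discount) are illustrative assumptions, and unlike TDFE as described, this naive version re-learns per subset, i.e. it pays exactly the sample complexity multiplier the paper says TDFE avoids.

```python
import random
import numpy as np

NUM_FEATURES = 20          # assumed size of the candidate feature set
POP_SIZE = 12              # assumed evolutionary population size
GENERATIONS = 30
ALPHA, GAMMA = 0.05, 0.99  # assumed TD learning rate and discount factor

def td_fitness(transitions, mask, n_steps=200):
    """Score a feature subset by the (negative) mean squared TD error of a
    linear value function trained on the masked features only."""
    w = np.zeros(NUM_FEATURES)
    errors = []
    for _ in range(n_steps):
        phi, reward, phi_next, done = transitions()       # assumed environment hook
        phi, phi_next = phi * mask, phi_next * mask       # restrict to the subset
        target = reward + (0.0 if done else GAMMA * w.dot(phi_next))
        delta = target - w.dot(phi)                       # TD(0) error
        w += ALPHA * delta * phi
        errors.append(delta ** 2)
    return -float(np.mean(errors))

def feature_ranking(transitions):
    """Evolve binary subset masks and rank features by how often they
    appear in the fitter half of the population."""
    pop = [np.random.randint(0, 2, NUM_FEATURES) for _ in range(POP_SIZE)]
    credit = np.zeros(NUM_FEATURES)
    for _ in range(GENERATIONS):
        scored = sorted(pop, key=lambda m: td_fitness(transitions, m), reverse=True)
        elite = scored[: POP_SIZE // 2]
        for m in elite:
            credit += m                                   # credit features in good subsets
        flips = lambda: (np.random.rand(NUM_FEATURES) < 0.1).astype(int)
        pop = elite + [m ^ flips() for m in elite]        # mutate elites to refill population
    return np.argsort(-credit)                            # feature indices, best first

# Toy stand-in for Connect Four transitions: random features, rewards, terminations.
def toy_transitions():
    return (np.random.rand(NUM_FEATURES), np.random.randn(),
            np.random.rand(NUM_FEATURES), random.random() < 0.05)

if __name__ == "__main__":
    print(feature_ranking(toy_transitions)[:5])           # five top-ranked feature indices
```

Ranking by accumulated membership in elite subsets is only one plausible way to turn subset-level fitness into per-feature scores; the paper's actual credit assignment and its online, single-stream variant are not specified in this abstract.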
