Unsupervised Cross-Domain Transfer in Policy Gradient Reinforcement Learning via Manifold Alignment

Abstract

The success of applying policy gradient reinforcement learning (RL) to difficult control tasks hinges crucially on the ability to determine a sensible initialization for the policy. Transfer learning methods tackle this problem by reusing knowledge gleaned from solving other related tasks. In the case of multiple task domains, these algorithms require an inter-task mapping to facilitate knowledge transfer across domains. However, there are currently no general methods to learn an inter-task mapping without requiring either background knowledge that is not typically present in RL settings, or an expensive analysis of an exponential number of inter-task mappings in the size of the state and action spaces. This paper introduces an autonomous framework that uses unsupervised manifold alignment to learn inter-task mappings and effectively transfer samples between different task domains. Empirical results on diverse dynamical systems, including an application to quadrotor control, demonstrate its effectiveness for cross-domain transfer in the context of policy gradient RL.

Bibliographic details

Source
AAAI Conference on Artificial Intelligence | 2015 | 7 pages
Authors
Haitham Bou Ammar; Eric Eaton; Paul Ruvolo; Matthew E. Taylor
Original format: PDF
Chinese Library Classification: TP18-53
