AAAI Conference on Artificial Intelligence

Unsupervised Cross-Domain Transfer in Policy Gradient Reinforcement Learning via Manifold Alignment

Abstract

The success of applying policy gradient reinforcement learning (RL) to difficult control tasks hinges crucially on the ability to determine a sensible initialization for the policy. Transfer learning methods tackle this problem by reusing knowledge gleaned from solving other related tasks. In the case of multiple task domains, these algorithms require an inter-task mapping to facilitate knowledge transfer across domains. However, there are currently no general methods to learn an inter-task mapping without requiring either background knowledge that is not typically present in RL settings, or an expensive analysis of an exponential number of inter-task mappings in the size of the state and action spaces. This paper introduces an autonomous framework that uses unsupervised manifold alignment to learn inter-task mappings and effectively transfer samples between different task domains. Empirical results on diverse dynamical systems, including an application to quadrotor control, demonstrate its effectiveness for cross-domain transfer in the context of policy gradient RL.
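To make the idea of an inter-task mapping concrete, the sketch below fits a linear map between the state spaces of two tasks and uses it to transfer samples. It is only an illustration and not the paper's algorithm: it assumes matched source/target state pairs are already available and uses ordinary least squares, whereas the framework described above learns the mapping without supervision via manifold alignment. The names `fit_linear_map` and `transfer_samples`, the dimensions, and the synthetic data are all hypothetical.

```python
import numpy as np


def fit_linear_map(source_states, target_states):
    """Least-squares matrix W such that source_states @ W approximates target_states.

    Assumption: row i of source_states corresponds to row i of target_states.
    The paper's setting is unsupervised, so such pairs would not be given.
    """
    W, *_ = np.linalg.lstsq(source_states, target_states, rcond=None)
    return W


def transfer_samples(source_states, W):
    """Map source-task states into the target task's state space."""
    return source_states @ W


# Synthetic example: a 4-D source state space and a 2-D target state space.
rng = np.random.default_rng(0)
true_map = rng.normal(size=(4, 2))   # hypothetical ground-truth relation
source = rng.normal(size=(50, 4))    # 50 sampled source states
target = source @ true_map           # their (assumed known) target counterparts

W = fit_linear_map(source, target)
transferred = transfer_samples(source, W)
```

In a transfer setting, samples mapped this way could seed the initial policy search in the target task; how the transferred samples are actually used is specified in the paper, not here.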
