Journal: Adaptive Behavior

A robust policy bootstrapping algorithm for multi-objective reinforcement learning in non-stationary environments


Abstract

Multi-objective Markov decision processes are a special kind of multi-objective optimization problem that involves sequential decision making while satisfying the Markov property of stochastic processes. Multi-objective reinforcement learning methods address this kind of problem by fusing the reinforcement learning paradigm with multi-objective optimization techniques. One major drawback of these methods is the lack of adaptability to non-stationary dynamics in the environment. This is because they adopt optimization procedures that assume stationarity in order to evolve a coverage set of policies that can solve the problem. This article introduces a developmental optimization approach that can evolve the policy coverage set while exploring the preference space over the defined objectives in an online manner. We propose a novel multi-objective reinforcement learning algorithm that can robustly evolve a convex coverage set of policies in an online manner in non-stationary environments. We compare the proposed algorithm with two state-of-the-art multi-objective reinforcement learning algorithms in stationary and non-stationary environments. Results showed that the proposed algorithm significantly outperforms the existing algorithms in non-stationary environments while achieving comparable results in stationary environments.
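To make the term "convex coverage set" concrete, here is a minimal sketch of the concept only, not the algorithm proposed in the article. It assumes two objectives, linear scalarization of preferences, and a small hand-made table of policy value vectors; the function name `approximate_ccs` and the data in `policy_values` are hypothetical. A policy belongs to the convex coverage set if it is optimal for at least one preference weight, so the sketch sweeps a grid of weights and keeps every policy that wins somewhere.

```python
import numpy as np


def approximate_ccs(values, num_weights=101):
    """Indices of value vectors that are best for at least one sampled weight.

    Assumes exactly two objectives and linear scalarization w . v.
    The result is an approximation because the weights are sampled on a grid.
    """
    values = np.asarray(values, dtype=float)      # shape (num_policies, 2)
    # Preference weights (w, 1 - w) covering the two-objective simplex.
    w1 = np.linspace(0.0, 1.0, num_weights)
    weights = np.stack([w1, 1.0 - w1], axis=1)    # shape (num_weights, 2)
    # Scalarized return of every policy under every weight.
    scalarized = weights @ values.T               # shape (num_weights, num_policies)
    best = np.argmax(scalarized, axis=1)          # winning policy per weight
    return sorted(set(best.tolist()))


# Hypothetical expected returns of five policies on two objectives.
policy_values = [
    [1.0, 0.0],   # best when objective 1 matters most
    [0.0, 1.0],   # best when objective 2 matters most
    [0.6, 0.6],   # best for balanced preferences
    [0.3, 0.3],   # Pareto-dominated by [0.6, 0.6]
    [0.8, 0.2],   # Pareto-optimal, but never best under any linear weight
]

print(approximate_ccs(policy_values))  # → [0, 1, 2]
```

The last policy shows why the convex coverage set can be smaller than the Pareto front: it is not dominated by any single policy, yet no linear preference ever selects it. In a non-stationary environment the value vectors themselves drift over time, which is why a set computed once under a stationarity assumption can become stale.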
