Journal: Systems, Man and Cybernetics, IEEE Transactions on

Value Function Discovery in Markov Decision Processes With Evolutionary Algorithms


Abstract

In this paper, we introduce a novel method for the discovery of value functions for Markov decision processes (MDPs). This method, which we call value function discovery (VFD), is based on ideas from the evolutionary algorithm field. VFD's key feature is that it discovers descriptions of value functions that are algebraic in nature. This feature is unique because the descriptions include the model parameters of the MDP. The algebraic expression of the value function discovered by VFD can be used in several scenarios, e.g., conversion to a policy (with one-step policy improvement) or control of systems with time-varying parameters. The work in this paper is a first step toward exploring potential usage scenarios of discovered value functions. We give a detailed description of VFD and illustrate its application on an example MDP. For this MDP, we let VFD discover an algebraic description of a value function that closely resembles the optimal value function. The discovered value function is then used to obtain a policy, which we compare numerically to the optimal policy of the MDP. The resulting policy shows near-optimal performance on a wide range of model parameters. Finally, we identify and discuss future application scenarios of discovered value functions.
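
The abstract mentions converting a discovered value function into a policy with one-step policy improvement. The following is a minimal sketch of that step only, not the paper's implementation. It assumes a finite MDP with discounted rewards to be maximized (the paper's example MDP may use a different criterion, such as cost minimization), and the names `one_step_policy_improvement`, `toy_transitions`, and `toy_value` are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

State = int
Action = str
# transitions[(s, a)] is a list of (probability, next_state, reward) triples.
Transitions = Dict[Tuple[State, Action], List[Tuple[float, State, float]]]


def one_step_policy_improvement(
    states: List[State],
    actions: List[Action],
    transitions: Transitions,
    value: Callable[[State], float],
    gamma: float = 0.9,
) -> Dict[State, Optional[Action]]:
    """Return the policy that is greedy with respect to `value`."""
    policy: Dict[State, Optional[Action]] = {}
    for s in states:
        best_action: Optional[Action] = None
        best_q = float("-inf")
        for a in actions:
            outcomes = transitions.get((s, a))
            if not outcomes:
                continue  # action not available in this state
            # Expected one-step reward plus discounted value of the next state.
            q = sum(p * (r + gamma * value(s2)) for p, s2, r in outcomes)
            if q > best_q:
                best_action, best_q = a, q
        policy[s] = best_action
    return policy


# Toy two-state MDP (an assumption for illustration, not the paper's example).
toy_transitions: Transitions = {
    (0, "stay"): [(1.0, 0, 0.0)],
    (0, "move"): [(1.0, 1, 1.0)],
    (1, "stay"): [(1.0, 1, 2.0)],
    (1, "move"): [(1.0, 0, 0.0)],
}


def toy_value(s: State) -> float:
    # Stand-in for an algebraic value function; here simply linear in the state.
    return 10.0 * s


toy_policy = one_step_policy_improvement(
    [0, 1], ["stay", "move"], toy_transitions, toy_value
)
print(toy_policy)
```

Because the value function is passed in as a plain callable, an algebraic expression that also depends on model parameters could be wrapped in a closure and re-evaluated whenever those parameters change, which is one way to read the time-varying-parameter scenario mentioned in the abstract.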
