LIFELONG LEARNING WITH A CHANGING ACTION SET

Abstract

Systems and methods are described for a decision-making process that includes an increasing set of actions. A policy function is computed for a Markov decision process (MDP) that models the decision-making process, wherein the policy function is computed based on a state conditional function mapping states into an embedding space, an inverse dynamics function mapping state transitions into the embedding space, and an action selection function mapping elements of the embedding space to actions. An additional set of actions is then identified in the increasing set of actions, the inverse dynamics function is updated based at least in part on the additional set of actions, the policy function is updated based on the updated inverse dynamics function and on parameters learned while computing the policy function, and an action is selected based on the updated policy function.
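
The decomposition named in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the learned functions are replaced by hand-coded stand-ins on a hypothetical one-dimensional task, and every name here (`EmbeddingPolicy`, `add_actions`, the action labels, the goal value) is an assumption made for illustration. The state conditional function returns the desired displacement toward a goal, the inverse dynamics function embeds a transition as its observed displacement, and the action selection function picks the known action whose embedding is nearest.

```python
from dataclasses import dataclass, field


def inverse_dynamics(state: float, next_state: float) -> float:
    """Toy stand-in: embed a state transition as its displacement."""
    return next_state - state


@dataclass
class EmbeddingPolicy:
    goal: float  # hypothetical task parameter, kept fixed when actions are added
    # One embedding per known action, estimated from observed transitions.
    action_embeddings: dict = field(default_factory=dict)

    def state_conditional(self, state: float) -> float:
        """Map a state into the embedding space (here: desired displacement)."""
        return self.goal - state

    def add_actions(self, transitions):
        """Estimate embeddings from (state, action, next_state) samples.

        Each action in the samples gets the mean embedding of its transitions;
        an action seen before is overwritten with the mean of the new samples.
        """
        sums, counts = {}, {}
        for state, action, next_state in transitions:
            e = inverse_dynamics(state, next_state)
            sums[action] = sums.get(action, 0.0) + e
            counts[action] = counts.get(action, 0) + 1
        for action, total in sums.items():
            self.action_embeddings[action] = total / counts[action]

    def action_selection(self, embedding: float) -> str:
        """Map an embedding to the known action with the nearest embedding."""
        return min(
            self.action_embeddings,
            key=lambda a: abs(self.action_embeddings[a] - embedding),
        )

    def select_action(self, state: float) -> str:
        return self.action_selection(self.state_conditional(state))


policy = EmbeddingPolicy(goal=10.0)
policy.add_actions([(0.0, "step", 1.0), (3.0, "step", 4.0), (5.0, "back", 4.0)])
first_choice = policy.select_action(0.0)  # chosen among "step" and "back"

# An additional action becomes available. Only the action embeddings change;
# the state conditional function (and its goal parameter) is reused as-is.
policy.add_actions([(0.0, "jump", 5.0), (2.0, "jump", 7.0)])
second_choice = policy.select_action(0.0)
```

In this sketch the policy prefers the new action as soon as its embedding lies closer to the desired displacement, without any change to the state conditional function. A real system would presumably learn all three functions from data rather than fix them by hand.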

Bibliographic data

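  • Publication number: US2021089958A1
  • Publication date: 2021-03-25
  • Original format: PDF
  • Applicant/assignee: ADOBE INC.
  • Application number: US201916578913
  • Inventors: GEORGIOS THEOCHAROUS; YASH CHANDAK
  • Filing date: 2019-09-23
  • Classification: G06N20; G06N5/04; G06N7
  • Country: US
  • Database entry time: 2022-08-24 17:54:18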
