
MAXIMUM ENTROPY REGULARISED MULTI-GOAL REINFORCEMENT LEARNING


Abstract

The present invention relates to a computer-implemented method of training artificial intelligence (AI) systems or agents (Maximum Entropy Regularised multi-goal Reinforcement Learning), in particular an AI system/agent for controlling a technical system. A prioritised sampling distribution q(ôg) is constructed with a higher entropy H_q(ôg) than the distribution p(ôg) of goal state trajectories ôg. By sampling the goal state trajectories ôg from the prioritised sampling distribution q(ôg), the AI system/agent is trained to achieve unseen goals by learning uniformly from diverse achieved goal states.
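The idea of sampling from a distribution q with higher entropy than p can be illustrated with a minimal sketch. The abstract does not say how q is constructed, so the tempering rule below (q proportional to p raised to a power alpha below 1) is only an assumption chosen because it never lowers the entropy of a discrete distribution; the function names `entropy`, `prioritised_distribution` and `sample_trajectories`, the parameter `alpha`, and the example values of `p` are all hypothetical.

```python
import numpy as np


def entropy(dist):
    """Shannon entropy (in nats) of a discrete distribution."""
    dist = np.asarray(dist, dtype=float)
    nonzero = dist[dist > 0]
    return float(-np.sum(nonzero * np.log(nonzero)))


def prioritised_distribution(p, alpha=0.5):
    """Flatten p by tempering: q_i is proportional to p_i ** alpha.

    Assumption for illustration only, not the construction claimed in the
    patent. For 0 <= alpha < 1 this gives rare trajectories more weight,
    so the entropy of q is at least the entropy of p.
    """
    p = np.asarray(p, dtype=float)
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    # Entries with zero probability under p stay at zero under q.
    weights = np.where(p > 0, p ** alpha, 0.0)
    return weights / weights.sum()


def sample_trajectories(q, batch_size, rng):
    """Draw indices of goal state trajectories according to q."""
    return rng.choice(len(q), size=batch_size, p=q)


# Hypothetical density of four goal state trajectories in a replay buffer.
p = np.array([0.70, 0.20, 0.07, 0.03])
q = prioritised_distribution(p, alpha=0.5)

rng = np.random.default_rng(0)
batch = sample_trajectories(q, batch_size=8, rng=rng)

print("entropy of p:", entropy(p))
print("entropy of q:", entropy(q))
print("sampled trajectory indices:", batch)
```

Smaller values of `alpha` move q closer to the uniform distribution, so rarely achieved goal states are replayed more often; `alpha = 0` samples all trajectories with non-zero density uniformly.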

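Bibliographic data

  • Publication number: US2020334565A1
  • Publication date: 2020-10-22
  • Original format: PDF
  • Applicant/Assignee: SIEMENS AKTIENGESELLSCHAFT
  • Application number: US201916385209
  • Inventors: VOLKER TRESP; RUI ZHAO
  • Filing date: 2019-04-16
  • Classification: G06N20; G06N3/08; G06N5/04
  • Country: US
  • Database entry time: 2022-08-21 11:24:36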
