Machine learning device, robot system and machine learning method for learning a movement of a robot that is involved in a task jointly performed by a human and a robot

Abstract

A robot system comprising: a machine learning device (2) for learning a movement of a robot (3) that is involved in a task jointly performed by a human (1) and the robot (3), the machine learning device comprising: - a state monitoring unit (21) that monitors a state variable indicating the state of the robot (3) while the human (1) and the robot (3) work together and perform a task, - a reward calculation unit (22) that calculates a reward based on control data for controlling the robot (3), the state variable, and an action of the human (1), and - a value function updating unit (23) that updates an action value function for controlling a movement of the robot (3) based on the reward and the state variable; - the robot (3), which performs a task together with the human (1); - a robot control unit (30) that controls a movement of the robot (3); and - a task intention recognition unit (51) that receives an output of a camera (44), a force sensor (45, 45a, 45b), a touch sensor (41), a microphone (42) and an input device (43) and recognizes an intention relating to the task. The machine learning device (2) learns a movement of the robot (3) by analyzing a distribution of feature points or workpieces (W) after the human (1) and the robot (3) have worked together and performed the task. The state variable input into the state monitoring unit (21) of the machine learning device (2) comprises an output of the task intention recognition unit (51). The task intention recognition unit (51) converts a positive reward based on an action of the human (1) into a state variable established for the positive reward and outputs that state variable to the state monitoring unit (21), and converts a negative reward based on an action of the human (1) into a state variable established for the negative reward and outputs that state variable to the state monitoring unit (21). The reward calculation unit (22) calculates the reward by adding a second reward, based on the action of the human (1), to a first reward, based on the control data and the state variable.
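
The reward and update steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract says only that a second, human-based reward is added to a first reward and that an action value function is updated from the reward and the state variable. The tabular Q-learning rule, the feedback values of +1.0 and -1.0, the learning parameters, and all names (`total_reward`, `ValueFunctionUpdater`, the example states and actions) are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical second-reward values for human feedback. The abstract only
# says that positive and negative rewards based on the human's action exist.
HUMAN_FEEDBACK_REWARD = {"positive": 1.0, "negative": -1.0, None: 0.0}


def total_reward(first_reward, human_feedback):
    """Add the second (human-based) reward to the first reward.

    first_reward stands for the reward derived from the control data and
    the state variable; human_feedback is "positive", "negative" or None.
    """
    return first_reward + HUMAN_FEEDBACK_REWARD[human_feedback]


class ValueFunctionUpdater:
    """Tabular action value function with a Q-learning update (assumed rule)."""

    def __init__(self, actions, alpha=0.5, gamma=0.9):
        self.actions = list(actions)
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)
        self.q = defaultdict(float)  # (state, action) -> value, default 0.0

    def update(self, state, action, reward, next_state):
        """Move Q(state, action) toward reward + gamma * max Q(next_state, .)."""
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
        return self.q[(state, action)]


if __name__ == "__main__":
    updater = ValueFunctionUpdater(actions=["slow", "fast"])
    # The state pairs a robot state with a recognized intention, mirroring
    # the idea that the intention recognition output is part of the state.
    state = ("holding_workpiece", "human_approves")
    next_state = ("workpiece_placed", "no_signal")
    reward = total_reward(first_reward=0.2, human_feedback="positive")
    print(updater.update(state, "fast", reward, next_state))
```

Keeping the human-based term separate from the first reward makes it easy to see how feedback from the human shifts the learned values without changing the reward derived from the robot's own data.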

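Bibliographic details

  • Publication number: DE102017007729B4

  • Publication date: 2021-09-16

  • Original format: PDF

  • Applicant/patentee: FANUC CORPORATION

  • Application number: DE20171007729

  • Inventors: SHUNICHI OZAKI; HIROJI NISHI

  • Filing date: 2017-08-16

  • Classification: B25J9/22

  • Country: DE

  • Added to database: 2022-08-24 21:06:11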
