Reinforcement learning methods, reinforcement learning devices and reinforcement learning programs for efficient learning
Abstract
PROBLEM TO BE SOLVED: To provide a reinforcement learning method that learns efficiently by performing representation learning specialized for states in which an attention situation, such as an environment reset, occurs. SOLUTION: In a reinforcement learning method that optimizes an agent's behavioral policy from the results of learning with a learning device that learns based on states observed from environmental data, it is determined, while the environmental data of one episode is being learned, whether a state in which a preset attention situation has occurred has been observed. When such a state is observed, the feature extractor (first learner) performs representation learning on two pieces of environmental data: the environmental data of the state in which the attention situation occurred and the environmental data at the immediately preceding time. The state classifier (second learner) then learns from the difference between the resulting feature data, and the parameters of the first learner and the second learner are updated based on the output estimation data and the actual data. [Selection diagram] Fig. 3
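The update described in the abstract can be illustrated with a minimal NumPy sketch. It is not the patented implementation; everything beyond the abstract is an assumption made for illustration: the class name `AttentionRepresentationLearner`, the single-layer tanh feature extractor, the logistic state classifier, the binary cross-entropy loss, the toy environment in which a "reset" jumps back to a start state near the origin, and the use of ordinary transitions as negative examples.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class AttentionRepresentationLearner:
    """Hypothetical sketch: feature extractor (first learner) + state classifier (second learner)."""

    def __init__(self, obs_dim, feat_dim, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.1, size=(feat_dim, obs_dim))  # first learner
        self.v = rng.normal(scale=0.1, size=feat_dim)             # second learner
        self.b = 0.0
        self.lr = lr

    def features(self, obs):
        return np.tanh(self.W @ obs)

    def predict(self, prev_obs, obs):
        # The classifier only sees the difference between the two feature vectors.
        diff = self.features(obs) - self.features(prev_obs)
        return sigmoid(self.v @ diff + self.b)

    def update(self, prev_obs, obs, attention_occurred):
        """One gradient step on both learners; returns the cross-entropy loss."""
        y = float(attention_occurred)
        f_prev, f_cur = self.features(prev_obs), self.features(obs)
        diff = f_cur - f_prev
        p = sigmoid(self.v @ diff + self.b)
        err = p - y  # gradient of the loss w.r.t. the classifier logit
        # Backpropagate through tanh into the shared extractor weights.
        grad_W = err * (np.outer(self.v * (1.0 - f_cur**2), obs)
                        - np.outer(self.v * (1.0 - f_prev**2), prev_obs))
        grad_v = err * diff
        self.W -= self.lr * grad_W
        self.v -= self.lr * grad_v
        self.b -= self.lr * err
        eps = 1e-12
        return -(y * np.log(p + eps) + (1.0 - y) * np.log(1.0 - p + eps))


def sample_transition(rng, obs_dim, attention):
    """Toy data (assumed): a reset returns to the origin, otherwise the state drifts slightly."""
    prev_obs = rng.uniform(1.0, 2.0, size=obs_dim)
    noise = 0.05 * rng.normal(size=obs_dim)
    obs = noise if attention else prev_obs + noise
    return prev_obs, obs


rng = np.random.default_rng(1)
obs_dim = 4
learner = AttentionRepresentationLearner(obs_dim=obs_dim, feat_dim=8)

losses = []
for step in range(2000):
    attention = bool(rng.integers(0, 2))  # was the preset attention situation observed?
    prev_obs, obs = sample_transition(rng, obs_dim, attention)
    losses.append(learner.update(prev_obs, obs, attention))

early_loss = float(np.mean(losses[:100]))
late_loss = float(np.mean(losses[-100:]))
p_reset = float(learner.predict(*sample_transition(rng, obs_dim, True)))
p_normal = float(learner.predict(*sample_transition(rng, obs_dim, False)))
print(early_loss, late_loss, p_reset, p_normal)
```

In this sketch the "output estimation data" is the classifier's predicted probability and the "actual data" is the observed attention flag; the abstract does not specify the loss or the network architecture, so a real system would likely substitute deeper networks and its own training signal.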