
A Semi-Markov Decision Model With Inverse Reinforcement Learning for Recognizing the Destination of a Maneuvering Agent in Real Time Strategy Games


Abstract

Recognizing the destination of a maneuvering agent is important for creating intelligent AI players in Real Time Strategy (RTS) games. Among the different problem formulations, goal recognition can be cast as a model-based planning problem and solved with off-the-shelf planners. However, a common problem with these frameworks is that they usually do not model action duration, whereas in real-world scenarios the agent may take several steps to transition between grid cells. To solve this problem, a semi-Markov decision model (SMDM), which explicitly models the duration of an action, is proposed in this paper. In addition, most current works do not establish a behavioral model of the agent being identified, and almost none model individual behavioral preferences, which limits the accuracy of the recognition results. In this paper, the Inverse Reinforcement Learning (IRL) method is adopted to learn opponent behavior for the destination recognition problem. To adapt to the dynamic environment, the Maximum Entropy Inverse Reinforcement Learning (MaxEnt IRL) method is modified by defining a Fitness index to measure the effect of the reward weights and using the Nelder-Mead polyhedron search to find the optimal weights. In experiments, we build the game scenario in the Unreal Engine 4 environment and collect movement trajectories from human players in several different tasks to evaluate the performance of our methods. The results show that the recognizer using IRL can recognize the destination effectively even if the intention changes midway, and it performs better than other models on several of the most frequently used metrics.
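The abstract describes the weight search but not the exact form of the Fitness index or the SMDM. The following is a minimal sketch, under assumed definitions, of how linear reward weights could be fit with a Nelder-Mead polyhedron search in SciPy; the trajectory format, `feature_fn`, and the softmax action likelihood standing in for the paper's Fitness index are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize


def trajectory_fitness(weights, trajectories, feature_fn):
    """Hypothetical fitness index for a linear reward w . phi(s, a).

    Each trajectory is assumed to be a list of (state, action, candidate_actions)
    tuples, where candidate_actions is the list of actions available in state.
    A MaxEnt-style softmax likelihood of the observed actions is used here as a
    stand-in for the paper's (unspecified) Fitness index.
    """
    total_log_lik = 0.0
    for traj in trajectories:
        for state, action, candidate_actions in traj:
            scores = np.array([weights @ feature_fn(state, a) for a in candidate_actions])
            scores -= scores.max()                       # numerical stability
            log_probs = scores - np.log(np.exp(scores).sum())
            total_log_lik += log_probs[candidate_actions.index(action)]
    return total_log_lik


def learn_reward_weights(trajectories, feature_fn, n_features):
    """Search for reward weights with the Nelder-Mead simplex method,
    as the abstract describes, instead of gradient-based MaxEnt IRL."""
    objective = lambda w: -trajectory_fitness(w, trajectories, feature_fn)
    result = minimize(objective, x0=np.zeros(n_features), method="Nelder-Mead")
    return result.x
```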