
On Learning From Game Annotations

Abstract

Most of the research in the area of evaluation function learning is focused on self-play. However, in many domains, such as Chess, expert feedback is amply available in the form of annotated games. This feedback usually comes as qualitative information, because human annotators find it hard to assign precise utility values to game states. The goal of this work is to investigate to what extent such qualitative feedback can be leveraged for learning an evaluation function for the game. To this end, we show how game annotations can be translated into preference statements over moves and game states, which in turn can be used to learn a utility function that respects these preference constraints. We evaluate the resulting function by creating multiple heuristics based on differently sized subsets of the training data and comparing them in a tournament scenario. The results show that learning from game annotations is possible, but our learned functions did not quite reach the performance of the original, manually tuned evaluation function of the Chess program. The reason appears to be that human annotators only annotate “interesting” positions, which makes it hard to learn basic information, such as the value of material advantage, from game annotations alone.
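
To make the pipeline described in the abstract concrete, below is a minimal sketch of its two steps: translating move annotations into preference pairs, and fitting a linear utility function to those pairs via the standard pairwise-ranking reduction to binary classification. The function names, the feature representation, and the mapping from annotation symbols (NAGs) to preferences are illustrative assumptions, not taken from the paper.

```python
# Sketch only: extract_features and the NAG mapping below are hypothetical,
# not the authors' implementation.
import numpy as np
from sklearn.svm import LinearSVC

# Common chess annotation symbols interpreted as qualitative judgments:
# "!" (good move) suggests the played move is preferred over alternatives,
# "?" (mistake) suggests the alternatives are preferred over the played move.
GOOD_MOVE_NAGS = {"!", "!!"}
BAD_MOVE_NAGS = {"?", "??", "?!"}

def preference_pairs(annotated_moves, extract_features):
    """Turn annotated moves into (preferred, dispreferred) feature pairs.

    annotated_moves: iterable of (position, played_move, alternatives, nag)
    extract_features: (position, move) -> np.ndarray feature vector
    """
    for pos, move, alternatives, nag in annotated_moves:
        played = extract_features(pos, move)
        for alt in alternatives:
            other = extract_features(pos, alt)
            if nag in GOOD_MOVE_NAGS:
                yield played, other   # played move preferred over alternative
            elif nag in BAD_MOVE_NAGS:
                yield other, played   # alternative preferred over played move

def learn_utility(pairs):
    """Learn a linear utility w such that w . better > w . worse.

    Each preference pair is reduced to a binary classification example on
    the feature difference (the usual SVM-rank reduction).
    """
    X, y = [], []
    for better, worse in pairs:
        X.append(better - worse); y.append(+1)
        X.append(worse - better); y.append(-1)
    model = LinearSVC(fit_intercept=False).fit(np.array(X), np.array(y))
    return model.coef_.ravel()  # weights of the learned evaluation function
```

The SVM-rank reduction shown here is one common way to fit a utility function to pairwise preference constraints; the learner actually used in the paper may differ.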