Journal: IEEE Transactions on Computational Intelligence and AI in Games

Preference Learning for Move Prediction and Evaluation Function Approximation in Othello



Abstract

This paper investigates the use of preference learning as an approach to move prediction and evaluation function approximation, using the game of Othello as a test domain. Using the same sets of features, we compare our approach with least squares temporal difference learning, direct classification, and with the Bradley–Terry model, fitted using minorization–maximization (MM). The results show that the exact way in which preference learning is applied is critical to achieving high performance. Best results were obtained using a combination of board inversion and pair-wise preference learning. This combination significantly outperformed the others under test, both in terms of move prediction accuracy, and in the level of play achieved when using the learned evaluation function as a move selector during game play.
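The pair-wise preference-learning setup described in the abstract can be sketched as follows. Each training position contributes pairs of feature vectors: the board reached by the expert's move (preferred) versus a board reached by a legal alternative, and a linear evaluation is fitted so the preferred board scores higher, using a logistic (Bradley–Terry-style) loss on the difference vector. Everything here is an illustrative assumption, not the paper's actual Othello features, data, or training pipeline; the data are synthetic and labels come from a hidden weight vector.

```python
import math
import random

random.seed(0)

N_FEATURES = 8  # illustrative feature dimension, not the paper's feature set
true_w = [random.gauss(0, 1) for _ in range(N_FEATURES)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# One pair per synthetic "position": features of the board after the
# preferred move versus after one alternative. Labels are generated from
# the hidden weight vector true_w so the data are consistent.
pairs = []
for _ in range(400):
    a = [random.gauss(0, 1) for _ in range(N_FEATURES)]
    b = [random.gauss(0, 1) for _ in range(N_FEATURES)]
    pairs.append((a, b) if dot(a, true_w) > dot(b, true_w) else (b, a))

def fit_pairwise(pairs, lr=0.1, epochs=50):
    """Fit w by maximizing the log-likelihood of
    P(preferred beats alternative) = sigmoid(w . (x_pref - x_alt))."""
    w = [0.0] * N_FEATURES
    for _ in range(epochs):
        for x_pref, x_alt in pairs:
            d = [p - q for p, q in zip(x_pref, x_alt)]
            grad = 1.0 - sigmoid(dot(d, w))   # ascend the log-likelihood
            for i in range(N_FEATURES):
                w[i] += lr * grad * d[i]
    return w

w = fit_pairwise(pairs)
# Fraction of training pairs where the learned evaluation ranks the
# preferred board above the alternative (move-prediction accuracy).
accuracy = sum(dot(a, w) > dot(b, w) for a, b in pairs) / len(pairs)
```

At play time the learned `w` would act as a move selector: score each legal move's resulting board and pick the maximum. The board-inversion idea from the abstract is not modeled here; it would correspond to always computing features from the perspective of the player to move.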

