Published in: IEEE Transactions on Computational Intelligence and AI in Games

DeepQA Jeopardy! Gamification: A Machine-Learning Perspective



Abstract

DeepQA is a large-scale natural language processing (NLP) question-and-answer system that answers questions over a broad range of structured and unstructured data, using hundreds of analytics combined with more than 50 models trained through machine learning. After the historic 2011 milestone of defeating the two best human players on the Jeopardy! game show, DeepQA, the technology behind IBM Watson, is undergoing gamification into real-world business problems. Gamifying a business domain for Watson is a composite of functional, content, and training adaptation for nongame play. During domain gamification for medical, financial, government, or any other business, each system change affects the machine-learning process. In contrast to the original Watson Jeopardy! system, whose class distribution of positive-to-negative labels is 1:100, the training instances computed during adaptation (question-and-answer pairs transformed into true/false labels) have a very low positive-to-negative ratio of 1:100,000. Such extreme initial class imbalance during domain gamification poses a major challenge for the Watson machine-learning pipelines. The combination of ingested corpus sets, question-and-answer pairs, configuration settings, and NLP algorithms contributes to this challenging data state. We propose several data engineering techniques, such as answer key vetting and expansion, source ingestion, class oversampling, and question set modifications, to increase the number of computed true labels. In addition, algorithm engineering, such as an implementation of Newton–Raphson logistic regression with a regularization term, relaxes the constraints of class imbalance during training adaptation. We conclude by empirically demonstrating that data engineering and algorithm engineering are complementary and indispensable for overcoming the challenges in this first Watson gamification for real-world business problems.
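The abstract names Newton–Raphson logistic regression with a regularization term but gives no implementation details. As a rough illustration only, the sketch below fits a binary logistic regression by Newton–Raphson updates with an L2 penalty. The choice of an L2 penalty, the unpenalized bias, the function name `fit_logistic_newton`, and the parameters `l2`, `max_iter`, and `tol` are all assumptions made for this example; they are not taken from the paper or from the Watson pipeline.

```python
import numpy as np


def fit_logistic_newton(X, y, l2=1.0, max_iter=50, tol=1e-8):
    """Fit binary logistic regression by Newton-Raphson with an L2 penalty.

    Hypothetical helper for illustration. X has shape (n, d), y holds 0/1
    labels. Returns a weight vector of length d + 1 (bias first).
    """
    n, d = X.shape
    Xb = np.hstack([np.ones((n, 1)), X])  # prepend a bias column
    w = np.zeros(d + 1)

    # Penalty matrix: L2 on every weight except the bias (an assumption).
    R = l2 * np.eye(d + 1)
    R[0, 0] = 0.0

    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-(Xb @ w)))   # predicted probabilities
        grad = Xb.T @ (p - y) + R @ w          # gradient of penalized loss
        s = p * (1.0 - p)                      # per-instance curvature
        H = Xb.T @ (Xb * s[:, None]) + R       # Hessian of penalized loss
        step = np.linalg.solve(H, grad)        # Newton step: H^{-1} grad
        w -= step
        if np.max(np.abs(step)) < tol:
            break
    return w
```

The penalty term keeps the Hessian well conditioned when positive labels are rare, which is one plausible reading of how regularization eases training under class imbalance; the paper itself may use a different formulation.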
