SYSTEMS AND METHODS FOR INTELLIGENTLY CURATING MACHINE LEARNING TRAINING DATA AND IMPROVING MACHINE LEARNING MODEL PERFORMANCE

Abstract

Systems and methods for intelligently forming and acquiring machine learning training data for implementing an artificially intelligent dialogue system include: constructing a machine learning test corpus that comprises a plurality of historical queries and commands sampled from production logs of a deployed dialogue system; configuring training data sourcing parameters to source a corpus of raw machine learning training data from remote sources of machine learning training data; calculating efficacy metrics of the corpus of raw machine learning training data, wherein calculating the efficacy metrics includes calculating one or more of a coverage metric value and a diversity metric value of the corpus; and using the corpus of raw machine learning training data to train at least one machine learning classifier if the calculated coverage metric value satisfies a minimum coverage metric threshold.
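The gating step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how the coverage or diversity metrics are computed, so the definitions here are assumptions chosen only for illustration: coverage is taken as the fraction of test-corpus tokens that also occur in the raw training data, and diversity as the ratio of distinct tokens to total tokens in the training data. The function names, the whitespace tokenizer, and the sample threshold are likewise hypothetical.

```python
def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; any tokenizer could be used)."""
    return text.lower().split()


def coverage_metric(training_corpus, test_corpus):
    """Assumed definition: share of test-corpus token occurrences
    whose token also appears somewhere in the training corpus."""
    train_vocab = {tok for utt in training_corpus for tok in tokenize(utt)}
    test_tokens = [tok for utt in test_corpus for tok in tokenize(utt)]
    if not test_tokens:
        return 0.0
    covered = sum(1 for tok in test_tokens if tok in train_vocab)
    return covered / len(test_tokens)


def diversity_metric(training_corpus):
    """Assumed definition: distinct tokens divided by total tokens
    (a type-token ratio) over the training corpus."""
    tokens = [tok for utt in training_corpus for tok in tokenize(utt)]
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def should_train(training_corpus, test_corpus, min_coverage):
    """Gate training on the coverage threshold, as the abstract describes."""
    return coverage_metric(training_corpus, test_corpus) >= min_coverage


if __name__ == "__main__":
    # Hypothetical utterances standing in for production-log samples
    # (test corpus) and remotely sourced raw training data.
    test_corpus = ["what is my balance", "transfer money to savings"]
    raw_training = ["show my balance", "what is my account balance", "transfer money"]

    print("coverage:", coverage_metric(raw_training, test_corpus))
    print("diversity:", diversity_metric(raw_training))
    print("train?", should_train(raw_training, test_corpus, min_coverage=0.7))
```

In this sketch only the coverage value gates training, matching the condition stated in the abstract; the diversity value is computed alongside it as a second efficacy metric but is not used in the decision.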
