CLASSIFYING BUSINESS SUMMARIES AGAINST A HIERARCHICAL INDUSTRY CLASSIFICATION STRUCTURE USING SUPERVISED MACHINE LEARNING

Abstract

A classification system is provided for classifying text-based business summaries, referred to herein as “summaries,” against a hierarchical industry classification structure. The classification system includes a word-based sub classifier that uses a neural network to generate a vector space for each summary in a training set, where each summary in the training set is known to correspond to a particular industry classification in the hierarchical industry classification structure. Weight values in the hidden layer of a neural network used by the word-based sub classifier are changed to improve the predictive capabilities of the neural network in the business summary classification context. Embodiments include increasing representation in the training set for underrepresented parent industry classifications and attributes of the hierarchical industry classification structure, such as distances between industry classifications and whether industry classifications are in the same subgraph. The completion of training of the word-based sub classifier is based upon whether a performance metric, such as an hF1 score, satisfies one or more early stopping criteria. The classification system also includes a category-based sub classifier and a meta classifier.
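
The abstract names an hF1 score as the performance metric behind the early stopping criteria but does not define it. As a minimal sketch, assuming the commonly used definition of hierarchical F1 (each true and predicted classification is expanded with its ancestors before precision and recall are computed), such a check might look like the following. The toy hierarchy `PARENT`, the function names, and the `patience` and `min_delta` parameters are illustrative assumptions, not details taken from the patent.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy hierarchy: child code -> parent code (None marks a root).
PARENT: Dict[str, Optional[str]] = {
    "11": None,
    "111": "11",
    "1111": "111",
    "1112": "111",
    "112": "11",
    "21": None,
    "211": "21",
}


def with_ancestors(code: str) -> Set[str]:
    """Return the classification code together with all of its ancestors."""
    nodes: Set[str] = set()
    current: Optional[str] = code
    while current is not None:
        nodes.add(current)
        current = PARENT[current]
    return nodes


def hierarchical_f1(true_codes: List[str], predicted_codes: List[str]) -> float:
    """Micro-averaged hierarchical F1 over ancestor-augmented label sets."""
    overlap = true_total = pred_total = 0
    for true_code, pred_code in zip(true_codes, predicted_codes):
        true_set = with_ancestors(true_code)
        pred_set = with_ancestors(pred_code)
        overlap += len(true_set & pred_set)
        true_total += len(true_set)
        pred_total += len(pred_set)
    if overlap == 0:
        return 0.0
    precision = overlap / pred_total
    recall = overlap / true_total
    return 2 * precision * recall / (precision + recall)


def should_stop(history: List[float], patience: int = 3,
                min_delta: float = 1e-3) -> bool:
    """Assumed criterion: stop when the last `patience` epochs fail to beat
    the best earlier validation hF1 by at least `min_delta`."""
    if len(history) <= patience:
        return False
    best_before = max(history[:-patience])
    return max(history[-patience:]) < best_before + min_delta
```

In this sketch a prediction that lands in the correct parent classification but the wrong leaf still earns partial credit, which is the property that makes a hierarchical metric a natural fit for a hierarchical industry classification structure.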

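Bibliographic data

  • Publication number: US2021064956A1

  • Patent type:

  • Publication date: 2021-03-04

  • Original format: PDF

  • Applicant/assignee: THE DUN & BRADSTREET CORPORATION

  • Application number: US201916559963

  • Inventor: NIKITA ZHILTSOV

  • Filing date: 2019-09-04

  • Classification codes: G06N3/04; G06K9/62; G06F17/18; G06F17/16

  • Country: US

  • Date added to database: 2022-08-24 17:29:54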
