METHODS AND SYSTEMS FOR IDENTIFYING A LEVEL OF SIMILARITY BETWEEN A FILTERING CRITERION AND A DATA ITEM WITHIN A SET OF STREAMED DOCUMENTS

Abstract

A method enables identification of a similarity level between a user-provided data item and a data item within a set of data documents. The method includes a representation generator determining, for each term in an enumeration of terms, occurrence information. The representation generator generates, for each term, a sparse distributed representation (SDR) using the occurrence information. The method includes receiving, by a filtering module, a filtering criterion including at least one of a security-based term or a brand-based term. The method includes generating, by the representation generator, for the filtering criterion, at least one SDR. The method includes generating, by the representation generator, for a first of a plurality of streamed documents received from a data source, a compound SDR. The method includes determining, by a similarity engine, a distance between the filtering criterion SDR and the compound SDR. The method includes acting on the document, based upon the distance. FIG. 15
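The steps in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not say how occurrence information is turned into an SDR, how a compound SDR is formed, or which distance measure is used. Here an SDR is assumed to be a set of active bit positions, each bit is assumed to correspond to one reference document in which a term occurs, a compound SDR is assumed to keep the most frequently set bits across a text's terms, and the distance is assumed to be the Jaccard distance. The names `term_sdrs`, `compound_sdr`, `distance`, `filter_stream`, `max_bits`, and `threshold` are all hypothetical.

```python
from collections import Counter

def term_sdrs(reference_docs):
    """Occurrence information: map each term to the set of reference-document
    positions it occurs in (assumed here to be the term's SDR)."""
    sdrs = {}
    for position, text in enumerate(reference_docs):
        for term in set(text.lower().split()):
            sdrs.setdefault(term, set()).add(position)
    return sdrs

def compound_sdr(text, sdrs, max_bits=3):
    """Combine the SDRs of the known terms in a text, keeping at most
    max_bits of the most frequently set bits (an assumed sparsification)."""
    counts = Counter()
    for term in text.lower().split():
        counts.update(sdrs.get(term, ()))
    # Rank by count (descending), then by bit index so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {bit for bit, _ in ranked[:max_bits]}

def distance(a, b):
    """Jaccard distance between two sets of active bits (assumed measure)."""
    if not a and not b:
        return 1.0  # no information on either side: treat as maximally distant
    return 1.0 - len(a & b) / len(a | b)

def filter_stream(stream, criterion, sdrs, threshold=0.5):
    """Act on each streamed document: yield it only if it is close enough
    to the filtering criterion."""
    criterion_sdr = compound_sdr(criterion, sdrs)
    for doc in stream:
        if distance(criterion_sdr, compound_sdr(doc, sdrs)) <= threshold:
            yield doc

# Hypothetical reference collection and stream, for illustration only.
reference_docs = [
    "malware attack breach",
    "phishing attack password",
    "recipe pasta tomato",
    "pasta sauce basil",
]
sdrs = term_sdrs(reference_docs)
stream = ["new phishing attack", "tomato pasta sauce"]
matches = list(filter_stream(stream, "attack", sdrs))
```

With the security-based criterion "attack", only the first streamed document shares active bits with the criterion SDR, so it is the only one passed through. A real system would act on the distance in whatever way the application requires (forwarding, flagging, or discarding the document).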

Bibliographic data

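  • Publication number: IN201847001938A
  • Patent type:
  • Publication date: 2018-04-06
  • Original format: PDF
  • Applicant/assignee:
  • Application number: IN201847001938
  • Inventor: DE SOUSA WEBBER FRANCISCO
  • Filing date: 2018-01-17
  • Classification: G06F17/30
  • Country: IN
  • Date indexed: 2022-08-21 12:51:44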
