
Identification and Rejection of Meaningless Input During Natural Language Classification


Abstract

A method for identifying data that is meaningless and generating a natural language statistical model which can reject meaningless input. The method can include identifying unigrams that are individually meaningless from a set of training data. At least a portion of the unigrams identified as being meaningless can be assigned to a first n-gram class. The method also can include identifying bigrams that are entirely composed of meaningless unigrams and determining whether the identified bigrams are individually meaningless. At least a portion of the bigrams identified as being individually meaningless can be assigned to the first n-gram class.
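The steps in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say how an n-gram is judged meaningless, so the two predicates are supplied by the caller. The names `assign_meaningless_ngrams` and `MEANINGLESS_CLASS` and the toy training data are hypothetical and do not come from the patent.

```python
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical label standing in for the "first n-gram class" of the abstract.
MEANINGLESS_CLASS = "MEANINGLESS"


def assign_meaningless_ngrams(
    sentences: Iterable[str],
    is_meaningless_unigram: Callable[[str], bool],
    is_meaningless_bigram: Callable[[Tuple[str, str]], bool],
) -> Dict[Tuple[str, ...], str]:
    """Map meaningless unigrams and bigrams to one shared n-gram class.

    Both predicates are assumptions of this sketch; the abstract leaves
    the meaninglessness test unspecified.
    """
    tokenized = [s.lower().split() for s in sentences]

    # Step 1: unigrams that are individually meaningless.
    unigrams = {tok for sent in tokenized for tok in sent}
    bad_unigrams = {u for u in unigrams if is_meaningless_unigram(u)}

    # Step 2: candidate bigrams composed entirely of meaningless unigrams.
    candidates = {
        (a, b)
        for sent in tokenized
        for a, b in zip(sent, sent[1:])
        if a in bad_unigrams and b in bad_unigrams
    }

    # Step 3: keep only the candidates that are themselves meaningless.
    bad_bigrams = {bg for bg in candidates if is_meaningless_bigram(bg)}

    classes: Dict[Tuple[str, ...], str] = {}
    for u in bad_unigrams:
        classes[(u,)] = MEANINGLESS_CLASS
    for bg in bad_bigrams:
        classes[bg] = MEANINGLESS_CLASS
    return classes


# Toy data, invented for illustration only.
training = ["um uh I want this one", "uh um check my balance"]
fillers = {"um", "uh", "this", "one"}
meaningful_pairs = {("this", "one")}  # meaningless words, meaningful pair

result = assign_meaningless_ngrams(
    training,
    is_meaningless_unigram=lambda w: w in fillers,
    is_meaningless_bigram=lambda bg: bg not in meaningful_pairs,
)
```

In this toy run, `("um", "uh")` and `("uh", "um")` land in the meaningless class, while `("this", "one")` is left out even though both of its words are individually meaningless. That is the reason for the separate bigram check in step 3.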

Bibliographic details

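  • Publication number: US2007244692A1

  • Patent type: (not stated)

  • Publication date: 2007-10-18

  • Original format: PDF

  • Applicant/assignee: RAJESH BALCHANDRAN; LINDA BOYER

  • Application number: US20060279577

  • Inventors: RAJESH BALCHANDRAN; LINDA BOYER

  • Filing date: 2006-04-13

  • Classification: G06F17/27

  • Country: US

  • Date indexed: 2022-08-21 21:06:47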
