Methods, apparatus and computer program products for information retrieval and document classification utilizing a multidimensional subspace

Abstract

Methods, apparatus and computer program products are provided for retrieving information from a text data collection and for classifying a document into none, one or more of a plurality of predefined classes. In each aspect, a representation of at least a portion of the original matrix is projected into a lower dimensional subspace, and those portions of the subspace representation that relate to the term(s) of the query are weighted after the projection. To retrieve the documents most relevant to a query, the documents are then scored, with better-scoring documents generally being of greater relevance. Alternatively, to classify a document, the relationship of the document to each class of documents is scored, and the document is then classified into those classes, if any, that have the best scores.
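
The retrieval flow described in the abstract can be illustrated with a minimal sketch. It is not the patented method: it assumes the "original matrix" is a term-by-document count matrix and swaps in a truncated SVD as the projection into a lower dimensional subspace, since the abstract does not name the decomposition. The function name `score_documents`, the `query_weights` mapping, and the toy matrix are all hypothetical, chosen only for illustration.

```python
import numpy as np


def score_documents(A, query_weights, k):
    """Score documents against a weighted query in a k-dimensional subspace.

    A             : (terms x documents) matrix (assumed term-by-document counts)
    query_weights : dict mapping term row index -> weight (hypothetical format)
    k             : dimension of the subspace
    """
    # Stand-in projection: truncated SVD (an assumption, not the patent's method).
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    term_rep = U[:, :k] * s[:k]   # (terms x k) term representation in the subspace
    doc_rep = Vt[:k, :]           # (k x documents) document representation

    idx = np.array(list(query_weights.keys()))
    w = np.array(list(query_weights.values()), dtype=float)

    # Weights are applied AFTER the projection, and only to the rows of the
    # subspace representation that correspond to the query terms.
    weighted_rows = w[:, None] * term_rep[idx, :]   # (query terms x k)

    # One score per document: combine the weighted rows, then compare with
    # each document's subspace coordinates.
    return weighted_rows.sum(axis=0) @ doc_rep


# Toy collection: terms 0-1 occur only in documents 0-1, terms 2-3 only in 2-3.
A = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 2],
], dtype=float)

scores = score_documents(A, {0: 1.0, 1: 0.5}, k=2)
ranking = np.argsort(-scores)   # document indices, best score first
```

In this toy case the query uses only terms 0 and 1, so documents 0 and 1 rank ahead of documents 2 and 3. Under the same assumptions, classification could be sketched analogously by scoring a document against one representation per class and keeping the classes, if any, with the best scores.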
