Journal: Information Processing & Management

Construction of weak and strong similarity measures for ordered sets of documents using fuzzy set techniques

Abstract

Ordered sets of documents are increasingly common in information distribution systems, such as information retrieval systems. Classical similarity measures for ordinary sets of documents therefore need to be extended to these ordered sets. This paper does so using fuzzy set techniques. First, a general similarity measure is developed that contains both the classical strong similarity measures, such as Jaccard, Dice and Cosine, and the classical weak similarity measures, such as Recall and Precision. These measures are then extended to the comparison of fuzzy sets of documents. Measuring the similarity of ordered sets of documents is a special case of this, in which the higher the rank number of a document (that is, the lower it sits in the ordering), the lower its weight in the fuzzy set. Concrete forms of these similarity measures are presented. All of these measures are new, and those for weak similarity are the first of their kind (other strong similarity measures were given in a previous paper by Egghe and Michel). Some of the measures are then tested in the IR system Profil-Doc. The engine SPIRIT extracts ranked document sets in three different contexts, each for 600 requests. The practical usability of the ordered-set (OS) measures is then discussed on the basis of these experiments.
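The idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's construction: the abstract does not give the concrete formulas, so everything here is an assumption. In particular, the linear rank-to-weight mapping in `rank_weights`, the use of min/max for fuzzy intersection and union, and the sum of memberships as fuzzy cardinality are common textbook choices picked for illustration, and all function names are hypothetical.

```python
from math import sqrt

def rank_weights(ranked_docs):
    """Turn a ranked list into a fuzzy set (dict: doc -> membership).

    Assumed weighting (not from the paper): membership falls linearly
    from 1 at rank 1 to 1/n at rank n.
    """
    n = len(ranked_docs)
    return {doc: (n - i) / n for i, doc in enumerate(ranked_docs)}

def card(f):
    """Fuzzy cardinality, taken here as the sum of memberships."""
    return sum(f.values())

def inter(a, b):
    """Cardinality of the fuzzy intersection, using min per document."""
    return sum(min(a[d], b[d]) for d in a.keys() & b.keys())

def union(a, b):
    """Cardinality of the fuzzy union, using max per document."""
    return sum(max(a.get(d, 0.0), b.get(d, 0.0)) for d in a.keys() | b.keys())

def _ratio(num, den):
    # Guard against empty sets: define the similarity as 0 in that case.
    return num / den if den else 0.0

# Symmetric measures (the "strong" family named in the abstract).
def jaccard(a, b):
    return _ratio(inter(a, b), union(a, b))

def dice(a, b):
    return _ratio(2 * inter(a, b), card(a) + card(b))

def cosine(a, b):
    return _ratio(inter(a, b), sqrt(card(a) * card(b)))

# Directional measures (the "weak" family named in the abstract).
def precision(retrieved, relevant):
    return _ratio(inter(retrieved, relevant), card(retrieved))

def recall(retrieved, relevant):
    return _ratio(inter(retrieved, relevant), card(relevant))

if __name__ == "__main__":
    a = rank_weights(["d1", "d2", "d3"])
    b = rank_weights(["d2", "d1", "d4"])
    print(jaccard(a, b), dice(a, b), cosine(a, b))
    print(precision(a, b), recall(a, b))
```

With memberships restricted to 0 and 1, these functions reduce to the classical set-based Jaccard, Dice, Cosine, Precision and Recall, which is the sense in which the fuzzy versions extend the ordinary ones.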
