Foreign conference proceedings: Advances in data and web management

Using Link-Based Content Analysis to Measure Document Similarity Effectively


Abstract

With a massive amount of information being placed online, it is a challenge to exploit both the internal and external information of documents when assessing the similarity between them. A variety of approaches have been proposed to model document similarity on different foundations, but they usually cannot combine internal and external information. In this paper, we introduce a link-based method, built on random walks on graphs, into content analysis. By defining similarity as the meeting probability of two random surfers, we propose a computational model for content analysis that can also be integrated with the external information of documents. An empirical study shows that our method achieves good accuracy, acceptable performance, and a fast convergence rate in multi-relational document similarity measurement.
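
The abstract defines similarity as the meeting probability of two random surfers but does not give the paper's computational model. As a rough illustration of that general idea only, the sketch below implements naive SimRank, a well-known link-based measure with the same two-surfer interpretation; it is a stand-in, not the authors' method. The function name `simrank`, the `decay` and `iterations` parameters, and the toy graph are all assumptions made for this example.

```python
from itertools import product


def simrank(in_links, decay=0.8, iterations=10):
    """Naive SimRank (a stand-in, not the paper's model).

    in_links maps every node to the list of nodes that link to it;
    every node that appears in a list must also be a key.
    """
    nodes = list(in_links)
    # Start from the identity: a node is fully similar only to itself.
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, nodes)}
    for _ in range(iterations):
        new_sim = {}
        for a, b in product(nodes, nodes):
            if a == b:
                new_sim[(a, b)] = 1.0
                continue
            ia, ib = in_links[a], in_links[b]
            if not ia or not ib:
                # A node without in-links has nowhere to walk back to.
                new_sim[(a, b)] = 0.0
                continue
            # Average similarity over all pairs of in-neighbours, damped by decay.
            total = sum(sim[(x, y)] for x in ia for y in ib)
            new_sim[(a, b)] = decay * total / (len(ia) * len(ib))
        sim = new_sim
    return sim


# Hypothetical toy graph: d1 and d2 are both linked from the same hub page.
toy_in_links = {
    "hub": [],
    "d1": ["hub"],
    "d2": ["hub"],
    "d3": ["d1"],
}
scores = simrank(toy_in_links)
```

In this toy graph, `d1` and `d2` share their only in-neighbour, so two surfers walking backwards from them meet after one step and the pair scores highest among distinct nodes. The naive iteration compares all node pairs, so it is only practical for small graphs.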

Bibliographic record

  • Source
  • Conference location: Suzhou (CN)
  • Author affiliations

    Key Labs of Data Engineering and Knowledge Engineering, Ministry of Education, China School of Information, Renmin University of China, Beijing, China;

    Key Labs of Data Engineering and Knowledge Engineering, Ministry of Education, China School of Information, Renmin University of China, Beijing, China;

    Department of Management Science and Engineering, Tsinghua University, Beijing, China;

    Key Labs of Data Engineering and Knowledge Engineering, Ministry of Education, China School of Information, Renmin University of China, Beijing, China;

    Key Labs of Data Engineering and Knowledge Engineering, Ministry of Education, China School of Information, Renmin University of China, Beijing, China;

  • Conference organizer
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification: Computer networks;
  • Keywords

    link graph; content analysis; document similarity;
