Graph-based methods for natural language processing workshop 2016

Network Motifs May Improve Quality Assessment of Text Documents

Abstract

Motif analysis counts the number of small building blocks (the motifs) in a network and relates these statistical numbers to the inherent semantics of the network. In the realm of natural language processing, the networks are induced by texts. We demonstrate that motif analysis may help assess the quality of a document. More specifically, we consider the German Wikipedia and use the label "featured" as the (binary) quality criterion. The length (number of words) of an article is a comparatively good predictor for this label. We show that a well-designed combination of this criterion and motif statistics yields a significant improvement. We also found that a deeper look into the most relevant motifs may improve our understanding of quality.
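The counting step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the network is induced from a text or which motifs are counted, so everything here is an assumption made for illustration: words are linked when they appear next to each other, the graph is undirected, and only the two connected three-node motifs (triangles and open paths) are counted. The function names `build_word_graph`, `count_three_node_motifs`, and `features` are hypothetical and do not come from the paper.

```python
import re
from collections import defaultdict
from itertools import combinations


def build_word_graph(text):
    """Assumed construction: undirected graph linking adjacent words."""
    words = re.findall(r"\w+", text.lower())
    graph = defaultdict(set)
    for a, b in zip(words, words[1:]):
        if a != b:  # skip self-loops caused by immediately repeated words
            graph[a].add(b)
            graph[b].add(a)
    return words, dict(graph)


def count_three_node_motifs(graph):
    """Count the two connected 3-node motifs: triangles and open paths."""
    triangles = 0
    open_paths = 0
    for center, neighbours in graph.items():
        for u, v in combinations(sorted(neighbours), 2):
            if v in graph[u]:
                triangles += 1  # each triangle is seen once per corner
            else:
                open_paths += 1  # an open path is seen only from its center
    return {"triangle": triangles // 3, "open_path": open_paths}


def features(text):
    """Article length (word count) together with the motif counts."""
    words, graph = build_word_graph(text)
    return {"length": len(words), **count_three_node_motifs(graph)}


if __name__ == "__main__":
    print(features("a b c a"))
```

A feature dictionary of this kind could be passed to any binary classifier for the "featured" label. How the authors actually combine length with motif statistics is not stated in the abstract.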

Bibliographic record

  • Conference location: San Diego, CA (US)
  • Authors

    Thomas Arnold; Karsten Weihe;

  • Author affiliations

    Research Training Group AIPHES / Algorithmic Group, Department of Computer Science, Technische Universitaet Darmstadt, Hochschulstrasse 10, 64289 Darmstadt, Germany;

    Research Training Group AIPHES / Algorithmic Group, Department of Computer Science, Technische Universitaet Darmstadt, Hochschulstrasse 10, 64289 Darmstadt, Germany;

  • Original format: PDF
  • Text language: English
