
Adaptive online event detection in news streams


Abstract

Event detection aims to discover news documents that report on the same event and to arrange them in the same group. With the explosive growth of online news, event detection is needed to help users navigate news spaces. Existing work usually represents documents with the TF-IDF scheme and applies a clustering algorithm for event detection. However, the traditional TF-IDF vector representation suffers from high dimensionality and sparse semantics. In addition, as more news documents arrive, IDF needs to be updated incrementally. In this paper, we present a novel document representation method based on word embeddings, which reduces the dimension and alleviates the sparse semantics compared to TF-IDF, and thus improves efficiency and accuracy. Based on this document representation, we propose an adaptive online clustering method for online news event detection, which improves precision and recall by using time slicing and event merging, respectively. The resulting events are further refined by an adaptive post-processing step that automatically detects noisy events and processes them further. Experiments on standard and real-world datasets show that our proposed adaptive online event detection method significantly improves the performance of event detection in terms of both efficiency and accuracy compared to state-of-the-art methods. (C) 2017 Elsevier B.V. All rights reserved.
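The abstract does not say how word embeddings are combined into a document vector or how the online clustering is carried out, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a document vector is the plain average of its word embeddings, that a document joins the most similar recent event when cosine similarity reaches a threshold, and that a fixed time window stands in for time slicing. Event merging and the adaptive post-processing step are left out. All names (`doc_vector`, `OnlineEventDetector`, `sim_threshold`, `window`) are hypothetical.

```python
import math


def doc_vector(tokens, embeddings):
    """Average the embeddings of known tokens (an assumed representation)."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return None  # no known word, so no vector can be built
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class OnlineEventDetector:
    """Single-pass clustering over a document stream (illustrative only)."""

    def __init__(self, embeddings, sim_threshold=0.8, window=3):
        self.embeddings = embeddings
        self.sim_threshold = sim_threshold  # assumed fixed, not adaptive
        self.window = window                # max gap since an event was last seen
        self.events = []                    # dicts: centroid, docs, last_seen

    def add(self, doc_id, tokens, timestamp):
        """Assign a document to an event and return the event index."""
        vec = doc_vector(tokens, self.embeddings)
        if vec is None:
            return None
        best, best_sim = None, self.sim_threshold
        for idx, ev in enumerate(self.events):
            # Skip events outside the time window (stand-in for time slicing).
            if timestamp - ev["last_seen"] > self.window:
                continue
            sim = cosine(vec, ev["centroid"])
            if sim >= best_sim:
                best, best_sim = idx, sim
        if best is None:
            # No recent event is similar enough: start a new one.
            self.events.append(
                {"centroid": vec, "docs": [doc_id], "last_seen": timestamp}
            )
            return len(self.events) - 1
        ev = self.events[best]
        n = len(ev["docs"])
        # Update the centroid as a running mean of member document vectors.
        ev["centroid"] = [(c * n + x) / (n + 1) for c, x in zip(ev["centroid"], vec)]
        ev["docs"].append(doc_id)
        ev["last_seen"] = timestamp
        return best
```

Because the vector dimension is fixed by the embedding table rather than by the vocabulary, nothing like an IDF table has to be recomputed as new documents arrive, which is the practical point the abstract makes about TF-IDF.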

Bibliographic record

  • Source
    Knowledge-Based Systems | 2017, No. 15 | pp. 105-112 | 8 pages
  • Author affiliations

    Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China;

    Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China;

    Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China;

    Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China;

  • Original format: PDF
  • Language: English
  • Keywords

    Word embedding; Adaptive online clustering; Event detection;
