Conference: International Workshop on Future and Emergent Trends in Language Technology

Towards a Topic Discovery and Tracking System with Application to News Items

Abstract

The rapid proliferation of the World Wide Web has led to an enormous increase in the availability of textual corpora. In this paper, the problem of topic detection and tracking is considered with application to news items. The proposed approach explores two algorithms, Non-Negative Matrix Factorization (NMF) and a dynamic version of Latent Dirichlet Allocation (DLDA), over discrete time steps, and makes it possible to identify topics within storylines as they appear and to track them through time. Moreover, emphasis is given to visualization of, and interaction with, the results through the implementation of a graphical tool that works regardless of the approach used. Experimental analysis on the Reuters RCV1 corpus and the Reuters 2015 archive reveals that the explored approaches can be used effectively as tools for identifying topic appearances and their evolution, while at the same time allowing for efficient visualization.
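
To make the idea of topic discovery over discrete time steps more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a small document-term count matrix per time step, factorizes each one with plain NMF (multiplicative updates), and links topics across consecutive steps by cosine similarity. The linking rule, the threshold of 0.5, the toy vocabulary, and all function names are illustrative assumptions; the dynamic LDA variant is not shown.

```python
import random

def nmf(V, k, iters=200, seed=0):
    """Approximate a non-negative matrix V (docs x terms) as W @ H
    using multiplicative updates for the squared-error objective."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    eps = 1e-9  # avoids division by zero
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        WtV = [[sum(W[i][a] * V[i][j] for i in range(n)) for j in range(m)]
               for a in range(k)]
        WtW = [[sum(W[i][a] * W[i][b] for i in range(n)) for b in range(k)]
               for a in range(k)]
        H = [[H[a][j] * WtV[a][j]
              / (sum(WtW[a][b] * H[b][j] for b in range(k)) + eps)
              for j in range(m)] for a in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        VHt = [[sum(V[i][j] * H[a][j] for j in range(m)) for a in range(k)]
               for i in range(n)]
        HHt = [[sum(H[a][j] * H[b][j] for j in range(m)) for b in range(k)]
               for a in range(k)]
        W = [[W[i][a] * VHt[i][a]
              / (sum(W[i][b] * HHt[b][a] for b in range(k)) + eps)
              for a in range(k)] for i in range(n)]
    return W, H

def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0

def link_topics(H_prev, H_curr, threshold=0.5):
    """For each current topic, return the index of the most similar
    previous topic, or None if no similarity reaches the threshold
    (treated here as a newly appearing topic)."""
    links = []
    for h in H_curr:
        sims = [cosine(h, p) for p in H_prev]
        best = max(range(len(sims)), key=sims.__getitem__)
        links.append(best if sims[best] >= threshold else None)
    return links

# Toy data (hypothetical): rows are documents, columns follow `vocab`.
vocab = ["election", "vote", "oil", "price", "storm", "flood"]
V_step1 = [[3, 2, 0, 0, 0, 0], [2, 3, 0, 0, 0, 0],
           [0, 0, 3, 2, 0, 0], [0, 0, 2, 3, 0, 0]]
V_step2 = [[0, 0, 2, 3, 0, 0], [0, 0, 3, 3, 0, 0],
           [0, 0, 0, 0, 3, 2], [0, 0, 0, 0, 2, 3]]

_, H1 = nmf(V_step1, k=2)
_, H2 = nmf(V_step2, k=2)
links = link_topics(H1, H2)  # one entry per topic found in step 2
for t, prev in enumerate(links):
    top = sorted(range(len(vocab)), key=lambda j: -H2[t][j])[:2]
    status = "new topic" if prev is None else f"continues topic {prev}"
    print([vocab[j] for j in top], status)
```

Each row of `H` is a topic expressed as term weights, so comparing rows between consecutive steps is one simple way to decide whether a topic persists or has just appeared. A real system would more likely build TF-IDF matrices from the corpus and use a library factorization rather than this pure-Python loop.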