Journal: Information Processing & Management

Unsupervised adaptive microblog filtering for broad dynamic topics


Abstract

Information filtering has long been a major research task in information retrieval (IR), with a focus on filtering well-formed documents such as news articles. Recently, more interest has been directed towards applying filtering to user-generated content such as microblogs. Several earlier studies investigated microblog filtering for focused topics. Another vital filtering scenario in microblogs targets the detection of posts that are relevant to long-standing broad and dynamic topics, i.e., topics spanning several subtopics that change over time. This type of filtering is essential for many applications, such as social studies of large events and news tracking of temporal topics. In this paper, we introduce an adaptive microblog filtering task that focuses on tracking topics of a broad and dynamic nature. We propose an entirely unsupervised approach that adapts to new aspects of the topic to retrieve relevant microblogs. We evaluated our filtering approach using 6 broad topics, each tested on 4 different time periods over 4 months. Experimental results showed that, on average, our approach achieved an 84% increase in recall relative to the baseline approach, while maintaining acceptable precision, which dropped by about 8%. Our filtering method is currently implemented on TweetMogaz, a news portal generated from tweets. The website compiles the stream of Arabic tweets and detects the tweets relevant to different regions in the Middle East, presenting them as comprehensive reports that include top stories and news in each region.
