Conference: Natural language understanding and intelligent applications

Finding the True Crowds: User Filtering in Microblogs



Abstract

Users increasingly share their opinions about products, services, and policies on social media, which matters to manufacturers and governments that want to collect feedback from the crowd. In microblogs, however, information is highly unbalanced: many posts are published and spread by ghost-writers/spammers, sellers, official accounts, etc., and the information provided by the true crowds is frequently overwhelmed. Previous studies mostly focus on how to find one specific type of user; they do not investigate how to filter multiple types of users so as to keep only the true crowds, which is the main topic of this work. In this paper, we first present a categorization of four types of users, namely ghost-writers, sellers, official accounts, and end-users (the first three are referred to in the paper as advertisers in a broad sense), and study their characteristics. We then propose a Topic-Specific Divergence based model that filters out advertisers so that end-users are kept. Meta-information and content are investigated in a comparative analysis. Experimental results on a real dataset show that the proposed approach significantly outperforms state-of-the-art methods.

Bibliographic details

  • Source
  • Conference location: Kunming (CN)
  • Author affiliations

    Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, Beijing, China;

    Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, Beijing, China;

    Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, Beijing, China;

    Samsung R&D Institute, Beijing, China;

    Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, Beijing, China;

    Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, Beijing, China;

    Samsung R&D Institute, Beijing, China;

    Samsung R&D Institute, Beijing, China;

  • Conference organizer
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    User filtering; Topic-Specific Divergence;

