Conference: Mexican International Conference on Artificial Intelligence

Prediction of User Retweets Based on Social Neighborhood Information and Topic Modelling

Abstract

Twitter and other social networks have become a fundamental source of information and a powerful tool for spreading ideas and opinions. A crucial step in understanding the mechanisms that drive information diffusion on Twitter is to study how a user's social neighborhood influences the formation of her retweeting preferences and, in particular, to what extent the preferences of a user can be predicted from the preferences of her neighborhood. We build our own sample graph of Twitter users and study the problem of predicting a given user's retweets from the retweeting behavior in her second-degree social neighborhood (the accounts she follows and the accounts followed by those accounts). We train and evaluate user-centered binary classification models that predict retweets with an average F1 score of 87.6%, based purely on social information, that is, without analyzing the content of the tweets. For users who score poorly with these models (on a tuning dataset), we improve the results by adding features extracted from the content of tweets. To do so, we apply a Natural Language Processing (NLP) pipeline that includes a Twitter-specific adaptation of the Latent Dirichlet Allocation (LDA) probabilistic topic model.
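The purely social setup described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the feature encoding, the classifier, or the data layout, so everything here is an assumption for illustration: the follow graph is a hypothetical dictionary, each tweet is encoded as one binary feature per neighbor (1 if that neighbor retweeted it), and the function names are invented. Only the second-degree neighborhood definition and the F1 metric come from the text.

```python
from typing import Dict, List, Set


def second_degree_neighborhood(follows: Dict[str, Set[str]], user: str) -> Set[str]:
    """Accounts the user follows, plus accounts followed by those accounts."""
    followed = follows.get(user, set())
    followed_by_followed: Set[str] = set()
    for account in followed:
        followed_by_followed |= follows.get(account, set())
    # The user herself is not part of her own neighborhood.
    return (followed | followed_by_followed) - {user}


def social_features(retweeters: Set[str], neighbors: List[str]) -> List[int]:
    """Assumed encoding: one binary feature per neighbor, 1 if she retweeted the tweet."""
    return [1 if n in retweeters else 0 for n in neighbors]


def f1_score(y_true: List[int], y_pred: List[int]) -> float:
    """F1 = harmonic mean of precision and recall for the positive (retweet) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy follow graph: key follows every account in the value set.
follows = {
    "alice": {"bob", "carol"},
    "bob": {"dave"},
    "carol": {"alice", "erin"},
}

neighbors = sorted(second_degree_neighborhood(follows, "alice"))
# neighbors is ["bob", "carol", "dave", "erin"]

# A tweet retweeted by bob and erin (and by zed, who is outside the neighborhood).
features = social_features({"bob", "erin", "zed"}, neighbors)
# features is [1, 0, 0, 1]
```

In a user-centered setting, one such feature vector would be built per candidate tweet and a separate binary classifier fitted per user, with `f1_score` computed on held-out tweets for that user. The choice of classifier is left open here because the abstract does not name one.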
