EPJ Data Science

Early detection of promoted campaigns on social media


Abstract

Social media expose millions of users every day to information campaigns, some emerging organically from grassroots activity, others sustained by advertising or other coordinated efforts. These campaigns contribute to the shaping of collective opinions. While most information campaigns are benign, some may be deployed for nefarious purposes, including terrorist propaganda, political astroturf, and financial market manipulation. It is therefore important to be able to detect whether a meme is being artificially promoted at the very moment it becomes wildly popular. This problem has important social implications and poses numerous technical challenges. As a first step, here we focus on discriminating between trending memes that are either organic or promoted by means of advertisement. The classification is not trivial: ads cause bursts of attention that can be easily mistaken for those of organic trends. We designed a machine learning framework to classify memes that have been labeled as trending on Twitter. After trending, we can rely on a large volume of activity data. Early detection, occurring immediately at trending time, is a more challenging problem because only a minimal volume of activity data is available prior to trending. Our supervised learning framework exploits hundreds of time-varying features to capture changing network and diffusion patterns, content and sentiment information, timing signals, and user metadata. We explore different methods for encoding feature time series. Using millions of tweets containing trending hashtags, we achieve an AUC score of 75% for early detection, increasing to above 95% after trending. We evaluate the robustness of the algorithms by introducing random temporal shifts on the trend time series. Feature selection analysis reveals that content cues provide consistently useful signals; user features are more informative for early detection, while network and timing features are more helpful once more data is available.
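
The abstract reports its results as AUC scores. As a reminder of what that metric measures, here is a minimal sketch in plain Python: AUC is the probability that a randomly chosen positive example (here, a promoted trend) receives a higher classifier score than a randomly chosen negative one (an organic trend). The function name `auc_score` and the toy labels and scores are illustrative assumptions, not the authors' code or data.

```python
def auc_score(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up data: 1 = promoted trend, 0 = organic trend.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(auc_score(labels, scores))  # 0.75 (3 of 4 pairs ranked correctly)
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration; a rank-based computation would be the usual choice on large datasets.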
