Journal of Intelligent Information Systems

Change point detection for burst analysis from an observed information diffusion sequence of tweets


Abstract

We propose a method for detecting the period in which a burst of information diffusion took place from observed diffusion sequence data over a social network, and we report the results of applying it to real Twitter data. We assume a generic information diffusion model in which the time delay associated with diffusion follows an exponential distribution and a burst is directly reflected in changes in the time-delay parameter of that distribution. The shape of the parameter's change is approximated by a step function, and the problem of detecting the change points and finding the parameter values is formulated as an optimization problem that maximizes the likelihood of generating the observed diffusion sequence. The time complexity of the search is almost proportional to the number of observed data points, and the search has been shown to be very efficient. We first demonstrate on synthetic data that the proposed method can detect a burst and that it performs better than one of the representative state-of-the-art methods, confirming that the proposed method covers a wider range of change patterns. We then extend the evaluation on synthetic data to show that the method is efficient and effective in comparison with a naive exhaustive search and a simple greedy method. Finally, we apply the method to real Twitter data from the 2011 Tohoku earthquake and tsunami and reconfirm its efficiency and effectiveness. Two interesting findings emerge. First, a burst period detected by the proposed method tends to contain a large number of homogeneous tweets on a specific topic, even if the observed diffusion sequence consists of heterogeneous tweets on various topics. Second, assuming the information diffusion path to be a line-shaped tree can give a good approximation of the maximum likelihood estimator when the actual diffusion path is not known.
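
To make the likelihood formulation concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a line-shaped diffusion path, so that each delay is simply the gap between consecutive tweet timestamps, and it looks for only a single change point by exhaustive scan rather than the multi-step, near-linear search described in the abstract. The function name `best_single_change_point` and its return convention are hypothetical.

```python
import math


def best_single_change_point(times):
    """Scan for one change point in the rate of exponential delays.

    Assumptions (illustrative only, not from the paper):
    - `times` is a sorted list of tweet timestamps;
    - the diffusion path is line-shaped, so delays are consecutive gaps;
    - the rate is a step function with at most one step.
    Returns (k, gain): gaps[:k] and gaps[k:] get separate rates, and `gain`
    is the log-likelihood improvement over a single constant rate.
    Returns (None, 0.0) if no split improves the likelihood.
    """
    gaps = [b - a for a, b in zip(times, times[1:])]
    n = len(gaps)

    # Prefix sums give the total delay of any segment in O(1).
    prefix = [0.0]
    for g in gaps:
        prefix.append(prefix[-1] + g)

    def segment_loglik(i, j):
        # Maximized exponential log-likelihood of gaps[i:j]:
        # with rate = m / s, m*log(rate) - rate*s = m*log(m/s) - m.
        m = j - i
        s = prefix[j] - prefix[i]
        if m == 0 or s <= 0:
            return 0.0
        return m * math.log(m / s) - m

    base = segment_loglik(0, n)
    best_k, best_gain = None, 0.0
    for k in range(1, n):
        gain = segment_loglik(0, k) + segment_loglik(k, n) - base
        if gain > best_gain:
            best_k, best_gain = k, gain
    return best_k, best_gain


if __name__ == "__main__":
    # Synthetic sequence: 50 slow gaps followed by 50 fast gaps (a "burst").
    times = [0.0]
    for g in [1.0] * 50 + [0.1] * 50:
        times.append(times[-1] + g)
    k, gain = best_single_change_point(times)
    print("change index:", k, "burst starts near t =", times[k])
```

In a practical setting the log-likelihood gain would need to be compared against a threshold or penalty before accepting a change point, and the scan would be applied repeatedly to recover several steps; both choices are left open here.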
