首页> 外文期刊>Information Processing & Management >Variance reduction in large graph sampling
【24h】

Variance reduction in large graph sampling

机译:大图采样中的方差减少

获取原文
获取原文并翻译 | 示例
           

摘要

The norm of practice in estimating graph properties is to use uniform random node (RN) samples whenever possible. Many graphs are large and scale-free, inducing large degree variance and estimator variance. This paper shows that random edge (RE) sampling and the corresponding harmonic mean estimator for average degree can reduce the estimation variance significantly. First, we demonstrate that the degree variance, and consequently the variance of the RN estimator, can grow almost linearly with data size for typical scale-free graphs. Then we prove that the RE estimator has a variance bounded from above. Therefore, the variance ratio between RN and RE samplings can be very large for big data. The analytical result is supported by both simulation studies and 18 real networks. We observe that the variance reduction ratio can be more than a hundred for some real networks such as Twitter. Furthermore, we show that random walk (RW) sampling is always worse than RE sampling, and it can reduce the variance of RN method only when its performance is close to that of RE sampling.
机译:估计图形属性的实践准则是尽可能使用统一的随机节点(RN)样本。许多图是大的且无标度,从而导致较大的方差和估计量方差。本文表明,随机边缘(RE)采样和相应的平均程度谐波均值估计器可以显着减少估计方差。首先,我们证明对于典型的无标度图,度方差以及相应的RN估计量方差可以随数据大小线性增长。然后,我们证明RE估计量的方差为上界。因此,对于大数据,RN和RE采样之间的方差比可能非常大。仿真研究和18个真实网络均支持分析结果。我们观察到,对于某些真实网络(例如Twitter),方差减少率可能会超过100。此外,我们表明随机游走(RW)采样总是比RE采样差,并且只有在其性能接近RE采样时,它才能减小RN方法的方差。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号