...
首页> 外文期刊>Quality Control and Applied Statistics >Targeted sampling from massive block model graphs with personalized PageRank
【24h】

Targeted sampling from massive block model graphs with personalized PageRank

机译:Targeted sampling from massive block model graphs with personalized PageRank

获取原文
获取原文并翻译 | 示例
           

摘要

In many situations, the target population may be a small community within a massive graph such as a Twitter friendship graph discussing a particular current topic. Search has to be done on the entire network to select a small group of relevance, or targeted members can be found through link tracing. Graph connections are informative to identify community membership. The sampling starts from a seed node that belongs to the target community. A personalized page rank (PPR) can be considered as an alternative to snowball sampling which is the most commonly used method. For some d ≥ 0, snowball sampling collects all friends who are d friends away from the seed. This approach has two drawbacks compared to PPR. First, it does not account for the density of common friendship. Second, the snowball sample size increases fast with d. PPR on the other hand, gives a sample around the seed. The PPR vector is defined as the stationary distribution of personalized random walk. At each step, the random walker moves to the seed node with probability a and goes to the adjacent node with probability 1 - α. Consider a stationary distribution giving inclusion probability for sample size 1. This is a PPR vector. PPR leads to a cluster which is made up of nodes with large inclusion probabilities. To approximate the PPR vector quickly, Berkhin (Ref. 1) proposed an algorithm that examines only nodes with large inclusion probabilities. PPR is also computationally efficient based on the running time and data volume.

著录项

获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号