首页> 外文学位 >Algorithms for discovering communities in complex networks .
【24h】

Algorithms for discovering communities in complex networks .

机译:复杂网络中的社区发现算法。

获取原文
获取原文并翻译 | 示例

摘要

It has been observed that real-world random networks like the WWW, Internet, social networks, citation networks, etc., organize themselves into closely-knit groups that are locally dense and globally sparse . These closely-knit groups are termed communities. Nodes within a community are similar in some aspect. For example in a WWW network, communities might consist of web pages that share similar contents. Mining these communities facilitates better understanding of their evolution and topology, and is of great theoretical and commercial significance. Community related research has focused on two main problems: community discovery and community identification. Community discovery is the problem of extracting all the communities in a given network, whereas community identification is the problem of identifying the community, to which, a given set of nodes belong.; We make a comparative study of various existing community-discovery algorithms. We then propose a new algorithm based on bibliographic metrics, which addresses the drawbacks in existing approaches. Bibliographic metrics are used to study similarities between publications in a citation network. Our algorithm classifies nodes in the network based on the similarity of their neighborhoods. One of the drawbacks of the current community-discovery algorithms is their computational complexity. These algorithms do not scale up to the enormous size of the real-world networks. We propose a hash-table-based technique that helps us compute the bibliometric similarity between nodes in O( m Delta) time. Here m is the number of edges in the graph and Delta, the largest degree.; Next, we investigate different centrality metrics. Centrality metrics are used to portray the importance of a node in the network. We propose an algorithm that utilizes centrality metrics of the nodes to compute the importance of the edges in the network. Removal of the edges in ascending order of their importance breaks the network into components, each of which represent a community. We compare the performance of the algorithm on synthetic networks with a known community structure using several centrality metrics. Performance was measured as the percentage of nodes that were correctly classified.; As an illustration, we model the ucf.edu domain as a web graph and analyze the changes in its properties like densification power law, edge density, degree distribution, diameter, etc., over a five-year period. Our results show super-linear growth in the number of edges with time. We observe (and explain) that despite the increase in average degree of the nodes, the edge density decreases with time.
机译:已经观察到,诸如WWW,因特网,社交网络,引文网络等的现实世界随机网络将它们自身组织成局部密集且全球稀疏的紧密联系的组。这些紧密联系的群体称为社区。社区内的节点在某些方面是相似的。例如,在WWW网络中,社区可能由共享相似内容的网页组成。挖掘这些社区有助于更好地了解它们的演变和拓扑结构,并且具有重要的理论和商业意义。与社区相关的研究集中在两个主要问题上:社区发现和社区识别。社区发现是提取给定网络中所有社区的问题,而社区标识是标识给定节点集所属的社区的问题。我们对各种现有的社区发现算法进行了比较研究。然后,我们提出了一种基于书目度量的新算法,该算法解决了现有方法中的缺点。书目度量标准用于研究引用网络中出版物之间的相似性。我们的算法根据邻居的相似性对网络中的节点进行分类。当前的社区发现算法的缺点之一是它们的计算复杂性。这些算法无法扩展到现实网络的巨大规模。我们提出了一种基于哈希表的技术,该技术可帮助我们计算O(m Delta)时间内节点之间的文献计量相似度。其中m是图中的边数和最大程度的Delta。接下来,我们研究不同的集中度指标。集中度指标用于描述网络中节点的重要性。我们提出一种算法,该算法利用节点的中心性度量来计算网络中边缘的重要性。以重要性从高到低的顺序删除边缘会将网络分解为多个组件,每个组件代表一个社区。我们使用几种集中度指标来比较算法在具有已知社区结构的合成网络上的性能。性能以正确分类的节点的百分比来衡量。作为说明,我们将ucf.edu域建模为网络图,并分析了其属性在五年期间的变化,如密度幂律,边缘密度,度分布,直径等。我们的结果表明,边沿数量随时间呈超线性增长。我们观察(并解释),尽管节点的平均程度有所增加,但边缘密度却随时间降低。

著录项

  • 作者

    Balakrishnan, Hemant.;

  • 作者单位

    University of Central Florida.;

  • 授予单位 University of Central Florida.;
  • 学科 Computer Science.
  • 学位 Ph.D.
  • 年度 2006
  • 页码 153 p.
  • 总页数 153
  • 原文格式 PDF
  • 正文语种 eng
  • 中图分类 自动化技术、计算机技术;
  • 关键词

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号