Journal: 《计算机科学与探索》 (Journal of Frontiers of Computer Science and Technology)

Research and Implementation of a Large-Scale Multi-Dimensional Network Data Analysis Framework

Abstract

With the rapid development of the Internet and the growing number of computer applications, large volumes of graph data, especially social network data, are being generated. Multi-dimensional information networks have become a common way to represent such graph data. In these networks, however, node types are diverse and node attributes differ, so how to analyze the data from multiple perspectives and at multiple granularities, and how to mine the information hidden in it, has become a focus of attention. Graph online analytical processing (GraphOLAP) supports fast online analysis and querying of graph data. Building on existing GraphOLAP results and targeting the characteristics of multi-dimensional information networks, this paper proposes a new data cube framework (Graph-Cube). It introduces the concept of a main node to guide the generation of meta-paths, uses meta-paths to guide roll-up and drill-down operations on the network, and proposes attribute transformation and homogeneous transformation to enrich the set of OLAP operations. Finally, it discusses an optimized materialization strategy, implements the algorithms on the parallel computing framework Spark, and verifies the effectiveness and efficiency of the framework on multiple real and simulated datasets.
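To make the roll-up idea concrete, here is a minimal sketch of aggregating a small attributed graph along one dimension, in the general style of graph OLAP. It is not the paper's algorithm: the function `roll_up`, the `city` attribute, and the toy data are all hypothetical, and the sketch uses plain Python rather than Spark and does not model meta-paths or main nodes.

```python
from collections import Counter


def roll_up(nodes, edges, dimension):
    """Aggregate a graph by one node attribute (illustrative only).

    nodes: dict mapping node id -> dict of attributes
    edges: iterable of (source_id, target_id) pairs
    dimension: name of the attribute to group by
    """
    # Each distinct attribute value becomes one aggregated node;
    # its weight is the number of original nodes it contains.
    node_weights = Counter(attrs[dimension] for attrs in nodes.values())
    # Each original edge is mapped to the pair of aggregated nodes
    # that contain its endpoints; weights count the merged edges.
    edge_weights = Counter(
        (nodes[u][dimension], nodes[v][dimension]) for u, v in edges
    )
    return dict(node_weights), dict(edge_weights)


# Hypothetical toy network: three authors with a city attribute.
nodes = {
    "a1": {"type": "author", "city": "Beijing"},
    "a2": {"type": "author", "city": "Beijing"},
    "a3": {"type": "author", "city": "Shanghai"},
}
edges = [("a1", "a2"), ("a1", "a3"), ("a2", "a3")]

node_weights, edge_weights = roll_up(nodes, edges, "city")
print(node_weights)
print(edge_weights)
```

Drill-down is the reverse direction: returning from the aggregated view to a finer attribute or to the original nodes. A materialization strategy would decide which of these aggregated views to precompute and store.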
