Journal: IEEE Transactions on Knowledge and Data Engineering

A Novel Pipeline Approach for Efficient Big Data Broadcasting


Abstract

Big-data computing is a new critical challenge for the ICT industry. Engineers and researchers are dealing with data sets of petabyte scale in the cloud computing paradigm. Thus, the demand for building a service stack to distribute, manage, and process massive data sets has risen drastically. In this paper, we investigate the Big Data Broadcasting problem, in which a single source node broadcasts a big chunk of data to a set of nodes with the objective of minimizing the maximum completion time. These nodes may be located in the same datacenter or across geo-distributed datacenters. This problem is one of the fundamental problems in distributed computing and is known to be NP-hard in heterogeneous environments. We model the Big Data Broadcasting problem as a LockStep Broadcast Tree (LSBT) problem. The main idea of the LSBT model is to define a basic unit of upload bandwidth, r, such that a node with capacity c broadcasts data to a set of ⌊c/r⌋ children at the rate r. Note that r is a parameter to be optimized as part of the LSBT problem. We further divide the broadcast data into m chunks. These data chunks can then be broadcast down the LSBT in a pipeline manner. In a homogeneous network environment in which each node has the same upload capacity c, we show that the optimal uplink rate r* of LSBT is either c/2 or c/3, whichever gives the smaller maximum completion time. For heterogeneous environments, we present an algorithm to select an optimal uplink rate and to construct an optimal LSBT. Numerical results show that our approach achieves a lower maximum completion time and lower computational complexity than other efficient solutions in the literature.
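To make the pipelining idea in the abstract concrete, the following is a minimal sketch for the homogeneous case. It is not the paper's algorithm or cost model. It writes r for the basic unit of upload bandwidth and c for a node's upload capacity (names assumed here), and it assumes a simplified model in which every node forwards each chunk to floor(c/r) children at rate r, the tree is a complete tree rooted at the source, and propagation delay is ignored. The function names `tree_depth` and `completion_time` are hypothetical.

```python
import math


def tree_depth(num_receivers: int, fanout: int) -> int:
    """Smallest depth of a complete fanout-ary tree, rooted at the source,
    that has room for num_receivers non-root nodes."""
    depth = 0
    placed = 0       # receivers that fit in the levels built so far
    level_size = 1   # number of nodes on the current level (root level = 1)
    while placed < num_receivers:
        level_size *= fanout
        placed += level_size
        depth += 1
    return depth


def completion_time(data_size: float, num_chunks: int, capacity: float,
                    rate: float, num_receivers: int) -> float:
    """Assumed pipeline model: one chunk crosses one hop in
    (data_size / num_chunks) / rate time units, and the last chunk reaches
    the deepest node after (num_chunks + depth - 1) such steps."""
    fanout = math.floor(capacity / rate)  # children each node can serve at this rate
    if fanout < 1:
        raise ValueError("rate must not exceed the node capacity")
    depth = tree_depth(num_receivers, fanout)
    hop_time = (data_size / num_chunks) / rate
    return (num_chunks + depth - 1) * hop_time


# Illustrative comparison: capacity 12, with a rate that gives each node
# two children (rate 6) versus three children (rate 4).
for rate in (6, 4):
    print(rate, completion_time(1200, 100, 12, rate, 12))
```

Under this assumed model a higher rate shortens each hop but yields a deeper tree, while a lower rate does the opposite, which is the trade-off that choosing the uplink rate has to balance. Integer inputs are used in the comparison to avoid floating-point surprises in the floor computation.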
