IEEE Transactions on Parallel and Distributed Systems

Load Rebalancing for Distributed File Systems in Clouds


Abstract

Distributed file systems are key building blocks for cloud computing applications based on the MapReduce programming paradigm. In such file systems, nodes simultaneously serve computing and storage functions; a file is partitioned into a number of chunks allocated to distinct nodes so that MapReduce tasks can be performed in parallel over the nodes. However, in a cloud computing environment, failure is the norm, and nodes may be upgraded, replaced, and added to the system. Files can also be dynamically created, deleted, and appended. This results in load imbalance in a distributed file system; that is, the file chunks are not distributed as uniformly as possible among the nodes. Emerging distributed file systems in production strongly depend on a central node for chunk reallocation. This dependence is clearly inadequate in a large-scale, failure-prone environment, because the central load balancer is put under a considerable workload that scales linearly with the system size, and may thus become the performance bottleneck and the single point of failure. In this paper, a fully distributed load rebalancing algorithm is presented to cope with the load imbalance problem. Our algorithm is compared against a centralized approach in a production system and a competing distributed solution presented in the literature. The simulation results indicate that our proposal is comparable with the existing centralized approach and considerably outperforms the prior distributed algorithm in terms of load imbalance factor, movement cost, and algorithmic overhead. The performance of our proposal implemented in the Hadoop distributed file system is further investigated in a cluster environment.
