Journal: IEEE Transactions on Parallel and Distributed Systems

iShuffle: Improving Hadoop Performance with Shuffle-on-Write

Abstract

Hadoop is a popular implementation of the MapReduce framework for running data-intensive jobs on clusters of commodity servers. Shuffle, the all-to-all input data fetching phase between the map and reduce phases, can significantly affect job performance. However, the shuffle and reduce phases are coupled in Hadoop, and the shuffle can only be performed by running reduce tasks. This coupling leaves the potential parallelism between multiple waves of map and reduce tasks unexploited and wastes resources, which significantly delays job completion in multi-tenant Hadoop clusters. More importantly, Hadoop lacks the ability to schedule tasks efficiently and to mitigate data distribution skew among reduce tasks, which further degrades job performance. In this work, we propose to decouple shuffle from reduce tasks and convert it into a platform service provided by Hadoop. We present iShuffle, a user-transparent shuffle service that proactively pushes map output data to nodes via a novel shuffle-on-write operation and flexibly schedules reduce tasks with workload balance in mind. Experimental results with representative workloads and a Facebook workload trace show that iShuffle reduces job completion time by as much as 29.6 percent and 34 percent in single-user and multi-user clusters, respectively.
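
The abstract does not spell out how iShuffle balances reduce workload, so the following is only a toy illustration of the general idea of skew-aware placement, not the paper's algorithm. It assumes the size of each map-output partition is known in advance and uses a plain greedy largest-first heuristic; the function name `place_partitions` and the node labels are hypothetical.

```python
import heapq


def place_partitions(partition_sizes, nodes):
    """Assign each partition to the currently least-loaded node.

    partition_sizes: dict mapping partition id -> size (assumed known).
    nodes: list of node names (hypothetical labels).
    Returns (placement, loads).
    """
    # Min-heap of (current load, node name); every node starts empty.
    heap = [(0, node) for node in nodes]
    heapq.heapify(heap)

    placement = {}
    # Place the largest partitions first so a skewed partition
    # is not stacked on top of an already loaded node.
    ordered = sorted(partition_sizes.items(), key=lambda kv: kv[1], reverse=True)
    for pid, size in ordered:
        load, node = heapq.heappop(heap)
        placement[pid] = node
        heapq.heappush(heap, (load + size, node))

    # Recompute per-node totals from the final placement.
    loads = {node: 0 for node in nodes}
    for pid, node in placement.items():
        loads[node] += partition_sizes[pid]
    return placement, loads


if __name__ == "__main__":
    # One skewed partition (90) and three small ones (30 each).
    sizes = {0: 90, 1: 30, 2: 30, 3: 30}
    placement, loads = place_partitions(sizes, ["n1", "n2"])
    print(placement)
    print(loads)
```

With one oversized partition and three small ones, the heuristic puts the large partition alone on one node and groups the small ones on the other. A hash-based assignment that ignores sizes could instead leave one node with most of the data.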
