Journal: Concurrency and Computation: Practice and Experience

An efficient list scheduling algorithm with task duplication for scientific big data workflow in heterogeneous computing environments



Abstract

A high-performance heterogeneous computing environment provides computing resources for the efficient execution of scientific Big Data workflow applications. Such an application comprises thousands of interdependent tasks with precedence constraints. The performance of a heterogeneous computing system depends largely on the workflow scheduling algorithm, and workflow scheduling is an NP-complete problem. This article proposes a List Scheduling with Task Duplication (LSTD) algorithm that efficiently reduces the makespan of workflow applications. LSTD introduces a task duplication strategy into list scheduling without increasing the overall time complexity. It consists of three phases. In the first phase, it calculates the rank of each task to decide the scheduling order. In the second phase, it duplicates the entry task on a processor only if doing so increases overall efficiency and avoids overloading the processor. In the third phase, it assigns each task to a processor using the popular insertion-based policy, which attempts to insert the task into the earliest idle time slot between two previously assigned tasks on a given processor. To evaluate the proposed algorithm, it is compared with several well-known existing algorithms: Heterogeneous Earliest Finish Time (HEFT), Critical Path on a Processor (CPOP), and Predict Earliest Finish Time (PEFT). A non-duplication version of LSTD, named List Scheduling without Task Duplication, is also included in the performance evaluation. The experimental analysis on scientific Big Data workflows (CyberShake, Montage, and LIGO) shows that LSTD significantly outperforms all considered scheduling heuristics in terms of schedule length ratio, percentage of best results, and average running time.
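The insertion-based policy mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `earliest_start_with_insertion`, its parameters, and the interval representation are all hypothetical, chosen only to show how a task might be placed into the earliest idle gap between two tasks already assigned to one processor.

```python
from typing import List, Tuple


def earliest_start_with_insertion(
    busy: List[Tuple[float, float]],
    ready_time: float,
    duration: float,
) -> float:
    """Return the earliest start time for a task on a single processor.

    Assumptions (illustrative, not taken from the paper):
    busy       -- (start, finish) intervals already scheduled on the
                  processor, sorted by start time and non-overlapping.
    ready_time -- earliest moment the task's input data is available.
    duration   -- execution time of the task on this processor.
    """
    candidate = ready_time
    for start, finish in busy:
        # The idle gap before this interval is long enough: insert here.
        if candidate + duration <= start:
            return candidate
        # Otherwise the task cannot start before this interval finishes.
        candidate = max(candidate, finish)
    # No idle gap fits, so the task goes after the last scheduled task.
    return candidate


busy_slots = [(0.0, 4.0), (10.0, 15.0)]
# A 5-unit task fits in the idle gap between 4.0 and 10.0.
print(earliest_start_with_insertion(busy_slots, ready_time=2.0, duration=5.0))  # 4.0
# A 7-unit task does not fit in that gap and is appended at the end.
print(earliest_start_with_insertion(busy_slots, ready_time=2.0, duration=7.0))  # 15.0
```

A list scheduler would typically evaluate this function for each candidate processor and pick the one giving the earliest finish time; how LSTD combines it with ranking and entry-task duplication is specified in the paper itself, not in this sketch.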
