An efficient list scheduling algorithm with task duplication for scientific big data workflow in heterogeneous computing environments

Ahmad Wakar; Alam Bashir


Abstract

A high-performance heterogeneous computing environment provides computing resources for the efficient execution of scientific Big Data workflow applications. A Big Data workflow application comprises thousands of interdependent tasks with precedence constraints. The performance of a heterogeneous computing system depends mainly on the workflow scheduling algorithm, and workflow scheduling is an NP-complete problem. In this article, a List Scheduling with Task Duplication (LSTD) algorithm is proposed that efficiently reduces the makespan of workflow applications. LSTD introduces a task duplication strategy into list scheduling without increasing the overall time complexity. It consists of three phases. In the first phase, it calculates the rank of each task to decide the scheduling order. In the second phase, it duplicates the entry task on a processor only if doing so improves overall efficiency and does not overload the processor. In the third phase, it assigns each task to a processor using the popular insertion-based policy, which attempts to insert the task into the earliest idle time slot between two already scheduled tasks on a given processor. To verify the usefulness of the proposed algorithm, several well-known existing algorithms are considered for comparison: Heterogeneous Earliest Finish Time (HEFT), Critical Path on a Processor (CPOP), and Predict Earliest Finish Time (PEFT). A non-duplicating variant of LSTD, named List Scheduling without Task Duplication, is also included in the performance evaluation. The experimental analysis, based on scientific Big Data workflows (CyberShake, Montage, and LIGO), shows that LSTD significantly outperforms all considered scheduling heuristics in terms of schedule length ratio, percentage of best results, and average running time.
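The ranking and insertion phases described in the abstract can be illustrated with a minimal Python sketch. It assumes a HEFT-style upward rank and a simple gap search on one processor; the abstract does not give LSTD's exact rank definition or its entry-task duplication test, so neither is reproduced here. The names `upward_ranks`, `earliest_slot`, `succ`, `avg_cost`, and `avg_comm` are hypothetical and chosen only for illustration.

```python
from functools import lru_cache


def upward_ranks(succ, avg_cost, avg_comm):
    """HEFT-style upward rank (an assumption, not necessarily LSTD's rank).

    rank(t) = avg_cost[t] + max over successors s of (avg_comm[(t, s)] + rank(s))
    succ maps a task to its successors; exit tasks have no entry or an empty one.
    """
    @lru_cache(maxsize=None)
    def rank(t):
        return avg_cost[t] + max(
            (avg_comm[(t, s)] + rank(s) for s in succ.get(t, ())),
            default=0.0,
        )

    return {t: rank(t) for t in avg_cost}


def earliest_slot(busy, ready, duration):
    """Insertion-based policy on one processor.

    busy is a list of (start, end) intervals sorted by start time.
    Returns the earliest start time >= ready at which a task of the
    given duration fits, either in an idle gap or after the last task.
    """
    start = ready
    for s, e in busy:
        if start + duration <= s:  # the task fits in the gap before this interval
            return start
        start = max(start, e)      # otherwise try after this interval
    return start


# Tiny diamond-shaped workflow: A -> B, A -> C, B -> D, C -> D
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
avg_cost = {"A": 2, "B": 3, "C": 4, "D": 1}
avg_comm = {("A", "B"): 1, ("A", "C"): 2, ("B", "D"): 1, ("C", "D"): 1}

ranks = upward_ranks(succ, avg_cost, avg_comm)
order = sorted(ranks, key=ranks.get, reverse=True)  # scheduling order by descending rank
```

Sorting by descending rank guarantees that a task is always scheduled after its predecessors, because a predecessor's rank is strictly larger whenever computation costs are positive. A full scheduler would call `earliest_slot` for every processor and keep the one giving the earliest finish time.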


Bibliographic details

Source
Concurrency and Computation: Practice and Experience | 2021, Issue 5 | e5987.1-e5987.18 | 18 pages
Authors
Ahmad Wakar; Alam Bashir
Author affiliations
Jamia Millia Islamia, Dept. of Computer Engineering, New Delhi, India (both authors)
Original format: PDF
Language: English
Keywords
big data workflows; heterogeneous computing systems; list scheduling; task duplication scheduling

