Transparent dataflow detection and use in workflow scheduling: Concurrency and deadlock avoidance.

Abstract

In this thesis we demonstrate the value of dataflow information for improving makespan (i.e., the time to complete a set of jobs) in batch-scheduled workloads. Novel mechanisms and policies are introduced to improve job concurrency (i.e., when resources are unlimited) and to reduce the impact of deadlock (i.e., when resources are constrained). Without dataflow information, concurrency might be limited even if resources are unlimited, and resource usage might be inefficient even if resource utilization is superficially high. The key insight is that dataflow, unlike control flow, reveals when resources can be deallocated or reallocated, which allows for a crucial distinction between active and inactive resource usage. Through a simulation study, we show that dataflow information can reduce makespan by more than a factor of 5, depending on the workload and available resources.

Despite a large body of research on dataflow, most high-performance computing (HPC) systems (e.g., clusters) are batch-scheduled on the basis of control flow. The lack of a simple way to obtain dataflow information and the lack of compelling policies to exploit dataflow may account for this control-flow status quo. Therefore, we describe a simple prototype for transparently gathering dataflow information (i.e., Workflow-aware File System (WaFS)) and several scheduling policies to exploit that knowledge for higher concurrency (e.g., Versioned Namespace (VNS), Overwrite-Safe Concurrency (OSC)) and for better deadlock handling (e.g., Dataflow Aggregate Requests (DAR), Dataflow Topological Ordering (DTO)). Notably, our simulations show that dataflow information allows our policies to achieve lower makespans than the classic banker's algorithm and Lang's algorithm.
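
For readers unfamiliar with the baseline named in the abstract, the safety check at the core of the classic banker's algorithm can be sketched roughly as follows. This is the textbook procedure only, not the thesis's DAR or DTO policies, whose details the abstract does not give. The function name `is_safe` and the example matrices are illustrative assumptions.

```python
from typing import List, Optional


def is_safe(available: List[int],
            allocation: List[List[int]],
            maximum: List[List[int]]) -> Optional[List[int]]:
    """Textbook banker's-algorithm safety check (illustrative sketch).

    Returns one order in which all jobs can finish, or None if the
    state is unsafe (i.e., deadlock can no longer be ruled out).
    """
    n = len(allocation)
    work = list(available)  # resources currently free
    # Remaining need of each job = declared maximum - current allocation.
    need = [[m - a for m, a in zip(maximum[i], allocation[i])]
            for i in range(n)]
    finished = [False] * n
    order: List[int] = []

    progressed = True
    while progressed:
        progressed = False
        for i in range(n):
            fits = all(nd <= w for nd, w in zip(need[i], work))
            if not finished[i] and fits:
                # Job i can run to completion and release what it holds.
                work = [w + a for w, a in zip(work, allocation[i])]
                finished[i] = True
                order.append(i)
                progressed = True

    return order if all(finished) else None


# Hypothetical state: five jobs, three resource types.
allocation = [[0, 1, 0], [2, 0, 0], [3, 0, 2], [2, 1, 1], [0, 0, 2]]
maximum = [[7, 5, 3], [3, 2, 2], [9, 0, 2], [2, 2, 2], [4, 3, 3]]
print(is_safe([3, 3, 2], allocation, maximum))  # a safe order exists
print(is_safe([1], [[1], [1]], [[3], [3]]))     # None: unsafe state
```

Note that this check relies only on each job's declared maximum claim. It has no notion of a job no longer using a resource it still holds, which is the active versus inactive distinction the abstract attributes to dataflow information.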

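Bibliographic record

  • Author

    Wang, Yang

  • Author affiliation

    University of Alberta (Canada)

  • Degree-granting institution: University of Alberta (Canada)
  • Discipline: Computer Science
  • Degree: Ph.D.
  • Year: 2008
  • Pages: 156 p.
  • Total pages: 156
  • Original format: PDF
  • Language: English
  • Chinese Library Classification: Automation technology, computer technology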