Conference: International Workshop on Job Scheduling Strategies for Parallel Processing
Experience and Practice of Batch Scheduling on Leadership Supercomputers at Argonne

Abstract

The mission of the DOE Argonne Leadership Computing Facility (ALCF) is to accelerate major scientific discoveries and engineering breakthroughs for humanity by designing and providing world-leading computing facilities in partnership with the computational science community. The ALCF operates supercomputers that are generally among the top five fastest machines in the world. Specifically, ALCF is looking for science that is either too big to run anywhere else or would take so long elsewhere as to be impractical (i.e., "capability jobs"). At ALCF, batch scheduling plays a critical role in achieving a set of site goals within a set of constraints. While system utilization is an important goal at ALCF, its largest mission constraint is to enable extreme-scale parallel jobs to take precedence. In this paper, we describe the specific scheduling goals and constraints, analyze the workload traces collected in 2013-2017 from the 48-rack petascale supercomputer Mira, and discuss the upcoming scheduling challenges at ALCF.
