Published in: International Euro-Par Parallel Processing Conference; 2005-08-30 to 2005-09-02; Lisbon (PT)

A Checkpoint/Recovery Model for Heterogeneous Dataflow Computations Using Work-Stealing

Abstract

This paper presents a new checkpoint/recovery method for dataflow computations using work-stealing in heterogeneous environments such as those found in grid or cluster computing. Basing the state of the computation on a dynamic macro dataflow graph, it is shown that the mechanisms provide effective checkpointing for multithreaded applications in heterogeneous environments. Two methods, Systematic Event Logging and Theft-Induced Checkpointing, are presented that are efficient and extremely flexible under the system-state model, allowing for recovery on different platforms with a different number of processors. A formal analysis of the overhead induced by both methods is presented, followed by an experimental evaluation on a large cluster. It is shown that both methods have very small overhead and that the trade-off between checkpointing cost and recovery cost can be controlled.