International Workshop on Languages and Compilers for Parallel Computing

Low-Overhead Fault-Tolerance Support Using DISC Programming Model

Abstract

DISC is a newly proposed parallel programming paradigm that models many classes of iterative scientific applications through the specification of a domain and the interactions among domain elements. Together with its associated runtime, it hides the details of inter-process communication and work partitioning (including partitioning in the presence of heterogeneous processing elements) from programmers. In this paper, we show how these abstractions, particularly the concepts of compute-function and computation-space objects, can also be used to provide low-overhead fault-tolerance support. Computation-space objects enable automated application-level checkpointing, while replicated execution of compute-functions helps detect soft errors with low overhead. Experimental results show the effectiveness of the proposed solutions.
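The two mechanisms named in the abstract can be illustrated with a small sketch. It is not the DISC runtime or its API, which the abstract does not show; `ComputationSpace`, `replicated_apply`, `run_iteration`, and `average_neighbors` are hypothetical names chosen for illustration. The sketch assumes that all application state lives in one computation-space object (so the runtime can checkpoint it without application help) and that compute-functions are deterministic (so running one twice and comparing results can flag a transient fault).

```python
import pickle


class ComputationSpace:
    """Hypothetical stand-in for a computation-space object: it owns all
    application state that must survive a failure."""

    def __init__(self, values):
        self.values = list(values)

    def checkpoint(self):
        # All state lives in this object, so it can be serialized
        # without any help from the application code.
        return pickle.dumps(self.values)

    def restore(self, blob):
        self.values = pickle.loads(blob)


def replicated_apply(compute_fn, space, index):
    """Run compute_fn twice on the same input and compare the results.
    A mismatch is treated as a detected soft error."""
    first = compute_fn(space.values, index)
    second = compute_fn(space.values, index)
    if first != second:
        raise RuntimeError(f"soft error detected at element {index}")
    return first


def average_neighbors(values, i):
    # Example compute-function (an assumption, not from the paper):
    # a 1-D three-point stencil with clamped boundaries.
    left = values[max(i - 1, 0)]
    right = values[min(i + 1, len(values) - 1)]
    return (left + values[i] + right) / 3.0


def run_iteration(space, compute_fn):
    """Checkpoint, run one replicated iteration, roll back on a mismatch.
    Returns True if the iteration committed, False if it was rolled back."""
    saved = space.checkpoint()
    try:
        space.values = [
            replicated_apply(compute_fn, space, i)
            for i in range(len(space.values))
        ]
        return True
    except RuntimeError:
        space.restore(saved)
        return False
```

Calling `run_iteration(ComputationSpace([0.0, 3.0, 6.0]), average_neighbors)` commits one stencil step. If the compute-function returns different results on its two runs, the iteration is discarded and the state is restored from the checkpoint. Rerunning every compute-function twice doubles the compute cost in this naive form; how the paper keeps the overhead low is not described in the abstract.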
