Journal: International Journal of High Performance Computing Applications

Silent error detection in numerical time-stepping schemes


Abstract

Errors due to hardware or low-level software problems, if detected, can be fixed by various schemes, such as recomputation from a checkpoint. Silent errors are errors in application state that have escaped low-level error detection. At extreme scale, where machines can perform astronomically many operations per second, silent errors threaten the validity of computed results. We propose a new paradigm for detecting silent errors at the application level. Our central idea is to frequently compare computed values to those provided by a cheap checking computation, and to build error detectors based on the difference between the two output sequences. Numerical analysis provides us with usable checking computations for the solution of initial-value problems in ODEs and PDEs, arguably the most common problems in computational science. Here, we provide, optimize, and test methods based on Runge-Kutta and linear multistep methods for ODEs, and on implicit and explicit finite difference schemes for PDEs. We take the heat equation and Navier-Stokes equations as examples. In tests with artificially injected errors, this approach effectively detects almost all meaningful errors, without significant slowdown.
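The central idea of comparing each computed value against a cheap checking computation can be sketched for a scalar ODE as follows. This is only an illustration under stated assumptions, not the detectors developed in the paper: the choice of Heun's second-order method as the check, the fixed threshold `tol`, the recompute-on-detection policy, and all function names are hypothetical. The silent error is simulated by adding `bump` to one step's result.

```python
import math

def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step (the main computation)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def heun_step(f, t, y, h):
    """One Heun (second-order) step, used here as the checking computation."""
    k1 = f(t, y)
    k2 = f(t + h, y + h * k1)
    return y + h / 2 * (k1 + k2)

def integrate_with_detector(f, t0, y0, h, n_steps, tol, corrupt_at=None, bump=0.0):
    """Integrate y' = f(t, y), flagging steps where main and check disagree.

    corrupt_at and bump are hypothetical test hooks that inject an
    artificial error into the result of a single step.
    """
    t, y = t0, y0
    flagged = []
    for n in range(n_steps):
        y_new = rk4_step(f, t, y, h)
        if n == corrupt_at:
            y_new += bump  # simulated silent error in the step result
        y_check = heun_step(f, t, y, h)
        # Without faults the two results differ only by the check's local
        # truncation error, so a larger gap suggests a corrupted step.
        if abs(y_new - y_check) > tol:
            flagged.append(n)
            y_new = rk4_step(f, t, y, h)  # assumed recovery: recompute the step
        t += h
        y = y_new
    return y, flagged

# Test problem y' = -y, y(0) = 1, integrated to t = 1.
decay = lambda t, y: -y
y_clean, flags_clean = integrate_with_detector(decay, 0.0, 1.0, 0.01, 100, 1e-5)
y_fixed, flags_fixed = integrate_with_detector(
    decay, 0.0, 1.0, 0.01, 100, 1e-5, corrupt_at=50, bump=1e-3
)
```

In this sketch the threshold has to sit above the fault-free difference between the two methods (on the order of `h**3` per step for this check) and below the smallest error worth catching. Because both methods start from the same `y`, a corruption of the stored state between steps would not be seen by this particular check.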
