首页> 外文学位 >Exact floating-point summation and its application in shadowing.
【24h】

Exact floating-point summation and its application in shadowing.

机译:精确的浮点求和及其在阴影中的应用。

获取原文
获取原文并翻译 | 示例

摘要

The summation of n floating-point numbers is a basic task that is performed in almost any scientific or numerical computation. The relative error in the "naive loop" summation can be huge. We present three new algorithms for computing correctly rounded sums of arrays of floating-point numbers. First, iFastSum improves upon our previous FastSum by requiring no additional space beyond the original array, which is destroyed after execution. It runs about 20% faster than FastSum in the general case and two times faster when ill-conditioned data are used. The second algorithm is HybridSum, which combines three summation ideas together: splitting the mantissa, radix sorting into buckets, and using iFastSum. The result is that when the number of summands is greater than about 104, for a given n its running time is almost a constant, independent of the condition number. It runs almost as fast as iFastSum in the general case and much faster than iFastSum when ill-conditioned data are used. HybridSum requires only one pass through the input array, asymptotically uses only 6 FLOPs for each summand, and uses constant storage. Instead of splitting the mantissa in HybridSum, our third algorithm OnlineExactSum doubles the number of buckets and uses the exact addition algorithm to accumulate the summands while the local errors are handled by the additional buckets. It asymptotically requires only 5 FLOPs per summand, and due to instruction-level parallelism runs only about 2-3 times slower than the obvious, fast-but-dumb "naive loop", independent of the condition number. Like HybridSum, it is also an online algorithm, which can take an arbitrary length input stream of such inputs while requiring only constant memory. None of these algorithms requires extra precision accumulators, and all work in any base. Their accuracy is guaranteed independent of the condition number and the number of summands. An application in shadowing of high dimensional n-body simulations is discussed with new optimizations and improvements made to an existing shadowing algorithm. We also present a large amount of numerical results about the shadowing work we have done on the n-body systems. Although not all of those results are understood thoroughly so far, they might be useful for other physicists and astronomers.
机译:n个浮点数的总和是在几乎任何科学或数值计算中执行的基本任务。 “天真循环”求和中的相对误差可能很大。我们提出了三种新算法,用于计算正确舍入的浮点数数组的和。首先,iFastSum对我们之前的FastSum进行了改进,因为不需要额外的空间来存储原始数组,该数组在执行后会被销毁。在一般情况下,它的运行速度比FastSum快20%,而在使用条件不佳的数据时,运行速度快两倍。第二种算法是HybridSum,它将三个求和思想结合在一起:分割尾数,将基数排序到存储桶中以及使用iFastSum。结果是,当求和数大于104时,对于给定的n,其运行时间几乎是一个常数,与条件数无关。在一般情况下,它的运行速度几乎与iFastSum一样快,并且在使用病态数据时,其运行速度比iFastSum快得多。 HybridSum只需要遍历输入数组一次,对于每个被求和者渐进地只使用6个FLOP,并使用常量存储。我们的第三种算法OnlineExactSum不会在HybridSum中拆分尾数,而是将存储桶数加倍,并使用精确的加法算法累加求和数,而由其他存储桶处理局部错误。渐近地,每个求和仅需要5个FLOP,并且由于指令级别的并行性仅比明显的,快速但愚蠢的“幼稚的循环”慢了大约2-3倍,而与条件数无关。像HybridSum一样,它也是一种在线算法,它可以接受此类输入的任意长度的输入流,而只需要恒定的内存。这些算法都不要求额外的精度累加器,所有算法都可以在任何基础上工作。保证它们的准确性,而与条件数和被请求数无关。通过对现有阴影算法的新的优化和改进,讨论了在高维n体模拟的阴影中的应用。我们还提供了大量关于在n体系统上进行的遮蔽工作的数值结果。尽管到目前为止尚未完全理解所有这些结果,但它们可能对其他物理学家和天文学家有用。

著录项

  • 作者

    Zhu, Yong-Kang.;

  • 作者单位

    University of California, Irvine.;

  • 授予单位 University of California, Irvine.;
  • 学科 Computer Science.
  • 学位 Ph.D.
  • 年度 2009
  • 页码 91 p.
  • 总页数 91
  • 原文格式 PDF
  • 正文语种 eng
  • 中图分类
  • 关键词

  • 入库时间 2022-08-17 11:38:26

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号