Journal: IEEE Transactions on Parallel and Distributed Systems
Work-Stealing Prefix Scan: Addressing Load Imbalance in Large-Scale Image Registration

Abstract

Parallelism patterns (e.g., map or reduce) have proven to be effective tools for parallelizing high-performance applications. In this article, we study the recursive registration of a series of electron microscopy images, a time-consuming and imbalanced computation necessary for nanoscale microscopy analysis. We show that by translating the image registration into a specific instance of the prefix scan, we can convert this seemingly sequential problem into a parallel computation that scales to over a thousand cores. We analyze a variety of scan algorithms that behave similarly for common low-compute operators and propose a novel work-stealing procedure for a hierarchical prefix scan. Our evaluation shows that by identifying a suitable and well-optimized prefix scan algorithm, we reduce time-to-solution on a series of 4,096 images spanning ten seconds of microscopy acquisition from over 10 hours to less than 3 minutes (using 1,024 Intel Haswell cores), enabling derivation of material properties at the nanoscale for long microscopy image series.
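
To illustrate why a seemingly sequential chain can be recast as a prefix scan, the following is a minimal sketch, not the article's implementation. It assumes a hypothetical stand-in operator `compose`, which chains 1-D affine maps stored as `(a, b)` pairs; like composing registration transforms, it is associative but not commutative. The names `sequential_scan`, `blocked_scan`, and `maps` are illustrative only. The two-level scheme stands in for a hierarchical scan; work stealing and real parallel execution are deliberately left out.

```python
from itertools import accumulate

# Hypothetical stand-in operator (assumption, not from the article):
# a pair (a, b) represents the map x -> a*x + b.
def compose(f, g):
    """Return the map 'apply f first, then g'. Associative, not commutative."""
    a1, b1 = f
    a2, b2 = g
    return (a2 * a1, a2 * b1 + b2)

def sequential_scan(items, op):
    """Inclusive prefix scan, one element after another."""
    return list(accumulate(items, op))

def blocked_scan(items, op, block_size):
    """Two-level inclusive scan; phases 1 and 3 are independent per block."""
    if not items:
        return []
    blocks = [items[i:i + block_size] for i in range(0, len(items), block_size)]
    # Phase 1: local scan inside every block (could run in parallel).
    local = [list(accumulate(block, op)) for block in blocks]
    # Phase 2: scan over the block totals (short, sequential here).
    totals = list(accumulate((scanned[-1] for scanned in local), op))
    # Phase 3: prepend the total of all earlier blocks to each later block.
    # Operand order matters because op is not commutative.
    result = list(local[0])
    for prefix, scanned in zip(totals, local[1:]):
        result.extend(op(prefix, x) for x in scanned)
    return result

maps = [(2, 1), (1, 3), (3, 0), (1, -2), (2, 5), (1, 1), (4, 2)]
assert blocked_scan(maps, compose, 3) == sequential_scan(maps, compose)
```

Because only associativity is used, the blocks in phases 1 and 3 can be processed independently. When the operator's cost varies per element, as the abstract describes for registration, blocks finish at different times, which is the imbalance a work-stealing scheme is meant to address.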