End-to-end Noncontiguous Access Pattern Optimization for Extreme-scale Scientific Data Analytics.

Abstract

In high-performance computing (HPC) environments, numerous factors conspire to make efficiently accessing large-scale scientific data difficult: the continually growing imbalance between compute and I/O capabilities, the data-intensive nature of current and future scientific simulations, the distribution of data among many discrete storage locations in parallel filesystems, and the complexity of data access patterns in I/O workloads.

In this thesis, we propose complementary approaches to reduce both the size and complexity of data accesses to parallel storage, focusing on post-data-generation analysis workloads (i.e., I/O read optimization) across the I/O software stack. Additionally, we develop methodologies to efficiently process noncontiguous data, a common occurrence in HPC data workloads, informed by current architectural trends (e.g., GPUs). For data reduction, we explore techniques of level-of-detail analysis, which aims to reduce read costs for data analysis at the cost of reduced analysis precision. While existing level-of-detail methods, such as hierarchical Z-order sampling and wavelet multiresolution analysis, have proven useful in a number of analysis tasks, they do not provide both hard bounds on data precision and full-context views of the data, both of which are essential for a robust level-of-detail methodology.

Based on these limitations, we present a precision-based level-of-detail methodology (APLOD) for scientific floating-point data, which utilizes the floating-point format to provide well-defined I/O-accuracy tradeoffs. Data layout complexity is reduced by a deterministic partitioning of data, with low computational overhead and bounded per-point errors.

The data processing required to implement APLOD induces noncontiguous access patterns, which are ubiquitous in scientific computing and commonly seen in array subvolume accesses, spatio-temporal accesses, etc. Processing these accesses, specifically for I/O, has been explored at various levels of the I/O software stack. For APLOD in particular, we have shown scalable integration with the ADIOS high-level I/O library.

At the middleware level, specifically MPI and MPI-IO, we show the ability to map the APLOD format and data transformations onto an MPI datatypes representation, which enables the MPI runtime to efficiently communicate noncontiguous data without user intervention. One significant use case missing from such noncontiguous data processing is the ability to efficiently process data residing in graphics processing unit (GPU) memory. Hence, we develop an MPI datatypes processing algorithm optimized for GPU-resident data, utilizing the massively parallel nature of GPUs. We demonstrate low processing overhead for both regularly structured data and irregular data, compared to methods based on PCIe direct memory access (DMA).

Finally, reducing the magnitude of accesses is but one piece of the puzzle: complex, noncontiguous access patterns are exceptionally difficult to process effectively, especially at scale. With this problem in mind and spurred on by recent developments and optimization opportunities in HPC storage systems, we explore optimizations for complex access patterns, such as those exhibited by APLOD, in the space between middleware I/O drivers (e.g., MPI) and object-based storage systems (e.g., PVFS). We do this by exploiting direct object-storage semantics to dynamically create partially replicated data optimized for differing access patterns, as well as by leveraging integrated I/O tracing and analysis to make intelligent decisions to drive our replica-based optimizations. Our method is shown to be effective at improving I/O performance for several common noncontiguous access workloads. (Abstract shortened by UMI.)
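As an illustration of the precision-based level-of-detail idea described in the abstract, the following minimal Python sketch splits IEEE 754 double-precision values into byte planes, so that a reader can fetch only the most significant bytes and accept a bounded relative error. The abstract does not give APLOD's actual data layout, so the byte-plane layout, the function names `split_byte_planes` and `reconstruct`, and the sample values are all assumptions made for illustration, not the thesis's implementation.

```python
import struct


def split_byte_planes(values):
    """Split float64 values into 8 byte planes (hypothetical layout).

    Plane 0 holds the most significant byte of every value (sign bit and
    high exponent bits); plane 7 holds the least significant mantissa byte.
    """
    # Big-endian packing puts the sign/exponent byte first.
    packed = [struct.pack(">d", v) for v in values]
    return [bytes(p[i] for p in packed) for i in range(8)]


def reconstruct(planes, n_planes):
    """Rebuild values from the first n_planes planes, zero-filling the rest."""
    count = len(planes[0])
    out = []
    for j in range(count):
        raw = bytes(planes[i][j] if i < n_planes else 0 for i in range(8))
        out.append(struct.unpack(">d", raw)[0])
    return out


if __name__ == "__main__":
    # Sample values are arbitrary and chosen only for demonstration.
    sample = [3.141592653589793, -2.718281828459045, 1.0e10, 6.02e23]
    planes = split_byte_planes(sample)
    for n in (2, 4, 8):
        approx = reconstruct(planes, n)
        worst = max(abs(a - v) / abs(v) for a, v in zip(approx, sample))
        print(f"{n} of 8 bytes read: worst relative error {worst:.3e}")
```

Under this assumed layout, keeping n of the 8 planes (n ≥ 2) retains the sign, the full 11-bit exponent, and 8n − 12 mantissa bits, so for normal (non-subnormal) finite values the relative truncation error stays below 2^-(8n − 12) while only n/8 of the bytes are read. Reading one plane across many values is also a simple instance of the noncontiguous access pattern the abstract mentions.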

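Bibliographic record

  • Author

    Jenkins, Jonathan Paul.

  • Author affiliation

    North Carolina State University.

  • Degree-granting institution: North Carolina State University.
  • Discipline: Computer Science.
  • Degree: Ph.D.
  • Year: 2013
  • Pages: 112 p.
  • Total pages: 112
  • Original format: PDF
  • Language of text: eng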
