Conference: 41st International Conference on Parallel Processing
MLOC: Multi-level Layout Optimization Framework for Compressed Scientific Data Exploration with Heterogeneous Access Patterns

Abstract

The size and scope of cutting-edge scientific simulations are growing much faster than the I/O and storage capabilities of their runtime environments. The growing gap is exacerbated by exploratory data-intensive analytics, such as querying simulation data for regions of interest with multivariate, spatio-temporal constraints. Query-driven data exploration induces heterogeneous access patterns that further stress the performance of the underlying storage system. To partially alleviate the problem, data reduction via compression and multi-resolution data extraction are becoming an integral part of I/O systems. While addressing the data size issue, these techniques introduce yet another mix of access patterns to a heterogeneous set of possibilities. Moreover, how extreme-scale datasets are partitioned into multiple files and organized on a parallel file system adds to an already combinatorial space of possible access patterns. To address this challenge, we present MLOC, a parallel Multilevel Layout Optimization framework for Compressed scientific spatio-temporal data at extreme scale. MLOC proposes multiple fine-grained data layout optimization kernels that form a generic core from which a broader constellation of such kernels can be organically consolidated to enable effective data exploration with various combinations of access patterns. Specifically, the kernels are optimized for access patterns induced by (a) query-driven multivariate, spatio-temporal constraints, (b) precision-driven data analytics, (c) compression-driven data reduction, (d) multi-resolution data sampling, and (e) multi-file data partitioning and organization on a parallel file system. MLOC organizes these optimization kernels within a multi-level architecture, in which all the levels can be flexibly re-ordered by user-defined priorities. When tested on query-driven exploration of compressed data, MLOC demonstrates superior performance compared to any state-of-the-art scientific database management technologies.
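
The idea of levels that can be re-ordered by user-defined priorities can be illustrated with a small sketch. This is not MLOC's implementation, which the abstract does not describe. It is a toy model under stated assumptions: the level names (`variable`, `time_step`, `resolution`), the sample values, and the helper functions `layout_order` and `contiguous_runs` are all hypothetical. The sketch shows how the chosen priority order determines which chunks end up contiguous in storage, and therefore how many separate reads a given access pattern may need.

```python
from itertools import product

# Hypothetical levels and values, chosen only for illustration.
LEVELS = {
    "variable": ["temperature", "pressure"],
    "time_step": [0, 1],
    "resolution": ["coarse", "fine"],
}


def layout_order(levels, priority):
    """Return chunk keys in the order they would be laid out in storage.

    `priority` lists level names from outermost (varies slowest) to
    innermost (varies fastest), so chunks that share outer-level values
    are placed next to each other.
    """
    if sorted(priority) != sorted(levels):
        raise ValueError("priority must name every level exactly once")
    value_lists = [levels[name] for name in priority]
    return [dict(zip(priority, combo)) for combo in product(*value_lists)]


def contiguous_runs(order, **query):
    """Count contiguous runs of chunks matching `query`.

    Fewer runs means the matching chunks are stored closer together.
    """
    runs, in_run = 0, False
    for chunk in order:
        match = all(chunk[key] == value for key, value in query.items())
        if match and not in_run:
            runs += 1
        in_run = match
    return runs


if __name__ == "__main__":
    by_variable = layout_order(LEVELS, ["variable", "time_step", "resolution"])
    by_resolution = layout_order(LEVELS, ["resolution", "time_step", "variable"])
    for name, order in [("variable first", by_variable),
                        ("resolution first", by_resolution)]:
        print(name,
              "| runs for one variable:",
              contiguous_runs(order, variable="temperature"),
              "| runs for one resolution:",
              contiguous_runs(order, resolution="coarse"))
```

In this toy model, putting `variable` first keeps all chunks of one variable in a single contiguous run but scatters each resolution across the layout, and putting `resolution` first does the opposite. That trade-off is one reason a user-selectable ordering of levels can matter when access patterns are heterogeneous.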
