A Tale of Two Data-Intensive Paradigms: Applications, Abstractions, and Architectures


Abstract

Scientific problems that depend on processing large amounts of data require overcoming challenges in multiple areas: managing large-scale data distribution, co-placement and scheduling of data with compute resources, and storing and transferring large volumes of data. We analyze the ecosystems of the two prominent paradigms for data-intensive applications, hereafter referred to as the high-performance computing and the Apache-Hadoop paradigm. We propose a basis, common terminology and functional factors upon which to analyze the two approaches of both paradigms. We discuss the concept of "Big Data Ogres" and their facets as means of understanding and characterizing the most common application workloads found across the two paradigms. We then discuss the salient features of the two paradigms, and compare and contrast the two approaches. Specifically, we examine common implementation/approaches of these paradigms, shed light upon the reasons for their current "architecture" and discuss some typical workloads that utilize them. In spite of the significant software distinctions, we believe there is architectural similarity. We discuss the potential integration of different implementations, across the different levels and components. Our comparison progresses from a fully qualitative examination of the two paradigms, to a semi-quantitative methodology. We use a simple and broadly used Ogre (K-means clustering), characterize its performance on a range of representative platforms, covering several implementations from both paradigms. Our experiments provide an insight into the relative strengths of the two paradigms. We propose that the set of Ogres will serve as a benchmark to evaluate the two paradigms along different dimensions.


Bibliographic details

Source
IEEE International Congress on Big Data | 2014 | pp. 645-652 | 8 pages
Authors
Jha Shantenu; Qiu Judy; Luckow Andre; Mantha Pradeep; Fox Geoffrey C.
Original format: PDF
Keywords
Big data; Computer architecture; Ecosystems; Processor scheduling; Runtime; Sparks; Yarn


Similar literature


1. Creating a portable, high-level graph analytics paradigm for compute and data-intensive applications [J]. Robert Searles, Stephen Herbein, Travis Johnston. International Journal of High Performance Computing and Networking. 2019, No. 1
2. An intelligent memory caching architecture for data-intensive multimedia applications [J]. Abbasi Aaqif Afzaal, Javed Sameen, Shamshirband Shahaboddin. Multimedia Tools and Applications. 2021, No. 11
3. Memristor based computation-in-memory architecture for data-intensive applications [J]. Anoop Malaviya. Computing Reviews. 2015, No. 12
4. A Tale of Two Data-Intensive Paradigms: Applications, Abstractions, and Architectures [C]. Jha Shantenu, Qiu Judy, Luckow Andre. IEEE International Congress on Big Data. 2014
5. Efficient PIM (Processor-In-Memory) architectures for data-intensive applications [D]. Kang, Jung-Yup. 2004
6. Hybrid Clouds for Data-Intensive 5G-Enabled IoT Applications: An Overview Key Issues and Relevant Architecture [O]. Panagiotis Trakadas, Nikolaos Nomikos, Emmanouel T. Michailidis. 2019
7. A Tale of Two Data-Intensive Paradigms: Applications, Abstractions, and Architectures [O]. Shantenu Jha, Judy Qiu, Andre Luckow. 2014
8. Software architecture for large scale, distributed, data-intensive systems [R]. Mattmann, Chris A., Medvidovic, Nenad, Ramirez, Paul M. 2004
