Planning Your SQL-on-Hadoop Deployment Using a Low-Cost Simulation-Based Approach

Abstract

The term "SQL-on-Hadoop" has recently gained significant traction [19]. Impala represents an emerging class of SQL-on-Hadoop systems that exploit a shared-nothing parallel database architecture over Hadoop. Impala was designed to close the gap in near-real-time data analytics on the Hadoop stack, and it has been shown to be significantly more efficient than other SQL-on-Hadoop solutions [13]. However, leveraging Impala to handle queries with different business demands is not a trivial task [12], and an improperly deployed Impala cluster may not deliver the expected performance. In this paper, we propose a novel Impala simulation framework to help IT professionals understand Impala's performance behavior, which simplifies the deployment planning required to enable big data analytics on SQL-on-Hadoop systems. The Impala simulator models the behavior of the complete software stack and simulates the activities of cluster components such as storage, network, processors, and memory. Moreover, the accuracy of the simulation remains high in response to both software configuration and hardware changes, and the simulation reflects the expected scaling trend with low cost overhead and fast simulation speed. The simulator has been validated against various software and hardware configurations using the well-known TPC-DS benchmark [15], and the simulation results are valid and as expected. A use case shows how one would use the simulator to solve performance and deployment issues.

Bibliographic details

Source: IEEE International Symposium on Computer Architecture and High Performance Computing, 2016, pp. 182-189 (8 pages)
Authors: Jun Liu; Bianny Bian; Samantika Subramaniam Sury
Original format: PDF
Keywords: Software; Hardware; Computer architecture; Metadata; Planning; Servers

Similar literature

1. A risk-averse simulation-based approach for a joint optimization of workforce capacity, spare part stocks and scheduling priorities in maintenance planning [J]. Turan Hasan Huseyin, Atmis Mahir, Kosanoglu Fuat. Reliability Engineering & System Safety, 2020, issue Deca.
2. A simulation-based approach for plant layout design and production planning [J]. Zhang Zhinan, Wang Xin, Wang Xiaohan. Journal of Ambient Intelligence and Humanized Computing, 2019, issue 3.
3. An optimal concurrent product design and service planning approach through simulation-based evaluation considering the whole product life-cycle span [J]. Liu Hang, Chu Xuening, Xue Deyi. Computers in Industry, 2019.
4. Planning Your SQL-on-Hadoop Deployment Using a Low-Cost Simulation-Based Approach [C]. Jun Liu, Bianny Bian, Samantika Subramaniam Sury. International Symposium on Computer Architecture and High Performance Computing, 2016.
5. Simulation-based optimization for integrated electric utility resource planning and deployment [D]. Saenz, Juan P. 2013.
6. Skills Transfer to Sinus Surgery via a Low-Cost Simulation-Based Curriculum [O]. R. Alex Harbison, Jennifer Dunlap, Ian M. Humphreys.
7. Initial Deployment of a Robotic Team - A Hierarchical Approach Under Communication Constraints Verified on Low-Cost Platforms [O]. Micael S. Couceiro, Carlos M. Figueiredo. 2013.
