A Middleware for Managing Big-Data Flows

Abstract

Hadoop is used for a wide range of applications over many kinds of data, which makes developing and managing data flows over Hadoop MapReduce a complex task. Scripting languages such as Hive, Pig, and Jaql have been developed to hide the complexity of MapReduce applications from the user. However, scripts in these high-level query languages can themselves grow complex over time, and developing, debugging, and maintaining them is non-trivial even for a proficient user. This paper presents a middleware for developing and maintaining MapReduce data flows. The middleware offers a user-friendly way to Extract data from diverse data sources, Load it into a distributed file system, and Transform it into a format that subsequent systems can easily analyze. MetaOperators are the backbone of our middleware. Using MetaOperators, one can express a data flow by specifying only the relevant inputs, without worrying about the data schema or the query syntax. A data flow written with MetaOperators localizes the schema-specific parts of the query to the MetaOperator parameters, making the flow easier to develop, debug, and maintain. Using these MetaOperators, we show how one can express operations over hierarchical and flat data in a similar manner, track the data schema as it flows through the operators, and add a drag-and-drop GUI layer on top of this framework. This brings MapReduce application development into the realm of middle management.
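The abstract does not show the MetaOperator interface, so the following is only a minimal sketch of the general idea: each operator receives its schema-specific details as parameters, and the flow records the schema after every step. All names here (`Filter`, `Project`, `run_flow`, the column names) are hypothetical and are not taken from the paper, and the rows are processed in memory rather than on MapReduce.

```python
from typing import Dict, List, Tuple

Schema = Dict[str, str]    # column name -> type name
Row = Dict[str, object]


class Filter:
    """Hypothetical operator: keep rows where `column` equals `equals`."""

    def __init__(self, column: str, equals: object) -> None:
        self.column = column
        self.equals = equals

    def out_schema(self, schema: Schema) -> Schema:
        # Filtering does not change the schema, but the column must exist.
        if self.column not in schema:
            raise KeyError(f"unknown column: {self.column}")
        return dict(schema)

    def apply(self, rows: List[Row]) -> List[Row]:
        return [r for r in rows if r[self.column] == self.equals]


class Project:
    """Hypothetical operator: keep only the listed columns."""

    def __init__(self, columns: List[str]) -> None:
        self.columns = columns

    def out_schema(self, schema: Schema) -> Schema:
        missing = [c for c in self.columns if c not in schema]
        if missing:
            raise KeyError(f"unknown columns: {missing}")
        return {c: schema[c] for c in self.columns}

    def apply(self, rows: List[Row]) -> List[Row]:
        return [{c: r[c] for c in self.columns} for r in rows]


def run_flow(schema: Schema, rows: List[Row], operators: list
             ) -> Tuple[List[Row], List[Tuple[str, Schema]]]:
    """Run operators in order and record the schema after each one."""
    trace = [("input", dict(schema))]
    for op in operators:
        schema = op.out_schema(schema)   # schema errors surface before data runs
        rows = op.apply(rows)
        trace.append((type(op).__name__, schema))
    return rows, trace


if __name__ == "__main__":
    schema = {"user": "string", "country": "string", "clicks": "int"}
    rows = [
        {"user": "a", "country": "IN", "clicks": 3},
        {"user": "b", "country": "US", "clicks": 5},
    ]
    result, trace = run_flow(
        schema, rows, [Filter("country", "IN"), Project(["user", "clicks"])]
    )
    print(result)
    for step, step_schema in trace:
        print(step, step_schema)
```

In this sketch, a change to the input schema only touches the constructor arguments of the affected operators, which is one way to read the abstract's claim that schema-specific parts are localized to operator parameters.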