Journal: Computers

Beyond Batch Processing: Towards Real-Time and Streaming Big Data


Abstract

Today, big data are generated from many sources, and there is a huge demand for storing, managing, processing, and querying them. The MapReduce model and its open source implementation, Hadoop, have proven themselves as the de facto solution for big data processing, and are inherently designed for batch, high-throughput jobs. Although Hadoop is well suited to batch jobs, there is an increasing demand for non-batch workloads such as interactive jobs, real-time queries, and big data streams. Since Hadoop is not suitable for these non-batch workloads, new solutions have been proposed to address these challenges. In this article, we discuss two categories of these solutions: real-time processing and stream processing of big data. For each category, we discuss its paradigms, strengths, and differences from Hadoop. We also introduce some practical systems and frameworks for each category. Finally, we report some simple experiments performed to demonstrate the effectiveness of the new solutions compared to available Hadoop-based solutions.
