Journal: IEEE Transactions on Parallel and Distributed Systems

Parallel Processing of Dynamic Continuous Queries over Streaming Data Flows

Abstract

More and more real-time applications need to handle dynamic continuous queries over high-density streaming data. Conventional data and query indexing approaches generally do not apply because of excessive costs in either maintenance or space. To address these problems, this study first proposes a new indexing structure, the CKDB-tree, which fuses an adaptive cell with a KDB-tree. A cell-tree indexing approach that supports dynamic continuous queries has been developed on the basis of the CKDB-tree. The approach significantly reduces space costs and scales well with increasing data size. Toward a scalable solution for filtering massive streaming data, this study explores the feasibility of using contemporary general-purpose computing on graphics processing units (GPGPU). The CKDB-tree-based approach has been extended to operate on both the CPU (host) and the GPU (device). The GPGPU-aided approach performs query indexing on the host while performing streaming data filtering on the device in a massively parallel manner. The two heterogeneous tasks execute in parallel, and the latency of streaming data transfer between the host and the device is hidden. The experimental results indicate that (1) the CKDB-tree reduces the space cost by 60 percent on average compared to the cell-based indexing structure, (2) the CKDB-tree-based approach outperforms its traditional KDB-tree-based counterparts by 66, 75, and 79 percent on average for uniform, skewed, and hyper-skewed data, respectively, in terms of update costs, and (3) the GPGPU-aided approach greatly improves on the CKDB-tree-based approach with the support of only a single Kepler GPU, providing real-time filtering of streaming data at 2.5M data tuples per second. Massively parallel computing technology exhibits great potential in streaming data monitoring.
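
To make the setting concrete, the following is a minimal sketch of a plain cell-based (uniform-grid) query index, the kind of baseline structure the abstract compares against. It is not the CKDB-tree and not the authors' implementation: the class `GridQueryIndex`, its method names, the 2-D rectangle representation, and the fixed `cell_size` are all assumptions made for illustration. Each continuous range query is registered in every grid cell it overlaps, and each incoming tuple is filtered by checking only the queries in its own cell.

```python
from collections import defaultdict


class GridQueryIndex:
    """Hypothetical uniform-grid index over 2-D range queries (illustration only)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(set)  # (cx, cy) -> ids of queries overlapping the cell
        self.queries = {}              # query id -> (xmin, ymin, xmax, ymax)

    def _cell(self, x, y):
        # Map a point to the integer coordinates of its grid cell.
        return (int(x // self.cell_size), int(y // self.cell_size))

    def _covered_cells(self, rect):
        # Enumerate every cell that the query rectangle overlaps.
        xmin, ymin, xmax, ymax = rect
        cx0, cy0 = self._cell(xmin, ymin)
        cx1, cy1 = self._cell(xmax, ymax)
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                yield (cx, cy)

    def insert(self, qid, rect):
        self.queries[qid] = rect
        for cell in self._covered_cells(rect):
            self.cells[cell].add(qid)

    def remove(self, qid):
        rect = self.queries.pop(qid)
        for cell in self._covered_cells(rect):
            self.cells[cell].discard(qid)

    def update(self, qid, rect):
        # A dynamic query that moves is handled as a removal plus a re-insertion.
        self.remove(qid)
        self.insert(qid, rect)

    def match(self, x, y):
        # Filter one stream tuple: test only the queries registered in its cell.
        hits = []
        for qid in self.cells.get(self._cell(x, y), ()):
            xmin, ymin, xmax, ymax = self.queries[qid]
            if xmin <= x <= xmax and ymin <= y <= ymax:
                hits.append(qid)
        return sorted(hits)
```

In this sketch a large query is stored once per overlapped cell, which hints at why a purely cell-based structure can be costly in space and in updates for moving queries. The `match` call for each tuple is independent of every other tuple, which is the property that makes the filtering step a natural candidate for massively parallel execution.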
