Conference: International frontiers in algorithmics workshop

A New Algorithm for Intermediate Dataset Storage in a Cloud-Based Dataflow



Abstract

Running a dataflow in a cloud environment usually generates many useful intermediate datasets. A strategy for running a dataflow decides which of these datasets are stored; the rest are regenerated when needed. The intermediate dataset storage (IDS) problem asks for a strategy that minimizes the total cost. The current best algorithm for linear-structure IDS takes O(n^4) time, where "linear-structure" means that the datasets in the dataflow form a pipeline. In this paper, we present a new algorithm for this problem that improves the time complexity to O(n^3), where n is the number of datasets in the pipeline.
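
To make the store-or-regenerate trade-off concrete, the following Python sketch solves a toy version of the linear-structure problem. The abstract does not give a cost model, so everything here is an assumption for illustration: each dataset has a hypothetical generation cost (gen), storage cost rate (store), and usage frequency (use); the pipeline input is always available; and a deleted dataset costs its usage frequency times the cost of regenerating it from the nearest stored predecessor. The dynamic program below is a plain textbook-style formulation written for clarity. It is not the algorithm from the paper and makes no attempt to reproduce its analysis.

```python
from itertools import product


def total_cost(gen, store, use, stored):
    """Cost of one strategy under the assumed model.

    stored[k] is True if dataset k is kept; otherwise it is regenerated
    from the nearest stored predecessor (or from the pipeline input).
    """
    total = 0
    regen = 0  # cost to rebuild the current dataset from the last stored one
    for g, s, u, keep in zip(gen, store, use, stored):
        regen += g
        if keep:
            total += s
            regen = 0  # successors can now start from this stored dataset
        else:
            total += u * regen
    return total


def best_strategy(gen, store, use):
    """Return (minimum cost, stored flags) for a pipeline of n datasets.

    Positions are 1-based inside the DP: position 0 is the always-available
    input and position n + 1 is a zero-cost sentinel that closes the pipeline.
    best[j] is the cheapest cost of positions 1..j given that j is stored.
    """
    n = len(gen)
    best = [float("inf")] * (n + 2)
    prev = [-1] * (n + 2)
    best[0] = 0
    for j in range(1, n + 2):
        for i in range(j):
            # Assume i and j are stored and everything strictly between is deleted.
            regen = 0
            gap = 0
            for k in range(i + 1, j):
                regen += gen[k - 1]
                gap += use[k - 1] * regen
            keep_cost = store[j - 1] if j <= n else 0
            cand = best[i] + gap + keep_cost
            if cand < best[j]:
                best[j], prev[j] = cand, i
    # Walk the predecessor links back from the sentinel to recover the choice.
    stored = [False] * n
    j = prev[n + 1]
    while j > 0:
        stored[j - 1] = True
        j = prev[j]
    return best[n + 1], stored


def brute_force(gen, store, use):
    """Exhaustive minimum over all 2^n strategies, for checking small inputs."""
    n = len(gen)
    return min(
        total_cost(gen, store, use, flags)
        for flags in product([False, True], repeat=n)
    )
```

On small inputs, best_strategy(gen, store, use) can be cross-checked against brute_force(gen, store, use), and the returned flags can be passed to total_cost to confirm that they reproduce the reported minimum.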
