A New Algorithm for Intermediate Dataset Storage in a Cloud-Based Dataflow

Abstract

Running a dataflow in a cloud environment usually generates many useful intermediate datasets. A strategy for running a dataflow decides which datasets are stored; the remaining datasets are regenerated. The intermediate dataset storage (IDS) problem asks for a strategy that minimizes the total cost. The current best algorithm for linear-structure IDS, in which the datasets of the dataflow form a pipeline, takes O(n^4) time. In this paper, we present a new algorithm for this problem that improves the time complexity to O(n^3), where n is the number of datasets in the pipeline.
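To make the problem statement concrete, here is a minimal Python sketch of a straightforward dynamic program for the pipeline case. It is not the algorithm of the paper. The abstract does not give the cost model, so the sketch assumes one: each dataset has a generation cost, a storage cost and a usage frequency, and a dataset that is not stored is rebuilt on every use from the nearest stored dataset upstream. The function name `plan_storage` and the parameters `gen`, `store` and `freq` are hypothetical.

```python
from typing import List, Tuple


def plan_storage(gen: List[float], store: List[float],
                 freq: List[float]) -> Tuple[float, List[int]]:
    """Choose which datasets of a pipeline to store (assumed cost model).

    gen[k]   : cost of generating dataset k from dataset k-1 (assumed)
    store[k] : cost of keeping dataset k in storage (assumed)
    freq[k]  : how often dataset k is used (assumed)
    Returns (minimum total cost, sorted indices of stored datasets).
    """
    n = len(gen)
    inf = float("inf")
    # Position 0 is the always-available pipeline input, positions 1..n are
    # the datasets, and position n+1 is a free sentinel that closes the chain.
    best = [inf] * (n + 2)   # best[j]: min cost of 1..j with j stored
    prev = [-1] * (n + 2)    # previous stored position on the best path
    best[0] = 0.0
    for j in range(1, n + 2):
        keep_cost = store[j - 1] if j <= n else 0.0
        for i in range(j):
            # Datasets strictly between i and j are not stored: each use
            # regenerates the chain starting right after position i.
            chain = 0.0
            regen = 0.0
            for k in range(i + 1, j):
                chain += gen[k - 1]
                regen += freq[k - 1] * chain
            candidate = best[i] + regen + keep_cost
            if candidate < best[j]:
                best[j] = candidate
                prev[j] = i
    # Walk back from the sentinel to recover the stored datasets.
    stored = []
    j = prev[n + 1]
    while j > 0:
        stored.append(j - 1)
        j = prev[j]
    stored.reverse()
    return best[n + 1], stored


if __name__ == "__main__":
    cost, stored = plan_storage([10, 1, 10], [1, 5, 1], [1, 1, 1])
    print(cost, stored)
```

In the usage example, the middle dataset is cheap to regenerate and expensive to store, so under the assumed model the sketch stores only the first and last datasets.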

Bibliographic details

Source
International Frontiers in Algorithmics Workshop | 2015 | pp. 33-44 | 12 pages
Authors
Jie Cheng; Daming Zhu; Binhai Zhu
Original format: PDF

Similar literature

1. Risk-aware intermediate dataset backup strategy in cloud-based data intensive workflows [J]. Mingzhong Wang, Liehuang Zhu, Zijian Zhang. Future Generation Computer Systems, 2016, issue FEBa.
2. On-demand minimum cost benchmarking for intermediate dataset storage in scientific cloud workflow systems [J]. Dong Yuan, Yun Yang, Xiao Liu. Journal of Parallel and Distributed Computing, 2011, issue 2.
3. Privacy preserving of intermediate dataset using hybridisation of oppositional gravitational search algorithm and elliptic curve cryptography [J]. S. Saravanan, V. Venkatachalam. International Journal of Business Information Systems, 2019, issue 2.
4. A New Algorithm for Intermediate Dataset Storage in a Cloud-Based Dataflow [C]. Jie Cheng, Daming Zhu, Binhai Zhu. International Frontiers of Algorithmics Workshop, 2015.
5. Cloud-Based Analysis and Integration of Proteomics and Metabolomics Datasets [D]. Choi, Jeong Ho Howard. 2019.
6. Acceleration of Image Segmentation Algorithm for (Breast) Mammogram Images Using High-Performance Reconfigurable Dataflow Computers [O]. Ivan L. Milankovic, Nikola V. Mijailovic, Nenad D. Filipovic. 2017.
7. Network Flow with Intermediate Storage: Models and Algorithms [O]. Urmila Pyakurel, Stephan Dempe. 2020.
