Journal: Chinese Journal of Computers (《计算机学报》)

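A Two-Stage Data Placement and Task Scheduling Strategy for Optimizing Scientific Workflow Execution Performance in Cloud Environments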

Abstract

Scientific workflows are increasingly run in collaborative cloud environments, where execution across geo-distributed data centers typically involves large volumes of data transfer between centers. Exploiting data dependencies, we propose a relatedness-based two-stage data placement strategy together with a task scheduling strategy for efficient workflow execution. At workflow build-time, closely related datasets are placed in the same data center wherever possible, according to the data dependency graph. At runtime, each task is scheduled to the data center on which it depends most for data, and each newly generated dataset is placed in the data center with which it has the highest relatedness. Experimental results show that the strategy effectively reduces the volume of data transferred between data centers during workflow execution, which improves execution efficiency and lowers resource rental costs.
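The abstract does not spell out the algorithm, so the following is only a minimal sketch of the two-stage idea, not the authors' method. It assumes that the dependency between two datasets is the number of tasks that read both, that each data center has a fixed storage capacity, and that a simple greedy rule is good enough for illustration. All function names, dataset names, and data center names are hypothetical.

```python
from itertools import combinations

def dependency(tasks):
    """Assumed measure: number of tasks that read both datasets of a pair."""
    dep = {}
    for inputs in tasks.values():
        for pair in combinations(sorted(inputs), 2):
            dep[pair] = dep.get(pair, 0) + 1
    return dep

def best_center(name, size, dep, placement, free):
    """Pick the center with enough room whose datasets are most related to `name`."""
    def score(center):
        return sum(dep.get(tuple(sorted((name, other))), 0)
                   for other, where in placement.items() if where == center)
    candidates = [c for c in free if free[c] >= size]
    if not candidates:
        raise ValueError(f"no data center can hold {name}")
    # Ties on relatedness go to the center with more free space.
    return max(candidates, key=lambda c: (score(c), free[c]))

def place(name, size, tasks, placement, free):
    """Place one dataset and update placement and free space in place."""
    center = best_center(name, size, dependency(tasks), placement, free)
    placement[name] = center
    free[center] -= size
    return center

def place_build_time(sizes, tasks, capacity):
    """Stage 1 (build-time): greedily co-locate related datasets, largest first."""
    placement, free = {}, dict(capacity)
    for name in sorted(sizes, key=lambda n: (-sizes[n], n)):
        place(name, sizes[name], tasks, placement, free)
    return placement, free

def schedule(inputs, sizes, placement):
    """Stage 2 (runtime): run the task where most of its input data already lives.

    Returns the chosen center and the data volume that must be moved to it.
    """
    local = {}
    for name in inputs:
        local[placement[name]] = local.get(placement[name], 0) + sizes[name]
    center = max(sorted(local), key=lambda c: local[c])
    moved = sum(sizes[n] for n in inputs if placement[n] != center)
    return center, moved

# Hypothetical workflow: four existing datasets, one dataset "e" generated at runtime.
sizes = {"a": 4, "b": 3, "c": 2, "d": 1}
tasks = {"t1": {"a", "b"}, "t2": {"a", "b"}, "t3": {"c", "d"},
         "t4": {"b", "c"}, "t5": {"c", "d", "e"}}
placement, free = place_build_time(sizes, tasks, {"dc1": 8, "dc2": 8})
print(placement)                                # a and b share dc1; c and d share dc2
print(schedule(tasks["t4"], sizes, placement))  # t4 runs in dc1 and pulls c (size 2)

# Runtime placement of the newly generated dataset "e" (size 1), which t5 reads with c and d.
sizes["e"] = 1
print(place("e", 1, tasks, placement, free))    # dc2, where c and d already are
```

In this toy run, only task t4 needs cross-center transfer, because its two inputs were placed in different centers at build-time. A real strategy would also have to weigh dataset sizes in the dependency measure, bandwidth between centers, and compute capacity, none of which this sketch models.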
