
Enabling Large-Scale Mining Software Repositories (MSR) Studies Using Web-Scale Platforms.



Abstract

The Mining Software Repositories (MSR) field analyzes software data to uncover knowledge and assist software development. Software projects and products continue to grow in size and complexity. In-depth analysis of these large systems and their evolution is needed to better understand the characteristics of such large-scale systems and projects. However, classical software analysis platforms (e.g., Prolog-like, SQL-like, or specialized programming scripts) face many challenges when performing large-scale MSR studies. Such software platforms rarely scale easily out of the box. Instead, they often require analysis-specific, one-time, ad hoc scaling tricks and designs that are not reusable for other types of analysis and that are costly to maintain. We believe that the web community has faced many of the scaling challenges now facing the software engineering community, as it copes with the enormous growth of web data. In this thesis, we report on our experience in using MapReduce and Pig, two web-scale platforms, to perform large MSR studies. Through our case studies, we carefully demonstrate the benefits and challenges of using web platforms to prepare (i.e., Extract, Transform, and Load, or ETL) software data for further analysis. The results of our studies show that: (1) web-scale platforms provide an effective and efficient platform for large-scale MSR studies; (2) many of the web community's guidelines for using web-scale platforms must be modified to achieve optimal performance for large-scale MSR studies. This thesis will help other software engineering researchers who want to scale their studies.

Bibliographic record

  • Author: Shang, Weiyi
  • Author affiliation: Queen's University (Canada)
  • Degree-granting institution: Queen's University (Canada)
  • Discipline: Computer Science
  • Degree: M.Sc.
  • Year: 2010
  • Pages: 163 p.
  • Total pages: 163
  • Original format: PDF
  • Language: English (eng)
