IEEE International Symposium on Software Reliability Engineering

Switching to Git: The Good, the Bad, and the Ugly

Abstract

Since its introduction 10 years ago, GIT has taken the world of version control systems (VCS) by storm. Its success is partly due to the new usage patterns it makes possible, which empower developers to work more efficiently. However, the resulting changes in both user behavior and the way GIT stores changes affect data mining and data analytics procedures [6], [13]. While some of these unique characteristics can be managed by adjusting mining and analytical techniques, others can lead to severe data loss and the inability to audit code changes, e.g., knowing the full history of changes to code related to security and privacy functionality. Thus, switching to GIT poses challenges for established development process analytics. This paper is based on our experience in attempting to provide continuous process analysis for Microsoft product teams that are switching to GIT as their primary VCS. We illustrate how GIT's concepts and usage patterns create a need to change well-established data analytics processes. The goal of this paper is to raise awareness of how certain GIT operations may damage or even destroy information about historical code changes that is necessary for continuous data development process analytics. To that end, we provide a list of common GIT usage patterns, with a description of how these operations impact data mining applications. Finally, we provide examples of how one may counteract the effects of such destructive operations in the future. We further provide a new algorithm for detecting integration paths that is specific to distributed version control systems like GIT and that allows us to reconstruct the information crucial to most development process analytics.