Venue: IEEE International Conference on Software Maintenance and Evolution

Using Observed Behavior to Reformulate Queries during Text Retrieval-based Bug Localization


Abstract

Text Retrieval (TR)-based approaches for bug localization rely on formulating an initial query based on a bug report. Often, the query does not return the buggy software artifacts at or near the top of the list (i.e., it is a low-quality query). In such cases, the query needs reformulation. Existing research on supporting developers in the reformulation of queries focuses mostly on leveraging relevance feedback from the user or expanding the original query with additional information (e.g., adding synonyms). In many cases, the problem with such low-quality queries is the presence of irrelevant terms (i.e., noise), and previous research has shown that removing such terms from the queries leads to substantial improvement in code retrieval. Unfortunately, the current state of research lacks methods to identify the irrelevant terms. Our research aims at addressing this problem, and our conjecture is that reducing a low-quality query to only the terms describing the Observed Behavior (OB) can improve TR-based bug localization. To verify our conjecture, we conducted an empirical study using bug data from 21 open source systems to reformulate 451 low-quality queries. We compare the accuracy achieved by four TR-based bug localization approaches at three code granularities (i.e., files, classes, and methods) when using the complete bug reports as queries versus a reduced version corresponding to the OB only. The results show that the reformulated queries improve TR-based bug localization for all approaches by 147.4% and 116.6% on average, in terms of MRR and MAP, respectively. We conclude that using the OB descriptions is a simple and effective technique to reformulate low-quality queries during TR-based bug localization.
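
The abstract reports its improvements in terms of MRR (Mean Reciprocal Rank) and MAP (Mean Average Precision). As a minimal sketch of how these two metrics are conventionally computed over ranked retrieval results, the Python below uses the standard textbook definitions; it is not the authors' evaluation code, and the function names, file names, and rankings are hypothetical toy values rather than data from the study.

```python
from typing import List, Sequence, Set, Tuple

# One "run" pairs a ranked list of retrieved artifacts with the set of
# artifacts known to be buggy for that query.
Run = Tuple[Sequence[str], Set[str]]


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Return 1/rank of the first relevant artifact, or 0.0 if none is retrieved."""
    for position, artifact in enumerate(ranked, start=1):
        if artifact in relevant:
            return 1.0 / position
    return 0.0


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average the precision values measured at each relevant artifact's rank."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for position, artifact in enumerate(ranked, start=1):
        if artifact in relevant:
            hits += 1
            precision_sum += hits / position
    return precision_sum / len(relevant)


def mean_reciprocal_rank(runs: List[Run]) -> float:
    """MRR: mean of the reciprocal ranks over all queries."""
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def mean_average_precision(runs: List[Run]) -> float:
    """MAP: mean of the average precision over all queries."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical toy data: the same bug queried twice, once with the full
# report and once with an OB-only query. Names and rankings are invented.
buggy = {"Cache.java"}
full_report_run: Run = (["Ui.java", "Parser.java", "Cache.java"], buggy)
ob_only_run: Run = (["Cache.java", "Ui.java", "Parser.java"], buggy)

print("full report RR:", reciprocal_rank(*full_report_run))
print("OB-only RR:", reciprocal_rank(*ob_only_run))
```

In this toy case the buggy file moves from rank 3 to rank 1, so its reciprocal rank rises; averaging such per-query values over many queries yields the MRR and MAP figures of the kind the abstract compares.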
