Venue: IEEE International Conference on Software Maintenance and Evolution

Adapting Neural Text Classification for Improved Software Categorization

Abstract

Software categorization is the task of organizing software into groups that broadly describe the behavior of the software, such as "editors" or "science." Categorization plays an important role in several maintenance tasks, such as repository navigation and feature elicitation. Current approaches attempt to cast the problem as text classification, to make use of the rich body of literature from the NLP domain. However, as we will show in this paper, text classification algorithms are generally not applicable off-the-shelf to source code; we found that they work well when high-level project descriptions are available, but suffer very large performance penalties when classifying source code and comments only. We propose a set of adaptations to a state-of-the-art neural classification algorithm and perform two evaluations: one with reference data from Debian end-user programs, and one with a set of C/C++ libraries that we hired professional programmers to annotate. We show that our proposed approach achieves performance exceeding that of previous software classification techniques as well as a state-of-the-art neural text classification technique.
