IEEE International Conference on Software Maintenance and Evolution

Adapting Neural Text Classification for Improved Software Categorization

Abstract

Software categorization is the task of organizing software into groups that broadly describe the behavior of the software, such as "editors" or "science." Categorization plays an important role in several maintenance tasks, such as repository navigation and feature elicitation. Current approaches attempt to cast the problem as text classification, to make use of the rich body of literature from the NLP domain. However, as we will show in this paper, these algorithms are generally not applicable off-the-shelf to source code; we found that they work well when high-level project descriptions are available, but suffer very large performance penalties when classifying source code and comments only. We propose a set of adaptations to a state-of-the-art neural classification algorithm and perform two evaluations: one with reference data from Debian end-user programs, and one with a set of C/C++ libraries that we hired professional programmers to annotate. We show that our proposed approach achieves performance exceeding that of previous software classification techniques as well as a state-of-the-art neural text classification technique.
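To make concrete what "casting the problem as text classification" can look like, here is a minimal sketch of a generic bag-of-words baseline. It is not the neural model or the adaptations the abstract proposes: it is a plain multinomial naive Bayes classifier, and the class name `NaiveBayesCategorizer`, the `tokenize` helper, and the toy project descriptions are all hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and keep alphabetic runs only (an assumption of this sketch;
    # real source code would also need identifier splitting).
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesCategorizer:
    """Multinomial naive Bayes with add-one smoothing.

    A generic text-classification baseline, not the paper's method.
    """

    def fit(self, documents, labels):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.label_counts = Counter(labels)      # label -> number of projects
        for text, label in zip(documents, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = {
            word for counts in self.word_counts.values() for word in counts
        }
        return self

    def predict(self, text):
        # Words never seen in training carry no evidence and are skipped.
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, float("-inf")
        for label, doc_count in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocabulary)
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(doc_count / total_docs)
            score += sum(math.log((counts[t] + 1) / denom) for t in tokens)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy descriptions labeled with the two example categories
# named in the abstract.
train_texts = [
    "text editor with syntax highlighting and buffer undo",
    "lightweight editor for editing plain text files",
    "numerical solver for matrix and differential equations",
    "scientific plotting and statistics for simulation data",
]
train_labels = ["editors", "editors", "science", "science"]

model = NaiveBayesCategorizer().fit(train_texts, train_labels)
print(model.predict("open the buffer and undo the last text edit"))
print(model.predict("matrix equations solver"))
```

A baseline like this depends entirely on descriptive vocabulary, which is consistent with the abstract's observation that off-the-shelf text classifiers do well on high-level project descriptions and degrade when only source code and comments are available.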