Conference: Program Comprehension, 2009. ICPC '09

An empirical exploration of regularities in open-source software lexicons



Abstract

The software lexicon is an important source of information during program comprehension activities and has been the focus of several recent case studies. Identifiers and comments, which constitute the lexicon of a software system, encode domain concepts and design decisions made by programmers. This paper presents an exploratory study that investigates regularities in the software lexicons of open-source projects by analyzing the distributions of tokens in diverse software artifacts. The study examined the source code of 142 systems from different domains, written in 12 different programming languages, as well as bug reports and external documentation. We find that the distributions of lexical tokens in the studied artifacts follow the Zipf-Mandelbrot law, an empirical law from statistical natural language processing. Furthermore, the study reveals that the Zipf-Mandelbrot law is not confined to program lexicons in object-oriented languages, as shown in previous studies, but also emerges in source code written in procedural, functional, and markup languages, as well as in other software artifacts. Our study also indicates that a previously devised software science equation does not hold for describing the relationship between program vocabulary and program length, and that more studies are necessary.
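To make the central claim concrete: the Zipf-Mandelbrot law says that the frequency f of the token at rank r is approximately C / (r + b)^a for constants a, b and C. The following is a minimal sketch of how one might check this on a token stream. It is not the procedure used in the paper. The tokenizer, the function names `rank_frequencies` and `fit_zipf_mandelbrot`, and the coarse grid of candidate `b` values are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def rank_frequencies(text):
    """Return token frequencies sorted from most to least frequent."""
    # Assumption: a token is a run of ASCII letters. Splitting of
    # camelCase or snake_case identifiers is deliberately omitted.
    tokens = re.findall(r"[A-Za-z]+", text.lower())
    return sorted(Counter(tokens).values(), reverse=True)


def fit_zipf_mandelbrot(freqs, b_grid=(0.0, 0.5, 1.0, 2.0, 5.0, 10.0)):
    """Fit f(r) = C / (r + b)**a by least squares in log-log space.

    For a fixed b, log f = log C - a * log(r + b) is a straight line,
    so a and C follow from ordinary linear regression. The candidate b
    with the smallest squared error is kept. Needs at least two ranks.
    """
    ys = [math.log(f) for f in freqs]
    n = len(ys)
    mean_y = sum(ys) / n
    best = None
    for b in b_grid:
        xs = [math.log(rank + b) for rank in range(1, n + 1)]
        mean_x = sum(xs) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        slope = sxy / sxx
        intercept = mean_y - slope * mean_x
        sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
        if best is None or sse < best[0]:
            best = (sse, -slope, b, math.exp(intercept))
    _, a, b, c = best
    return a, b, c
```

A typical use would be `a, b, c = fit_zipf_mandelbrot(rank_frequencies(source_text))`, where `source_text` is the concatenated identifiers and comments of a project. A grid search over `b` keeps the sketch dependency-free; a real analysis would more likely use a nonlinear optimizer and a goodness-of-fit test.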
