Conference: Iberian Conference on Information Systems and Technologies

Predicting master's applicants performance using KDD techniques


Abstract

The selection process for master's applicants has serious implications for the university. Students are expected to contribute to science through publications in conferences and journals, which raises the institution's standing. However, predicting an applicant's performance demands experience and time from advisors, who must choose, from a group of projects, the one that is best for both sides. This procedure can lead to wrong choices. To improve the selection of master's students, this paper analyzes the academic characteristics of applicants to master's programs in order to identify a set of features that may help predict an applicant's scientific production. Through the Knowledge Discovery in Databases (KDD) process, a set of correlation rules between features and the applicant's expected performance was obtained. Data were collected from the Lattes Platform (a Brazilian platform for scientific curricula). After pre-processing, the data were optimized using Partitioning Around Medoids (PAM) with the Gower distance. Experiments were conducted by training seven classification algorithms under two approaches. The first was a naive approach that kept the analysis as simple as possible; the second developed an optimized data mining (DM) process, in which the clustering step was used to create a new main class for the classifiers. The results show that the optimized approach achieved an accuracy of around 85% using Random Forest, while the naive approach reached approximately 44%. The PAM algorithm made it possible to determine the best attributes and highlight the main rules. The conclusion is that the category of the university and the applicant's previous production were the most relevant attributes, which indicates that it is possible to predict the productivity of master's degree students.
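The clustering step mentioned in the abstract, PAM with the Gower distance, can be illustrated with a minimal sketch. This is not the authors' implementation: the sample rows, the column layout (university category, prior publications, years since graduation), and the function names `gower_matrix`, `total_cost`, and `pam` are all assumptions made for illustration, and the SWAP phase accepts the first improving swap rather than the best one. The resulting cluster labels are the kind of value that could serve as a new class attribute for a classifier.

```python
from itertools import product

def gower_matrix(rows, numeric_cols):
    """Pairwise Gower distances for mixed-type rows (equal-length lists)."""
    n, p = len(rows), len(rows[0])
    # Range of each numeric column, used to scale differences into [0, 1].
    ranges = {}
    for j in numeric_cols:
        vals = [r[j] for r in rows]
        ranges[j] = (max(vals) - min(vals)) or 1.0
    dist = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            total = 0.0
            for j in range(p):
                if j in ranges:   # numeric: range-scaled absolute difference
                    total += abs(rows[a][j] - rows[b][j]) / ranges[j]
                else:             # categorical: 0 if equal, 1 otherwise
                    total += 0.0 if rows[a][j] == rows[b][j] else 1.0
            dist[a][b] = dist[b][a] = total / p
    return dist

def total_cost(dist, medoids):
    """Sum over all points of the distance to the nearest medoid."""
    return sum(min(row[m] for m in medoids) for row in dist)

def pam(dist, k):
    """Simplified PAM: greedy BUILD, then first-improvement SWAP."""
    n = len(dist)
    medoids = []
    # BUILD: repeatedly add the point that lowers the total cost the most.
    for _ in range(k):
        best = min((i for i in range(n) if i not in medoids),
                   key=lambda i: total_cost(dist, medoids + [i]))
        medoids.append(best)
    # SWAP: exchange a medoid with a non-medoid while the cost decreases.
    improved = True
    while improved:
        improved = False
        current = total_cost(dist, medoids)
        for pos, cand in product(range(k), range(n)):
            if cand in medoids:
                continue
            trial = medoids[:pos] + [cand] + medoids[pos + 1:]
            cost = total_cost(dist, trial)
            if cost < current - 1e-12:
                medoids, current, improved = trial, cost, True
    labels = [min(range(k), key=lambda c: dist[i][medoids[c]])
              for i in range(n)]
    return medoids, labels

# Hypothetical applicants: [university category, prior publications,
# years since graduation]. These values are invented for illustration.
applicants = [
    ["public", 5, 1], ["public", 4, 2], ["public", 6, 1],
    ["private", 0, 6], ["private", 1, 5], ["private", 0, 7],
]
distances = gower_matrix(applicants, numeric_cols={1, 2})
medoids, labels = pam(distances, k=2)
print("medoid rows:", medoids)
print("cluster label per applicant:", labels)
```

The Gower distance is used here because the rows mix categorical and numeric attributes: each attribute contributes a value between 0 and 1, so no single column dominates the average.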
