Evaluation of data analytics based clustering algorithms for knowledge mining in a student engagement data

Oladipupo O. O.; Olugbara O. O.

Abstract

Applying data analytics based algorithms to knowledge mining in a student dataset is an important strategy for improving learning outcomes and student success and for supporting strategic decision making in higher education institutions. However, widely used data analytics based clustering algorithms are highly data dependent, so it is pertinent to identify the most effective algorithm for knowledge mining in a dataset associated with student engagement. This study evaluates the performance of five well-known clustering algorithms for this purpose: k-means, hierarchical clustering, fuzzy c-means, density-based spatial clustering of applications with noise (DBSCAN) and expectation-maximization. The k-means algorithm was benchmarked with 22 distance functions using three internal validity metrics: the Silhouette index, Dunn's index and partition entropy. The hierarchical clustering algorithm was benchmarked with the cophenetic correlation coefficient, computed for different combinations of distance and linkage functions. The fuzzy c-means algorithm was benchmarked with the partition entropy, partition coefficient, Silhouette index and modified partition coefficient. The k-nearest neighbor algorithm was applied to determine the optimum epsilon value for DBSCAN. The default parameter settings were used for the expectation-maximization algorithm. The overall ranking of the clustering algorithms was based on cluster potentiality, using median deviation statistics. The results show that the k-means algorithm has the highest cluster potentiality, demonstrating its effectiveness for knowledge mining in a student engagement dataset.
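
The fuzzy validity indices named in the abstract can be illustrated with a minimal sketch. It assumes the standard textbook definitions of the partition coefficient, partition entropy and modified partition coefficient; the abstract does not give the formulas the authors used, so the function names and the toy membership matrices below are hypothetical and are not taken from the paper.

```python
import math

# A membership matrix u is a list of rows, one per data point.
# Each row holds that point's membership in each of the c clusters
# and is assumed to sum to 1 (the usual fuzzy c-means constraint).

def partition_coefficient(u):
    """Mean of the squared memberships; 1.0 for a crisp partition, 1/c for a uniform one."""
    n = len(u)
    return sum(m * m for row in u for m in row) / n

def partition_entropy(u):
    """Mean entropy of the memberships; 0.0 for a crisp partition, ln(c) for a uniform one."""
    n = len(u)
    # 0 * log(0) is treated as 0, so zero memberships are skipped.
    return -sum(m * math.log(m) for row in u for m in row if m > 0) / n

def modified_partition_coefficient(u):
    """Partition coefficient rescaled to the range [0, 1]; requires c >= 2 clusters."""
    c = len(u[0])
    pc = partition_coefficient(u)
    return 1.0 - (c / (c - 1)) * (1.0 - pc)

# Hypothetical toy data: three points, two clusters.
crisp = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]    # every point fully in one cluster
uniform = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]  # maximally ambiguous memberships

for name, u in (("crisp", crisp), ("uniform", uniform)):
    print(name,
          partition_coefficient(u),
          partition_entropy(u),
          modified_partition_coefficient(u))
```

Under these definitions, a higher partition coefficient and a lower partition entropy both indicate a less ambiguous fuzzy partition. The modified coefficient removes the dependence of the lower bound on the number of clusters, which makes values comparable across different choices of c.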

Bibliographic details

Source: Intelligent data analysis, 2019, No. 5, pp. 1055-1071 (17 pages)
Authors: Oladipupo O. O.; Olugbara O. O.
Author affiliation (listed for both authors): Durban Univ Technol ICT & Soc Res Grp POB 1334 ZA-4000 Durban South Africa
Indexed in: Science Citation Index (SCI), USA
Original format: PDF
Language of text: English
Keywords: Algorithm evaluation; data analytics; data clustering; knowledge mining; student engagement
