Journal: Psychological Methods

Using Anticlustering to Partition Data Sets Into Equivalent Parts


Abstract

Numerous applications in psychological research require that a pool of elements be partitioned into multiple parts. While many applications seek groups that are well-separated, that is, dissimilar from each other, others require the different groups to be as similar as possible. Examples include the assignment of students to parallel courses, assembling stimulus sets in experimental psychology, splitting achievement tests into parts of equal difficulty, and dividing a data set for cross-validation. We present anticlust, an easy-to-use and free software package for solving these problems fast and in an automated manner. The package anticlust is an open source extension to the R programming language and implements the methodology of anticlustering. Anticlustering divides elements into similar parts, ensuring similarity between groups by enforcing heterogeneity within groups. Thus, anticlustering is the direct reversal of cluster analysis, which aims to maximize homogeneity within groups and dissimilarity between groups. Our package anticlust implements 2 anticlustering criteria, reversing the clustering methods k-means and cluster editing, respectively. In a simulation study, we show that anticlustering returns excellent results and outperforms alternative approaches like random assignment and matching. In 3 example applications, we illustrate how to apply anticlust to real data sets. We demonstrate how to assign experimental stimuli to equivalent sets based on norming data, how to divide a large data set for cross-validation, and how to split a test into parts of equal item difficulty and discrimination.
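
The idea of reversing k-means can be illustrated with a small sketch. The Python code below is not the anticlust package and does not reproduce its implementation; it is a minimal illustration under stated assumptions. The function names `variance_objective` and `anticluster` are hypothetical, and the pairwise exchange search is an assumed heuristic chosen for brevity. The sketch keeps group sizes equal and swaps elements between groups whenever the swap increases the within-group sum of squared distances to the group centroid, which is the quantity k-means would minimize.

```python
import random


def variance_objective(points, labels, k):
    """Within-group sum of squared distances to each group centroid.

    k-means minimizes this value; the sketch below maximizes it instead.
    """
    dims = len(points[0])
    sums = [[0.0] * dims for _ in range(k)]
    counts = [0] * k
    for p, g in zip(points, labels):
        counts[g] += 1
        for d in range(dims):
            sums[g][d] += p[d]
    centroids = [[s / counts[g] for s in sums[g]] for g in range(k)]
    return sum(
        sum((p[d] - centroids[g][d]) ** 2 for d in range(dims))
        for p, g in zip(points, labels)
    )


def anticluster(points, k, seed=0):
    """Hypothetical helper: exchange heuristic for the reversed k-means criterion.

    Assumes len(points) >= k. Starts from a random partition with balanced
    group sizes and keeps any pairwise swap that raises the objective.
    """
    rng = random.Random(seed)
    n = len(points)
    labels = [i % k for i in range(n)]  # balanced group sizes
    rng.shuffle(labels)
    best = variance_objective(points, labels, k)
    improved = True
    while improved:
        improved = False
        for i in range(n):
            for j in range(i + 1, n):
                if labels[i] == labels[j]:
                    continue  # swapping within one group changes nothing
                labels[i], labels[j] = labels[j], labels[i]
                candidate = variance_objective(points, labels, k)
                if candidate > best + 1e-12:
                    best = candidate
                    improved = True
                else:
                    # undo the swap if it did not help
                    labels[i], labels[j] = labels[j], labels[i]
    return labels, best


if __name__ == "__main__":
    # Four one-dimensional elements split into two groups of two.
    data = [(1.0,), (2.0,), (3.0,), (4.0,)]
    groups, value = anticluster(data, k=2)
    print(groups, value)
```

For the four values 1, 2, 3, 4 split into two groups, the partition {1, 4} and {2, 3} has the largest within-group spread, and both groups share the mean 2.5. This is a small instance of the abstract's point that heterogeneity within groups yields similarity between groups. A local search of this kind is not guaranteed to find the best partition on larger data sets.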
