Source journal: Ecological Informatics: an international journal on ecoinformatics and computational ecology

Assessing the efficiency of clustering algorithms and goodness-of-fit measures using phytoplankton field data

Abstract

Investigation of patterns in beta diversity has received increased attention in recent years, particularly in light of new ecological theories such as the metapopulation paradigm and metacommunity theory. Traditionally, beta diversity patterns have been described by cluster analysis (i.e. dendrograms), which enables the classification of samples. Clustering algorithms define the structure of dendrograms; consequently, assessing their performance is crucial. A common, although not always appropriate, approach for assessing algorithm suitability is the cophenetic correlation coefficient, c. Alternatively, the 2-norm has recently been proposed as an increasingly informative method for evaluating the distortion engendered by clustering algorithms. In the present work, the 2-norm is applied for the first time to field data and is compared with the cophenetic correlation coefficient using a set of 105 pairwise combinations of 7 clustering methods (e.g. UPGMA) and 15 (dis)similarity/distance indices (e.g. Jaccard index). In contrast to the 2-norm, the cophenetic correlation coefficient does not provide a clear indication of the efficiency of the clustering algorithms for all combinations. The two approaches did not always agree on the choice of the most faithful algorithm. Additionally, the 2-norm revealed that UPGMA is the most efficient clustering algorithm and Ward's the least. The present results suggest that goodness-of-fit measures such as the 2-norm should be applied prior to clustering analyses to obtain reliable beta diversity measures.
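The comparison described in the abstract can be illustrated with a minimal Python sketch using SciPy. It is not the authors' code. The presence/absence matrix is random placeholder data, the helper name `fit_measures` is hypothetical, and "2-norm" is assumed here to mean the Euclidean norm of the differences between the original pairwise dissimilarities and the cophenetic (dendrogram-implied) distances; the abstract itself does not give the definition.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, linkage
from scipy.spatial.distance import pdist


def fit_measures(presence, metric="jaccard", method="average"):
    """Return (cophenetic correlation c, assumed 2-norm) for one combination.

    `presence` is a samples x taxa boolean matrix. `method="average"` is
    SciPy's name for UPGMA.
    """
    d = pdist(presence, metric=metric)   # condensed original dissimilarities
    tree = linkage(d, method=method)     # hierarchical clustering
    c, coph = cophenet(tree, d)          # c and condensed cophenetic distances
    # Assumption: the 2-norm is the Euclidean norm of the distortion vector.
    two_norm = float(np.linalg.norm(d - coph))
    return float(c), two_norm


# Placeholder data: 12 samples x 30 taxa, random presence/absence.
rng = np.random.default_rng(0)
samples = rng.random((12, 30)) < 0.4

results = {}
for method in ("average", "single", "complete"):
    results[method] = fit_measures(samples, metric="jaccard", method=method)
    c, two_norm = results[method]
    print(f"{method:>8}: c = {c:.3f}, 2-norm = {two_norm:.3f}")
```

Under this reading, a higher c and a lower 2-norm both indicate a dendrogram that distorts the original dissimilarities less. The study compares 7 clustering methods and 15 indices; the loop above covers only three methods and one index for brevity.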