Journal: Soft computing: A fusion of foundations, methodologies and applications

Quasi-cluster centers clustering algorithm based on potential entropy and t-distributed stochastic neighbor embedding


Abstract

A novel density-based clustering algorithm named QCC was presented recently. Although the algorithm has proved strongly robust, its two input parameters, the number of neighbors (k) and the similarity threshold value (), must still be set manually, which severely limits its adoption. In addition, QCC does not perform well on datasets with relatively high dimensions. To overcome these defects, we first define a new method for computing local density and introduce the strategy of potential entropy into the original algorithm. Based on this idea, we propose a new QCC clustering algorithm (QCC-PE). QCC-PE automatically extracts the optimal value of the parameter k by optimizing the potential entropy of the data field. In this way, the parameter is calculated objectively from the dataset rather than estimated empirically from a large number of experiments. We then apply t-distributed stochastic neighbor embedding (tSNE) to QCC-PE and put forward a tSNE-based method (QCC-PE-tSNE), which preprocesses high-dimensional datasets by dimensionality reduction. We compare the proposed algorithms with QCC, DBSCAN, and DP on synthetic datasets, the Olivetti Face Database, and real-world datasets. Experimental results show that our algorithms are feasible and effective and often outperform the compared methods.
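The abstract does not give the formulas behind the parameter selection, so the following is only a minimal sketch of the general idea of choosing k by optimizing an entropy over local densities. Everything specific here is an assumption rather than the authors' method: the local density (inverse mean distance to the k nearest neighbors), the Shannon-style entropy over normalized densities, the choice to minimize it, and the function names `knn_density`, `potential_entropy`, and `select_k`.

```python
import math


def knn_density(points, k):
    """Assumed local density: inverse of the mean distance to the k nearest neighbors."""
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    densities = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        mean_d = sum(dists[:k]) / k
        # Small constant avoids division by zero for duplicate points.
        densities.append(1.0 / (mean_d + 1e-12))
    return densities


def potential_entropy(values):
    """Shannon entropy of the values after normalizing them to sum to 1."""
    total = sum(values)
    probs = [v / total for v in values]
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_k(points, k_candidates):
    """Return the candidate k whose density distribution has the lowest entropy (assumed criterion)."""
    return min(k_candidates, key=lambda k: potential_entropy(knn_density(points, k)))


if __name__ == "__main__":
    # Two small, well-separated groups of 2-D points (illustrative data only).
    data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
            (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
    print("selected k:", select_k(data, [1, 2, 3]))
```

The entropy is largest when every point has the same density, so a lower value indicates that the chosen k produces a more uneven, and arguably more informative, density landscape. The paper's actual density definition and optimization criterion may differ from this sketch.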
