Journal: Pattern Analysis and Applications

RHC: a non-parametric cluster-based data reduction for efficient k-NN classification



Abstract

Although the k-NN classifier is a popular classification method, it suffers from high computational cost and storage requirements. This paper proposes two effective cluster-based data reduction algorithms for efficient k-NN classification. Both have low preprocessing cost and achieve high data reduction rates while keeping k-NN classification accuracy high. The first algorithm, reduction through homogeneous clusters (RHC), is based on a fast preprocessing clustering procedure that creates homogeneous clusters; the centroids of these clusters constitute the reduced training set. The second algorithm is a dynamic version of RHC that retains all its properties and, in addition, can manage datasets that do not fit in main memory and suits dynamic environments where new training data become available gradually. Experimental results on fourteen datasets show that both algorithms are faster and achieve higher reduction rates than four known methods, while maintaining high classification accuracy.
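
The abstract does not spell out the clustering procedure, so the following is only a minimal sketch of the RHC idea under stated assumptions: "homogeneous" is taken to mean that all items in a cluster share one class label, and a non-homogeneous cluster is split by k-means seeded with the per-class means of its items. The names `kmeans` and `rhc_reduce`, and the fallback that splits by label when k-means cannot separate the items, are illustrative choices and not taken from the paper.

```python
import numpy as np
from collections import deque


def kmeans(X, seeds, max_iter=100):
    """Plain Lloyd's k-means from the given seeds; returns a cluster index per row."""
    centers = seeds.astype(float).copy()
    assign = None
    for _ in range(max_iter):
        # Squared Euclidean distance from every point to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_assign = d.argmin(axis=1)
        if assign is not None and np.array_equal(new_assign, assign):
            break  # assignments stopped changing
        assign = new_assign
        for j in range(len(centers)):
            members = X[assign == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return assign


def rhc_reduce(X, y):
    """Return (prototypes, labels): one centroid per homogeneous cluster.

    Assumption: clusters are split recursively with k-means seeded by class means.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    queue = deque([np.arange(len(X))])  # start with the whole training set
    protos, labels = [], []
    while queue:
        idx = queue.popleft()
        classes = np.unique(y[idx])
        if len(classes) == 1:
            # Homogeneous cluster: its centroid becomes a prototype.
            protos.append(X[idx].mean(axis=0))
            labels.append(classes[0])
            continue
        # One seed per class present in the cluster.
        seeds = np.stack([X[idx][y[idx] == c].mean(axis=0) for c in classes])
        assign = kmeans(X[idx], seeds)
        parts = [idx[assign == j] for j in range(len(classes))]
        parts = [p for p in parts if len(p)]
        if len(parts) < 2:
            # Illustrative guard: identical points with different labels
            # cannot be separated by distance, so split by label instead.
            parts = [idx[y[idx] == c] for c in classes]
        queue.extend(parts)
    return np.array(protos), np.array(labels)
```

The returned prototypes and labels would then stand in for the original training set when running k-NN, for example by classifying a query with the label of its nearest prototype. Every split yields at least two strictly smaller clusters, so the loop terminates on any non-empty input.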
