Journal: International Journal of Intelligent Systems Technologies and Applications

SED-Stream: discriminative dimension selection for evolution-based clustering of high dimensional data streams

Abstract

Clustering high-dimensional data streams has become one of the most challenging data mining tasks. Our previous work, SE-Stream, is a standard-deviation-based projected clustering method for high-dimensional data streams. Besides finding clusters within subsets of dimensions, SE-Stream can monitor and detect changes in the clustering structure as the data stream progresses. Extending SE-Stream, SED-Stream uses a selected subset of dimensions to represent each cluster. Our idea is to select a better set of dimensions to increase the quality of the output clustering. SED-Stream projects each cluster onto its discriminative dimensions, which are highly relevant to the cluster itself but distinguish it from the other clusters. Experimental results on both real-world and synthetic stream datasets show that SED-Stream outperforms its predecessor, SE-Stream, in both purity and F-measure. Compared with HPStream, a state-of-the-art algorithm for projected clustering of high-dimensional data streams, SED-Stream achieves a higher F-measure and comparable purity.
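
The abstract reports results in terms of purity without defining it. As a rough illustration only, the sketch below computes the common pooled form of purity: the fraction of points whose class label matches the majority label of their assigned cluster. The function name `purity` and its arguments are hypothetical, and this is an assumption about the metric rather than the paper's evaluation code; some stream-clustering evaluations instead average the per-cluster majority fraction, and the abstract does not say which variant is used.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, class_labels):
    """Pooled purity: share of points carrying their cluster's majority label.

    Assumption: this is the textbook pooled definition, not necessarily
    the exact variant used in the SED-Stream experiments.
    """
    if len(cluster_ids) != len(class_labels):
        raise ValueError("cluster_ids and class_labels must have the same length")
    if len(cluster_ids) == 0:
        raise ValueError("purity is undefined for an empty clustering")

    # Count how often each class label occurs inside each cluster.
    members = defaultdict(Counter)
    for cid, label in zip(cluster_ids, class_labels):
        members[cid][label] += 1

    # Sum the size of the majority class of every cluster.
    majority_total = sum(max(counts.values()) for counts in members.values())
    return majority_total / len(cluster_ids)
```

A clustering in which every cluster contains a single class scores 1.0; mixing classes within a cluster lowers the score.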