Conference: IEEE International Conference on Fuzzy Systems

d-FuzzStream: A Dispersion-Based Fuzzy Data Stream Clustering


Abstract

Fuzzy clustering algorithms have recently been investigated as techniques for extracting knowledge from Data Streams, because they are unsupervised and can adapt to changes in the data distribution. While most fuzzy clustering algorithms for Data Streams are chunk-based, the FuzzStream algorithm, previously proposed by the authors of this paper, pioneered a fuzzy extension of a different approach known as the Online-Offline Framework (OOF). The extended framework, named Fuzzy Online-Offline Framework (FOOF), has two steps: fuzzy abstraction and fuzzy clustering. The fuzzy abstraction step continuously summarizes the data into a set of cluster features called Fuzzy Micro Clusters (FMiCs). The fuzzy clustering step then clusters these FMiCs to generate the data partition. Although FuzzStream has been shown to be more robust than other OOF-based algorithms, its fuzzy abstraction process summarizes the data poorly: it produces almost one FMiC per example, and the resulting FMiCs overlap heavily. Furthermore, the algorithm has a long processing time because it must calculate membership matrices for every example. In this paper we propose the d-FuzzStream algorithm, an adaptation of FuzzStream that uses the concepts of fuzzy dispersion and fuzzy similarity to improve the data summarization while minimizing the complexity of the algorithm. Experiments showed that the proposed algorithm generates FMiCs with higher representativeness and has a lower execution time than the original version, while still producing similar clustering results.
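
The abstract does not give the update rules, so the following is only a rough sketch of what a fuzzy abstraction step might look like, not the d-FuzzStream algorithm itself. It assumes that an FMiC stores a membership-weighted linear sum, a membership total, and a membership-weighted sum of squared distances; that "fuzzy dispersion" can be approximated by the root of the weighted mean squared distance; and that memberships follow the standard fuzzy c-means formula. The names `FuzzyMicroCluster`, `fuzzy_abstraction`, `min_radius`, and `m` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


@dataclass
class FuzzyMicroCluster:
    """Hypothetical FMiC summary; the paper's actual fields may differ."""
    linear_sum: List[float]   # membership-weighted sum of absorbed examples
    membership_sum: float     # total membership absorbed so far
    ssd: float = 0.0          # membership-weighted sum of squared distances

    @property
    def center(self) -> List[float]:
        return [s / self.membership_sum for s in self.linear_sum]

    @property
    def dispersion(self) -> float:
        # Assumed definition: root of the weighted mean squared distance.
        return math.sqrt(self.ssd / self.membership_sum)

    def absorb(self, x: Sequence[float], membership: float) -> None:
        # The distance is measured to the center before the update.
        d2 = euclidean(x, self.center) ** 2
        self.linear_sum = [s + membership * xi
                           for s, xi in zip(self.linear_sum, x)]
        self.membership_sum += membership
        self.ssd += membership * d2


def memberships(x: Sequence[float],
                clusters: List[FuzzyMicroCluster],
                m: float = 2.0) -> List[float]:
    """Standard fuzzy c-means memberships of x with respect to the clusters."""
    dists = [euclidean(x, c.center) for c in clusters]
    if any(d == 0.0 for d in dists):
        # x coincides with one or more centers: split membership among them.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dj) ** exponent for dj in dists) for di in dists]


def fuzzy_abstraction(stream, min_radius: float = 1.0,
                      m: float = 2.0) -> List[FuzzyMicroCluster]:
    """Summarize a stream into FMiCs (assumed rule, not the paper's)."""
    fmics: List[FuzzyMicroCluster] = []
    for x in stream:
        # Only FMiCs whose dispersion radius covers x may absorb it.
        covering = [c for c in fmics
                    if euclidean(x, c.center) <= max(c.dispersion, min_radius)]
        if not covering:
            # No FMiC is close enough: start a new one at x.
            fmics.append(FuzzyMicroCluster(list(x), 1.0))
            continue
        for c, u in zip(covering, memberships(x, covering, m)):
            c.absorb(x, u)
    return fmics
```

Under these assumptions, `fuzzy_abstraction([(0.0, 0.0), (0.1, 0.0), (10.0, 10.0), (10.1, 10.0)])` summarizes the four points into two FMiCs rather than four. Restricting the membership computation to the covering FMiCs is one way to avoid a full membership matrix per example; whether d-FuzzStream does this is not stated in the abstract.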
