DATA CLUSTERING METHOD AND DEVICE BASED ON K NEIGHBORHOOD SIMILARITY AS WELL AS STORAGE MEDIUM

Abstract

Disclosed is a data clustering method based on K-neighborhood similarity. The data points to be clustered are ordered by the maximum radius of their K neighborhood, which serves as a measure of density. A first pass traverses the points in ascending order of that radius and merges points that satisfy a statistical similarity criterion into the same cluster. A second pass, carried out according to the required cluster scale, examines the data points with lower clustering density, identifies all noise points, and merges the non-noise points into the closest high-density cluster, which completes the clustering. With the data clustering method provided in embodiments of the present invention, the number of clusters need not be preset and the probability distribution of the data need not be known; the parameters are easy to set, and their settings are independent of the density distribution and the distance scale of the data; and the clusters are formed by gradual merging from high density to low density, so a hierarchical relationship between the clusters is obtained as they are generated.
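
The two passes described in the abstract can be illustrated with a minimal Python sketch. It is not the patented procedure: the abstract does not state the statistical similarity test or the noise criterion, so the ratio thresholds below are placeholder assumptions, and the names `knn_similarity_clusters`, `ratio`, `min_size`, and `noise_factor` are hypothetical. Only the overall flow (K-neighborhood radius as density, ascending ordering, a merging pass, then a pass over small clusters) follows the abstract.

```python
import math


def knn_similarity_clusters(points, k=3, ratio=2.0, min_size=3, noise_factor=4.0):
    """Two-pass clustering sketch; returns one label per point, -1 marks noise."""
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(points)")
    dist = [[math.dist(p, q) for q in points] for p in points]
    # For each point, the other points sorted by distance from it.
    neighbors = [sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
                 for i in range(n)]
    # K-neighborhood radius: distance to the k-th nearest neighbor (small = dense).
    radius = [dist[i][neighbors[i][k - 1]] for i in range(n)]

    labels = [-1] * n  # -1 means "not yet assigned" during the first pass
    members = {}       # cluster id -> indices of points merged in the first pass

    def mean_radius(cluster):
        idx = members[cluster]
        return sum(radius[m] for m in idx) / len(idx)

    # First pass: visit points in ascending order of radius (densest first).
    for i in sorted(range(n), key=lambda i: radius[i]):
        for j in neighbors[i][:k]:
            # ASSUMED similarity test (not from the source): the gap to an already
            # clustered neighbor is within `ratio` times that cluster's mean radius.
            if labels[j] != -1 and dist[i][j] <= ratio * mean_radius(labels[j]):
                labels[i] = labels[j]
                break
        else:
            labels[i] = len(members)  # no similar neighbor: seed a new cluster
            members[labels[i]] = []
        members[labels[i]].append(i)

    # Second pass: clusters below the required scale are re-examined.
    large = {c for c, idx in members.items() if len(idx) >= min_size}
    anchors = [i for i in range(n) if labels[i] in large]
    final = list(labels)
    for i in range(n):
        if labels[i] in large:
            continue
        final[i] = -1  # noise unless a large cluster is close enough
        if anchors:
            j = min(anchors, key=lambda a: dist[i][a])
            # ASSUMED noise criterion (not from the source): looser ratio threshold.
            if dist[i][j] <= noise_factor * mean_radius(labels[j]):
                final[i] = labels[j]
    return final


# Two tight groups, one straggler near the first group, one distant outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (4, 0), (50, 50)]
labels = knn_similarity_clusters(points)
```

Both thresholds are expressed as ratios of K-neighborhood radii rather than absolute distances, which is one simple way to keep the parameters independent of the distance scale, as the abstract requires. In this toy input the straggler at `(4, 0)` fails the first-pass test but is absorbed into the nearby group in the second pass, while `(50, 50)` is left as noise.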

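Bibliographic data

Publication No.: WO2019136929A1
Publication date: 2019-07-18
Original format: PDF
Applicant/assignee: HUIZHOU UNIVERSITY
Application No.: WO2018CN91697
Inventors: HUANG JINQIU; XU DEMING; WAN CHANGLIN
Filing date: 2018-06-15
Classification: G06K9/62
Country/office: WO
Date indexed: 2022-08-21 11:53:56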
