DATA CLUSTERING METHOD AND DEVICE BASED ON K NEIGHBORHOOD SIMILARITY AS WELL AS STORAGE MEDIUM
Abstract
Disclosed is a data clustering method based on K neighborhood similarity. The data points to be clustered are ordered by the maximum radius of their K neighborhood, which serves as a measure of density. A first cycle is performed over the data points in ascending order of this radius, and data points that satisfy a statistical similarity criterion are merged into the same cluster. A second cycle is then performed over the data points with lower clustering density according to the required cluster scale: all noise points are identified, and the non-noise points are merged into the closest high-density cluster, thereby completing the clustering. With the method provided in the embodiments of the present invention, the number of clusters need not be preset and the probability distribution of the data need not be known; the parameters are easy to set, and their settings are independent of the density distribution and distance scale of the data; clusters are formed by gradual merging from high density to low density, and a hierarchical relationship between the clusters is produced as the clusters are generated.
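The two cycles described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the statistical similarity test or the noise criterion, so both are assumptions here: a point joins a neighboring cluster when its K-neighborhood radius is at most `tol` times that cluster's mean radius, and a point from an undersized cluster counts as noise when its distance to the nearest large cluster exceeds `noise_factor` times that cluster's mean radius. The function name and the parameters `tol`, `min_size`, and `noise_factor` are hypothetical and do not come from the patent.

```python
import numpy as np


def knn_similarity_clustering(X, k=4, tol=2.0, min_size=3, noise_factor=3.0):
    """Hypothetical two-cycle sketch; returns one label per point, -1 marks noise."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Pairwise Euclidean distances (O(n^2) memory, acceptable for a sketch).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    neighbors = np.argsort(dist, axis=1)[:, :k]
    # Maximum radius of the K neighborhood: distance to the k-th neighbor.
    radius = dist[np.arange(n), neighbors[:, -1]]

    labels = np.full(n, -1)
    next_id = 0
    # First cycle: visit points in ascending radius (densest first).
    for i in np.argsort(radius, kind="stable"):
        # ASSUMED similarity test: compare the point's radius with the mean
        # radius of each cluster already present among its k neighbors.
        similar = [
            c for c in dict.fromkeys(labels[neighbors[i]].tolist())
            if c >= 0 and radius[i] <= tol * radius[labels == c].mean()
        ]
        if not similar:
            labels[i] = next_id  # no similar cluster nearby: start a new one
            next_id += 1
            continue
        target = similar[0]
        labels[i] = target
        for c in similar[1:]:  # the point bridges several similar clusters
            labels[labels == c] = target

    # Second cycle: revisit points in clusters below the required scale.
    ids, sizes = np.unique(labels, return_counts=True)
    large_ids = ids[sizes >= min_size]
    large = np.isin(labels, large_ids)
    if not large.any():
        return labels
    large_idx = np.flatnonzero(large)
    mean_r = {c: radius[labels == c].mean() for c in large_ids.tolist()}
    out = labels.copy()
    for i in np.flatnonzero(~large):
        j = large_idx[np.argmin(dist[i, large_idx])]
        c = int(labels[j])
        # ASSUMED noise criterion: too far from the closest large cluster.
        out[i] = c if dist[i, j] <= noise_factor * mean_r[c] else -1
    return out
```

Because every threshold is expressed relative to a cluster's own mean K-neighborhood radius, the sketch needs neither a preset number of clusters nor an absolute distance scale, which mirrors the properties the abstract claims. It computes all pairwise distances, so it is only suitable for small inputs.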