AA-DBSCAN: an approximate adaptive DBSCAN for finding clusters with varying densities

Kim Jeong-Hun; Choi Jong-Hyeok; Yoo Kwan-Hee; Nasridinov Aziz

Abstract

Clustering is a typical data mining technique that partitions a dataset into multiple subsets of similar objects according to similarity metrics. In particular, density-based algorithms can find clusters of different shapes and sizes while remaining robust to noise objects. DBSCAN, a representative density-based algorithm, finds clusters by defining the density criterion with two global parameters, the epsilon-distance and MinPts. However, most density-based algorithms, including DBSCAN, find clusters incorrectly because the density criterion is fixed by the global parameters and is misapplied to clusters of varying densities. Although studies have been conducted to determine optimal parameters or to improve clustering performance using additional parameters and computations, they significantly increase the running time for clustering, particularly when the dataset is large. In this study, we focus on minimizing the additional computation required to determine the parameters by using an approximate adaptive epsilon-distance for each density, while still finding the clusters with varying densities that DBSCAN cannot find. Specifically, we propose a new tree structure based on a quadtree to define a dataset density layer. In addition, we propose approximate adaptive DBSCAN (AA-DBSCAN) and kAA-DBSCAN, which have clustering performance similar to that of existing algorithms for finding clusters with varying densities while significantly reducing the running time required to perform clustering. We evaluate the proposed algorithms, AA-DBSCAN and kAA-DBSCAN, in extensive experiments against state-of-the-art algorithms. Experimental results demonstrate an improvement in clustering performance and a reduction in running time for the proposed algorithms.
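
To make the limitation described in the abstract concrete, the following is a minimal sketch of textbook DBSCAN with a single global epsilon-distance and MinPts. It is not the AA-DBSCAN or kAA-DBSCAN algorithm from the paper. The names `dbscan` and `grid`, the toy dataset, and the convention that MinPts counts the point itself are all illustrative assumptions.

```python
from math import dist

NOISE = -1

def dbscan(points, eps, min_pts):
    """Textbook DBSCAN with one global (eps, min_pts) pair; O(n^2) neighbor search."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # Assumption: a point counts as its own neighbor.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is a core point, keep expanding
                queue.extend(reachable)
        cluster += 1
    return labels

def grid(x0, y0, step, n=3):
    """Hypothetical helper: an n-by-n grid of points with the given spacing."""
    return [(x0 + i * step, y0 + j * step) for i in range(n) for j in range(n)]

# Two dense clusters 0.5 apart (spacing 0.1) and one sparse cluster (spacing 1.0).
points = grid(0, 0, 0.1) + grid(0.7, 0, 0.1) + grid(10, 10, 1.0)

tight = dbscan(points, eps=0.15, min_pts=4)
loose = dbscan(points, eps=1.5, min_pts=4)
print(sorted(set(tight)))  # [-1, 0, 1]
print(sorted(set(loose)))  # [0, 1]
```

On this toy data, the small epsilon separates the two dense clusters but labels the whole sparse cluster as noise, while the large epsilon recovers the sparse cluster but merges the two dense ones. Neither setting yields the three clusters, which is the kind of failure a per-density epsilon-distance is meant to address.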


Bibliographic details

Source: Journal of supercomputing, 2019, Issue 1, pp. 142-169 (28 pages)
Authors: Kim Jeong-Hun; Choi Jong-Hyeok; Yoo Kwan-Hee; Nasridinov Aziz
Affiliation: Chungbuk Natl Univ, Dept Comp Sci, Cheongju 28644, South Korea
Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords: Density-based clustering; DBSCAN; Approximation; Adaptation; Partitioning

Similar literature

1. AMF-IDBSCAN: Incremental Density Based Clustering Algorithm Using Adaptive Median Filtering Technique [J]. Aida Chefrour, Labiba Souici-Meslati. Informatica: An International Journal of Computing and Informatics, 2019, Issue 4.
2. A Domain Adaptive Density Clustering Algorithm for Data With Varying Density Distribution [J]. Chen Jianguo, Yu Philip S. IEEE Transactions on Knowledge and Data Engineering, 2021, Issue 6.
3. Improved density peak clustering-based adaptive Gaussian mixture model for damage monitoring in aircraft structures under time-varying conditions [J]. Qiu Lei, Fang Fang, Yuan Shenfang. Mechanical systems and signal processing, 2019, Issue JULa1.
4. ADBSCAN: Adaptive Density-Based Spatial Clustering of Applications with Noise for Identifying Clusters with Varying Densities [C]. Mohammad Mahmudur Rahman Khan, Md. Abu Bakr Siddique, Rezoana Bente Arif. International Conference on Electrical Engineering and Information Communication Technology, 2018.
5. Towards finding the complete modulome: Density constrained biclustering [D]. Colak, Recep. 2008.
6. Finding approximate gene clusters with Gecko 3 [O]. Sascha Winter, Katharina Jahn, Stefanie Wehner. 2016.
7. ADBSCAN: Adaptive Density-Based Spatial Clustering of Applications with Noise for Identifying Clusters with Varying Densities [O]. Mohammad Mahmudur Rahman Khan, Md. Abu Bakr Siddique, Rezoana Bente Arif. 2018.
8. Adaptive Estimation and Approximation of Continuously Varying Spectral Density Functions to Airborne Radar [R]. Emre, E. 1993.
