Scalable clustering with adaptive instance sampling

Abstract

The computation time of most clustering algorithms grows with the number of attributes and instances. The data mining community has therefore worked to make clustering more efficient, and scalability is naturally a critical issue for it. One way to handle this issue is to use a subset of all instances. This paper proposes an algorithm that performs clustering efficiently. It uses the nested partitions method to address the noisy performance problem that arises when a subset of instances is used, and it adjusts the sample rate at each iteration. Compared with NPCLUSTER, the resulting Adaptive NPCLUSTER algorithm achieved better similarity on small datasets and worse similarity on large datasets, but it required less computation time.
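The general idea of clustering on a subset of instances while adjusting the sample rate per iteration can be illustrated with a minimal sketch. This is not the NPCLUSTER or Adaptive NPCLUSTER algorithm, which the abstract does not specify; it substitutes a plain k-means style update for the nested partitions method. The function names, the `growth` factor, and the rule "enlarge the sample when the sampled cost fails to improve" are all assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Return, for every point, the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]


def update(points, labels, centers):
    """Move each center to the mean of the sampled points assigned to it."""
    new_centers = []
    for j, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            new_centers.append(old)  # an empty cluster keeps its old center
    return new_centers


def mean_cost(points, centers):
    """Average squared distance from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points) / len(points)


def adaptive_sample_kmeans(data, k, rate=0.1, growth=1.5, iters=20, seed=0):
    """k-means style updates on a fresh random subset at every iteration.

    Hypothetical adaptation rule (an assumption, not taken from the paper):
    when the cost measured on the current sample does not improve on the
    previous one, treat that as sampling noise and raise the sample rate.
    """
    rng = random.Random(seed)
    centers = rng.sample(data, k)
    prev_cost = float("inf")
    rates = []  # sample rate actually used in each iteration
    for _ in range(iters):
        n = max(k, int(rate * len(data)))
        sample = rng.sample(data, n)
        labels = assign(sample, centers)
        centers = update(sample, labels, centers)
        cost = mean_cost(sample, centers)
        rates.append(rate)
        if cost >= prev_cost:
            rate = min(1.0, rate * growth)
        prev_cost = cost
    return centers, rates
```

Calling `adaptive_sample_kmeans(data, k=2)` on a list of coordinate tuples returns the final centers together with the sample rate used at each iteration, which makes it easy to see how quickly the sample grows. A smaller sample keeps each iteration cheap, at the price of a noisier cost estimate, which is the trade-off the abstract describes.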

Bibliographic record

Source
IEEE International Conference on Industrial Engineering and Engineering Management, 2013, pp. 1309-1313 (5 pages)
Authors
JaeKyung Yang; ByoungJin Yu; MyoungJin Choi
Original format: PDF
Keywords
data mining; iterative methods; pattern clustering; sampling methods; adaptive NPCLUSTER algorithm; adaptive instance sampling; data mining community; iteration method; nested partitions method; noisy performance problems; scalable clustering algorithm; Algorithm design and analysis; Clustering algorithms; Data mining; Databases; Noise; Partitioning algorithms; Scalability; Adaptive Sampling; Clustering; Data Mining; Metaheuristics; Nested Partition;

Similar literature

1. A sample-based hierarchical adaptive K-means clustering method for large-scale video retrieval [J]. Kaiyang Liao, Guizhong Liu, Li Xiao. Knowledge-Based Systems, 2013, issue sepa.
2. Adaptive rectangular sampling: An easy, incomplete, neighbourhood-free adaptive cluster sampling design [J]. Panahbehagh Bardia. Survey Methodology, 2016, issue 2.
3. Optimization-based feature selection with adaptive instance sampling [J]. Jaekyung Yang, Sigurdur Olafsson. Computers & Operations Research, 2006, issue 11.
4. Scalable Mixed-Paradigm Trace Clustering using Super-Instances [C]. Pieter De Koninck, Jochen De Weerdt. 2019 International Conference on Process Mining, 2019.
5. The modifiable areal unit problem (MAUP) via cluster analysis methodologies: A look at scale, zoning, and instances of foreclosure in Los Angeles County [D]. Davis, Matthew W. 2012.
6. Multiple Objects Fusion Tracker Using a Matching Network for Adaptively Represented Instance Pairs [O]. Sang-Il Oh, Hang-Bong Kang. 2017.
7. Clustering on very small scales from a large sample of confirmed quasar pairs: does quasar clustering track from Mpc to kpc scales? [O]. Eftekharzadeh, S., Myers, A. D., Hennawi, J. F. 2017.
