COBRAS: Interactive Clustering with Pairwise Queries

Abstract

Constraint-based clustering algorithms exploit background knowledge to construct clusterings that are aligned with the interests of a particular user. This background knowledge is often obtained by allowing the clustering system to pose pairwise queries to the user: should these two elements be in the same cluster or not? Answering yes results in a must-link constraint, answering no in a cannot-link constraint. Ideally, the user should be able to answer a few of these queries, inspect the resulting clustering, and repeat these two steps until a satisfactory result is obtained. Such an interactive clustering process requires the clustering system to satisfy three requirements: (1) it should be able to present a reasonable (intermediate) clustering to the user at any time, (2) it should produce good clusterings given few queries, i.e., it should be query-efficient, and (3) it should be time-efficient. We present COBRAS, an approach to clustering with pairwise constraints that satisfies these requirements. COBRAS constructs clusterings of super-instances, which are local regions in the data in which all instances are assumed to belong to the same cluster. By dynamically refining these super-instances during clustering, COBRAS is able to produce clusterings at increasingly fine-grained levels of granularity. It quickly produces good high-level clusterings, and is able to refine them to find more detailed structure as more queries are answered. In our experiments we demonstrate that COBRAS is the only method able to produce good solutions at all stages of the clustering process with fast runtimes, and hence the most suitable method for interactive clustering.
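
The query mechanism described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the super-instances are already given and never refines them, and the names `merge_super_instances` and `same_cluster` are hypothetical, with `same_cluster` standing in for the user. A yes answer merges two clusters of super-instances (must-link); a no answer is stored as a cannot-link so the same pair is not asked again.

```python
from itertools import combinations


def merge_super_instances(super_instances, same_cluster):
    """Illustrative only: group fixed super-instances via pairwise queries.

    super_instances: list of non-empty lists of instance ids (assumed given;
        the dynamic refinement step of COBRAS is deliberately left out).
    same_cluster: hypothetical callback (a, b) -> bool standing in for the user.
    Returns (clusters of instance ids, number of queries asked).
    """
    # Start with one cluster per super-instance (stored as super-instance indices).
    clusters = [[i] for i in range(len(super_instances))]
    cannot_link = set()  # pairs of cluster representatives answered "no"
    queries = 0
    merged = True
    while merged:
        merged = False
        for c1, c2 in combinations(clusters, 2):
            pair = frozenset((c1[0], c2[0]))
            if pair in cannot_link:
                continue  # already answered "no"; do not ask again
            # Ask about one representative instance from each cluster.
            a = super_instances[c1[0]][0]
            b = super_instances[c2[0]][0]
            queries += 1
            if same_cluster(a, b):
                c1.extend(c2)       # must-link: merge the two clusters
                clusters.remove(c2)
                merged = True
                break               # cluster list changed; restart the scan
            cannot_link.add(pair)   # cannot-link: remember the answer
    result = [sorted(x for s in c for x in super_instances[s]) for c in clusters]
    return result, queries


# Toy run: a simulated user who answers from hidden ground-truth labels.
labels = [0, 0, 0, 1, 1, 1]
supers = [[0, 1], [2], [3, 4], [5]]
clusters, n_queries = merge_super_instances(
    supers, lambda a, b: labels[a] == labels[b]
)
print(clusters, n_queries)  # [[0, 1, 2], [3, 4, 5]] 4
```

Because the super-instances stay fixed in this sketch, the result can only be as good as the initial grouping; the abstract's point is that COBRAS refines super-instances as more queries are answered.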

Bibliographic record

Source
International Symposium on Intelligent Data Analysis | 2017 | pp. 353-366 | 14 pages
Authors
Toon Van Craenendonck; Sebastijan Dumancic; Elia Van Wolputte; Hendrik Blockeel
Original format: PDF
Keywords
Semi-supervised clustering; Pairwise constraints; Active clustering;
