Conference: International Joint Conference on Computational Intelligence

Efficient Approaches for Solving the Large-Scale k-Medoids Problem: Towards Structured Data

Abstract

Clustering objects represented by structured data, whose geometry may be non-trivial, is an interesting task in pattern recognition. In the Big Data era, clustering huge amounts of (structured) data challenges computer science and pattern recognition researchers alike. This paper aims to bridge the gap in large-scale structured data clustering. Specifically, building on previous work, it proposes a parallel and distributed k-medoids clustering implementation and tests it on real-world biological structured data, namely pathway maps (graphs) and primary structures of proteins (sequences). Furthermore, two methods for evaluating medoids, based on exact and approximate procedures respectively, are proposed and compared in terms of scalability. Computational results show that the proposed implementation is flexible with respect to the dissimilarity measure and the input space adopted, with satisfactory results in terms of scalability.
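
To make the distinction between exact and approximate medoid evaluation concrete, here is a minimal single-machine sketch in Python. It is not the paper's parallel and distributed implementation, and the abstract does not say how the approximate procedure works: the random-sampling strategy, the function names (`exact_medoid`, `approximate_medoid`, `hamming`), and the toy sequences are all assumptions made for illustration. The only requirement on the data is a pairwise dissimilarity function, which is what lets k-medoids work on graphs or sequences.

```python
import random
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def exact_medoid(cluster: Sequence[T], dissimilarity: Callable[[T, T], float]) -> T:
    """Return the element with the smallest sum of dissimilarities to all
    cluster members. Needs O(n^2) dissimilarity evaluations."""
    return min(cluster, key=lambda c: sum(dissimilarity(c, x) for x in cluster))


def approximate_medoid(
    cluster: Sequence[T],
    dissimilarity: Callable[[T, T], float],
    sample_size: int,
    rng: random.Random,
) -> T:
    """Assumed approximation (not taken from the paper): search for the medoid
    within a random sample only, costing O(sample_size^2) evaluations."""
    if len(cluster) <= sample_size:
        return exact_medoid(cluster, dissimilarity)
    sample = rng.sample(list(cluster), sample_size)
    return exact_medoid(sample, dissimilarity)


def hamming(a: str, b: str) -> int:
    """Toy dissimilarity for equal-length strings (hypothetical stand-in for
    a real sequence or graph dissimilarity)."""
    return sum(ca != cb for ca, cb in zip(a, b))


# Hypothetical toy data, not from the paper's datasets.
sequences = ["ACGT", "ACGA", "ACCA", "TCGA", "ACGG"]
print(exact_medoid(sequences, hamming))  # → ACGA
print(approximate_medoid(sequences, hamming, 3, random.Random(0)))
```

The exact version is quadratic in the cluster size, which is why an approximate alternative becomes attractive on large clusters; the sampled version trades some accuracy for a bounded number of dissimilarity evaluations.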
