Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics

Unsupervised Structure Detection in Biomedical Data


Abstract

A major challenge in computational biology is to find simple representations of high-dimensional data that best reveal the underlying structure. In this work, we present an intuitive and easy-to-implement method, based on ranked neighborhood comparisons, that detects structure in unlabeled data. The method orders objects by similarity and measures the mutual overlap of their nearest neighbors. This basic framework was originally introduced in social network analysis to detect actor communities. We demonstrate that the same ideas can be applied successfully to biomedical data sets to reveal complex underlying structure. The algorithm is efficient and works directly on distance data, without requiring a vectorial embedding. Comprehensive experiments demonstrate the validity of the approach. Comparisons with state-of-the-art clustering methods show that the presented method outperforms hierarchical, density-based, and model-based clustering. A further advantage is that the method simultaneously provides a visualization of the data. Especially in biomedical applications, this visualization can serve as a first pre-processing step when analyzing real-world data sets, giving an intuition of the underlying data structure. We apply the method to synthetic data as well as to various biomedical data sets; the results demonstrate the high quality and usefulness of the inferred structure.
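The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea it names: rank each object's neighbors from a distance matrix, then score pairs of objects by how much their nearest-neighbor sets overlap. The function names (`ranked_neighbors`, `neighborhood_overlap`), the neighborhood size `k`, and the choice to normalize the overlap by `k` are all assumptions made for illustration; they are not taken from the paper and may differ from the authors' method.

```python
import numpy as np


def ranked_neighbors(dist, k):
    """Return, for each object, the indices of its k nearest other objects.

    `dist` is a square pairwise distance matrix; no vector embedding is needed.
    (Hypothetical helper, not the paper's API.)
    """
    d = np.array(dist, dtype=float)  # copy so the caller's matrix is untouched
    n = d.shape[0]
    if d.ndim != 2 or d.shape[1] != n:
        raise ValueError("dist must be a square matrix")
    if not 1 <= k <= n - 1:
        raise ValueError("k must satisfy 1 <= k <= n - 1")
    np.fill_diagonal(d, np.inf)  # an object is not its own neighbor
    order = np.argsort(d, axis=1, kind="stable")  # ranks neighbors by distance
    return order[:, :k]


def neighborhood_overlap(dist, k):
    """Fraction of shared k-nearest neighbors for every pair of objects.

    Assumed scoring rule: |N_k(i) & N_k(j)| / k, which lies in [0, 1].
    """
    nbrs = ranked_neighbors(dist, k)
    n = nbrs.shape[0]
    member = np.zeros((n, n), dtype=int)
    member[np.arange(n)[:, None], nbrs] = 1  # member[i, j] = 1 if j is in N_k(i)
    shared = member @ member.T  # shared[i, j] = number of common neighbors
    return shared / k


if __name__ == "__main__":
    # Toy data: two well-separated groups on a line (illustrative only).
    x = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
    dist = np.abs(x[:, None] - x[None, :])
    print(neighborhood_overlap(dist, k=2))
```

In this toy setup, objects in the same group share neighbors while objects in different groups share none, so the overlap matrix shows a block pattern. Such a matrix could then be thresholded or reordered for display, but how the paper turns overlaps into clusters and a visualization is not stated in the abstract.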
