
Massively parallel unsupervised single-particle cryo-EM data clustering via statistical manifold learning



Abstract

Structural heterogeneity in single-particle cryo-electron microscopy (cryo-EM) data is a major challenge for high-resolution structure determination. Unsupervised classification can serve as the first step in assessing structural heterogeneity. However, traditional algorithms for unsupervised classification, such as K-means clustering and maximum likelihood optimization, tend to misclassify images as the signal-to-noise ratio (SNR) of the image data decreases, while also demanding greater computational resources. Overcoming these limitations requires further development of clustering algorithms for high-performance cryo-EM data processing. Here we introduce an unsupervised single-particle clustering algorithm derived from a statistical manifold learning framework called generative topographic mapping (GTM). We show that, in the absence of input references, unsupervised GTM clustering improves classification accuracy by about 40% for data with lower SNRs. Applications to several experimental datasets suggest that our algorithm can detect subtle structural differences among classes via a hierarchical clustering strategy. After code optimization for a high-performance computing (HPC) environment, our software implementation was able to generate thousands of reference-free class averages within hours in a massively parallel fashion. This significantly improves ab initio 3D reconstruction and assists in the computational purification of homogeneous datasets for high-resolution visualization.
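
To make the GTM idea in the abstract concrete, the following is a minimal sketch of the textbook generative topographic mapping fitted by expectation-maximization. It is not the authors' implementation. It assumes the images have already been flattened into row vectors, and it omits in-plane alignment, hierarchical clustering, and any HPC parallelization. The function name `gtm_fit` and all parameters (`grid`, `n_rbf`, `n_iter`, `reg`) are hypothetical names chosen for illustration, and the starting value of the noise precision is a simplification.

```python
import numpy as np

def gtm_fit(T, grid=5, n_rbf=3, n_iter=30, reg=1e-3):
    """Fit a small GTM with a 2D latent grid to data T of shape (N, D).

    Returns the responsibilities R of shape (K, N), where K = grid**2
    latent nodes play the role of classes. Illustrative sketch only.
    """
    N, D = T.shape
    # Latent nodes (K x 2) and RBF centres on a regular grid in [-1, 1]^2.
    g = np.linspace(-1.0, 1.0, grid)
    X = np.array([(a, b) for a in g for b in g])
    c = np.linspace(-1.0, 1.0, n_rbf)
    C = np.array([(a, b) for a in c for b in c])
    sigma = 2.0 / (n_rbf - 1)  # RBF width = spacing between centres
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
    # Basis matrix: Gaussian RBFs plus a constant bias column.
    Phi = np.hstack([np.exp(-d2 / (2.0 * sigma**2)), np.ones((len(X), 1))])
    M = Phi.shape[1]

    # PCA initialisation: map the latent grid onto the top-2 principal plane.
    mu = T.mean(axis=0)
    _, S, Vt = np.linalg.svd(T - mu, full_matrices=False)
    Y0 = mu + X @ (Vt[:2] * (S[:2, None] / np.sqrt(N)))
    W = np.linalg.lstsq(Phi, Y0, rcond=None)[0]
    beta = 1.0 / T.var(axis=0).mean()  # assumed simple starting precision

    for _ in range(n_iter):
        # E-step: posterior probability of each latent node for each image.
        Y = Phi @ W
        dist = ((Y[:, None, :] - T[None, :, :]) ** 2).sum(-1)  # (K, N)
        logR = -0.5 * beta * dist
        logR -= logR.max(axis=0, keepdims=True)  # numerical stability
        R = np.exp(logR)
        R /= R.sum(axis=0, keepdims=True)
        # M-step: regularised weighted least squares for W, then update beta.
        G = np.diag(R.sum(axis=1))
        A = Phi.T @ G @ Phi + (reg / beta) * np.eye(M)
        W = np.linalg.solve(A, Phi.T @ R @ T)
        Y = Phi @ W
        dist = ((Y[:, None, :] - T[None, :, :]) ** 2).sum(-1)
        beta = N * D / (R * dist).sum()
    return R

# Usage on synthetic data: hard class labels are the most responsible nodes.
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0.0, 0.3, (50, 8)), rng.normal(3.0, 0.3, (50, 8))])
labels = gtm_fit(data).argmax(axis=0)
```

In this sketch each latent node acts as a candidate class, and a class average would be the responsibility-weighted mean of the images assigned to that node. Because the nodes are tied together through the smooth mapping `Phi @ W`, neighbouring classes stay similar to each other, which is the main way GTM differs from plain K-means.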
