Journal: Bioinformatics

A spectral clustering-based method for identifying clones from high-throughput B cell repertoire sequencing data


Abstract

Motivation: B cells derive their antigen specificity through the expression of immunoglobulin (Ig) receptors on their surface. These receptors are initially generated stochastically by somatic rearrangement of the DNA and, following antigen activation, are further diversified by somatic hypermutation, a process that introduces mainly point substitutions into the receptor DNA at a high rate. Recent advances in next-generation sequencing have enabled large-scale profiling of the B cell Ig repertoire from blood and tissue samples. A key computational challenge in analyzing these data is partitioning the sequences to identify the descendants of a common B cell (i.e. a clone). Current methods group sequences using either a fixed distance threshold or a likelihood calculation that is computationally intensive. Here, we propose a new method based on spectral clustering with an adaptive threshold to determine the local sequence neighborhood. Validation using simulated and experimental datasets demonstrates that this method has high sensitivity and specificity compared to a fixed threshold that is optimized for these measures. In addition, this method works on datasets where choosing an optimal fixed threshold is difficult, and it is more computationally efficient in all cases. The ability to quickly and accurately identify members of a clone from repertoire sequencing data will greatly improve downstream analyses: clonally related sequences cannot be treated as independent in statistical models, and clonal partitions are the basis for calculating diversity metrics, reconstructing lineages and analyzing selection. Thus, the spectral clustering-based method presented here represents an important contribution to repertoire analysis.
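To make the idea of spectral clustering with a locally adaptive neighborhood concrete, here is a minimal sketch in Python. It is not the authors' implementation, and the abstract does not give these details. The sketch assumes equal-length sequences, a normalized Hamming distance, a local-scaling kernel in which each sequence's bandwidth is the distance to its `k_neighbor`-th nearest neighbor, and an eigengap heuristic to choose the number of clones. The function names, the `k_neighbor` parameter and the toy sequences are all hypothetical.

```python
import numpy as np


def hamming_matrix(seqs):
    """Pairwise normalized Hamming distances between equal-length sequences."""
    arr = np.array([list(s) for s in seqs])
    return (arr[:, None, :] != arr[None, :, :]).mean(axis=2)


def _kmeans(x, k, n_iter=50):
    """Small deterministic k-means with farthest-first initialization."""
    centers = [x[0]]
    for _ in range(1, k):
        dist = np.min([((x - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(x[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(sq, axis=1)
        new = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new, centers):
            break
        centers = new
    return labels


def spectral_clones(seqs, k_neighbor=2):
    """Assign a clone label to each sequence (illustrative assumptions only)."""
    d = hamming_matrix(seqs)
    n = len(seqs)
    # Adaptive local scale: distance to the k-th nearest neighbor
    # (column 0 of each sorted row is the zero self-distance).
    sigma = np.maximum(np.sort(d, axis=1)[:, k_neighbor], 1e-3)
    w = np.exp(-d ** 2 / (sigma[:, None] * sigma[None, :]))
    np.fill_diagonal(w, 0.0)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    deg = np.maximum(w.sum(axis=1), 1e-12)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    lap = np.eye(n) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    evals, evecs = np.linalg.eigh(lap)  # eigenvalues in ascending order
    # Eigengap heuristic: the largest jump between consecutive eigenvalues
    # gives the number of clusters.
    k = int(np.argmax(np.diff(evals))) + 1
    emb = evecs[:, :k]
    emb = emb / np.maximum(np.linalg.norm(emb, axis=1, keepdims=True), 1e-12)
    return _kmeans(emb, k)


# Toy input: two made-up groups of four sequences with a few point substitutions.
seqs = [
    "AAAAAAAAAAAA", "AAAAAAAAAAAC", "AAAAAAAAACAA", "AAAACAAAAAAA",
    "GGGGTTTTGGGG", "GGGGTTTTGGGC", "GGGGTTTAGGGG", "GTGGTTTTGGGG",
]
labels = spectral_clones(seqs)
print(labels)  # the first four and the last four sequences get different labels
```

Because each sequence's bandwidth is taken from its own neighborhood, no single global distance cutoff has to be chosen in advance. This is the contrast with fixed-threshold grouping that the abstract draws, although the method in the paper may define the adaptive threshold differently.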
