Journal: Proceedings of the National Academy of Sciences of the United States of America

GERBIL: Genotype resolution and block identification using likelihood.


Abstract

The abundance of genotype data generated by individual and international efforts carries the promise of revolutionizing disease studies and the association of phenotypes with individual polymorphisms. A key challenge is providing an accurate resolution (phasing) of the genotypes into haplotypes. We present here results on a method for genotype phasing in the presence of recombination. Our analysis is based on a stochastic model for recombination-poor regions ("blocks"), in which haplotypes are generated from a small number of core haplotypes, allowing for mutations, rare recombinations, and errors. We formulate genotype resolution and block partitioning as a maximum-likelihood problem and solve it by an expectation-maximization algorithm. The algorithm was implemented in a software package called GERBIL (genotype resolution and block identification using likelihood), which is efficient and simple to use. We tested GERBIL on four large-scale sets of genotypes. It outperformed two state-of-the-art phasing algorithms. The PHASE algorithm was slightly more accurate than GERBIL when allowed to run with default parameters, but required two orders of magnitude more time. When using comparable running times, GERBIL was consistently more accurate. For data sets with hundreds of genotypes, the time required by PHASE becomes prohibitive. We conclude that GERBIL has a clear advantage for studies that include many hundreds of genotypes and, in particular, for large-scale disease studies.
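
To make the idea of phasing by expectation-maximization concrete, here is a minimal Python sketch of the classic haplotype-frequency EM. It is not GERBIL's model: it has no core haplotypes, mutations, recombination, errors, or block partitioning, and it enumerates all compatible haplotype pairs, which is only feasible for short blocks. The genotype encoding (0 or 1 for a homozygous site, 2 for a heterozygous site) and the names `compatible_pairs` and `em_phase` are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product

HET = 2  # assumed encoding: 0 or 1 = homozygous for that allele, 2 = heterozygous


def compatible_pairs(genotype):
    """Return the unordered haplotype pairs that could produce this genotype."""
    het_sites = [i for i, g in enumerate(genotype) if g == HET]
    base = [0 if g == HET else g for g in genotype]
    pairs = set()
    for bits in product((0, 1), repeat=len(het_sites)):
        h1, h2 = list(base), list(base)
        for site, b in zip(het_sites, bits):
            h1[site], h2[site] = b, 1 - b  # complementary alleles at het sites
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def em_phase(genotypes, iterations=50):
    """Estimate haplotype frequencies by EM, then pick the likeliest pair per genotype."""
    candidates = [compatible_pairs(g) for g in genotypes]
    haplotypes = {h for pairs in candidates for pair in pairs for h in pair}
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}  # uniform start

    for _ in range(iterations):
        counts = defaultdict(float)
        for pairs in candidates:
            # E-step: posterior probability of each compatible pair,
            # assuming the two haplotypes are drawn independently.
            weights = [freq[a] * freq[b] for a, b in pairs]
            total = sum(weights)
            for (a, b), w in zip(pairs, weights):
                counts[a] += w / total
                counts[b] += w / total
        # M-step: new frequencies are expected counts over 2N chromosomes.
        n_chromosomes = 2 * len(genotypes)
        freq = {h: counts[h] / n_chromosomes for h in haplotypes}

    phased = [max(pairs, key=lambda p: freq[p[0]] * freq[p[1]]) for pairs in candidates]
    return phased, freq


if __name__ == "__main__":
    # Toy data: the double heterozygote is ambiguous on its own, but the
    # homozygous individuals make haplotypes 00 and 11 the likelier resolution.
    genotypes = [(0, 0), (0, 0), (1, 1), (1, 1), (2, 2)]
    phased, freq = em_phase(genotypes)
    print(phased[4])
```

In this toy run the unambiguous genotypes pull the frequency estimates toward haplotypes `(0, 0)` and `(1, 1)`, so the ambiguous individual is resolved into that pair rather than `(0, 1)` and `(1, 0)`.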