Journal of Bioinformatics and Computational Biology

A new genotype calling method for Affymetrix SNP arrays


Abstract

Current genotype-calling methods such as the Robust Linear Model with Mahalanobis Distance Classifier (RLMM) and the Corrected Robust Linear Model with Maximum Likelihood Classification (CRLMM) provide accurate calls for Affymetrix single nucleotide polymorphism (SNP) chips. However, these methods are computationally expensive because they rely on preprocessing procedures, including chip data normalization, and on other sophisticated statistical techniques. Their accuracy may also drop significantly when the sample size is small. We develop a new genotype-calling method for Affymetrix 100K and 500K SNP chips. A two-stage classification scheme is proposed to obtain a fast genotype-calling algorithm. The first stage uses unsupervised classification to quickly discriminate genotypes with high accuracy for the majority of the SNPs. The second stage employs a supervised classification method that incorporates allele frequency information, obtained either from the HapMap data or from a self-training scheme. A confidence score is provided for every genotype call. The overall performance is shown to be comparable to that of CRLMM, as verified against the gold-standard HapMap data, and is superior in small-sample cases. The new algorithm is computationally simple and standalone, in the sense that a self-training scheme can be used without any other training data. A package implementing the calling algorithm is freely available at.
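The abstract does not specify the classifiers used in either stage, so the following is only a minimal Python sketch of what a two-stage scheme with a per-call confidence score could look like, not the authors' algorithm. Everything in it is an assumption made for illustration: the allele contrast feature, 1-D k-means as the unsupervised stage, Hardy-Weinberg genotype priors derived from an allele frequency as the supervised stage, the fixed Gaussian width sigma, and all function names and sample intensities.

```python
import math

GENOTYPES = ("AA", "AB", "BB")


def contrast(a, b):
    """Allele contrast in [-1, 1]: near +1 is mostly A signal, near -1 mostly B."""
    return (a - b) / (a + b)


def stage1_cluster(values, centers=(0.8, 0.0, -0.8), iters=20):
    """Unsupervised stage (assumed): 1-D k-means with three clusters."""
    centers = list(centers)
    for _ in range(iters):
        groups = [[], [], []]
        for v in values:
            k = min(range(3), key=lambda i: abs(v - centers[i]))
            groups[k].append(v)
        # Keep the previous center when a cluster is empty (e.g. a rare genotype).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


def hardy_weinberg_priors(freq_a):
    """Genotype priors from the A-allele frequency, assuming Hardy-Weinberg equilibrium."""
    p, q = freq_a, 1.0 - freq_a
    return (p * p, 2 * p * q, q * q)


def stage2_call(value, centers, priors, sigma=0.15):
    """Supervised stage (assumed): Gaussian likelihood times prior.

    Returns (genotype, confidence), where confidence is the posterior
    probability of the chosen genotype under this toy model.
    """
    weights = [
        prior * math.exp(-0.5 * ((value - c) / sigma) ** 2)
        for c, prior in zip(centers, priors)
    ]
    total = sum(weights)
    if total == 0.0:
        return None, 0.0  # no call: the value is far from every cluster
    best = max(range(3), key=lambda i: weights[i])
    return GENOTYPES[best], weights[best] / total


# Made-up (A, B) probe intensities for one SNP across six samples.
intensities = [(950, 60), (900, 80), (500, 480), (520, 510), (70, 940), (60, 900)]
values = [contrast(a, b) for a, b in intensities]
centers = stage1_cluster(values)
priors = hardy_weinberg_priors(0.5)  # hypothetical allele frequency
calls = [stage2_call(v, centers, priors) for v in values]
for (a, b), (genotype, confidence) in zip(intensities, calls):
    print(a, b, genotype, round(confidence, 3))
```

In this sketch the allele frequency is passed in directly. In a self-training variant it could instead be estimated from the stage-one cluster sizes, which is one way to read the "standalone" property described in the abstract.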
