Pacific Symposium on Biocomputing 2001, Jan 3-7, 2001, Mauna Lani, Hawaii

ASSESSMENT AND MANAGEMENT OF SINGLE NUCLEOTIDE POLYMORPHISM GENOTYPE ERRORS IN GENETIC ASSOCIATION ANALYSIS


Abstract

Single nucleotide polymorphisms (SNPs) may be used in case-control designs to test for association between a marker (the SNP) and a disease. However, such designs usually assume that the genotype data are reported without error. We propose a method, the reduced penetrance model (RPM) method, that allows for errors in a case-control design, in contrast to the full penetrance model (FPM) method, which assumes that the data are error-free. The test statistic considered is Pearson's χ² applied to a 2 × 2 contingency table. Additionally, we provide a likelihood method to estimate error rates using SNP genotype data in CEPH pedigrees. We test our method (RPM) against the standard method (FPM) using simulated data. All SNP loci are assumed to have two alleles, coded 1 and 2. We consider three pairs of error rates, two different sample sizes, and two sets of allele frequencies for the SNP locus. SNP genotype data in two populations are simulated under a null hypothesis (allele frequencies equal in both populations) and under an alternative hypothesis (allele frequencies differ between the two populations). The total number of simulations is 24: 12 under the null hypothesis and 12 under the alternative. The significance level is 5%. For the null case, 9/12 (75%) of the simulations show no increase in type I error under RPM, while 3/12 (25%) show a slight increase (rejecting the null for at most 7% of the replicates). There is no increase in the type I error rate for the FPM method, a result that can also be shown analytically. For the alternative case (power), there is a consistent increase in power for the RPM method as compared with the FPM method, with an average increase of 0.02 for the simulations considered. When sample sizes are large, there is virtually no difference in power between the RPM and FPM methods. The RPM method also provides consistently more accurate allele frequency estimates for the various populations.
Our likelihood method for estimating error rates with CEPH pedigrees provides good estimates on average. The largest difference between a true error rate and our average estimated error rate is 0.006. However, there is a fair amount of variability in the estimates, suggesting the need for multiple experiments or larger numbers of CEPH pedigrees. Researchers may use the methods presented in this paper to (1) estimate error rates for their automated genotyping process, and (2) allow for such errors in association analyses, thereby increasing power to detect differences between allele frequencies in case and control populations when errors are present.
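
The kind of simulation the abstract describes for the standard (FPM-style) analysis can be illustrated with a minimal Python sketch. It is not the authors' code, and the abstract does not specify the error model. The sketch assumes alleles are drawn independently (Hardy-Weinberg proportions), that each allele call is misread as the other allele with a single symmetric probability, and that the 2 × 2 table cross-classifies population by allele count. The function names, sample size, and allele frequencies are hypothetical. The RPM method itself is not implemented here, because the abstract gives no details of it.

```python
import random

# 95th percentile of the chi-square distribution with 1 degree of freedom
CHI2_CRIT_5PCT_1DF = 3.841


def simulate_allele_counts(n_individuals, freq_allele_1, error_rate, rng):
    """Return observed (count of allele 1, count of allele 2) for one population.

    Assumption: two independent alleles per individual, and each allele call
    is flipped (1 <-> 2) with probability error_rate. This symmetric error
    model is illustrative only.
    """
    count_1 = 0
    for _ in range(2 * n_individuals):
        allele = 1 if rng.random() < freq_allele_1 else 2
        if rng.random() < error_rate:
            allele = 3 - allele  # misread allele 1 as 2, or 2 as 1
        if allele == 1:
            count_1 += 1
    return count_1, 2 * n_individuals - count_1


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: an empty row or column
    return n * (a * d - b * c) ** 2 / denom


def rejection_rate(n_individuals, freq_cases, freq_controls, error_rate,
                   replicates=1000, seed=0):
    """Fraction of replicates in which the test rejects at the 5% level."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(replicates):
        a, b = simulate_allele_counts(n_individuals, freq_cases, error_rate, rng)
        c, d = simulate_allele_counts(n_individuals, freq_controls, error_rate, rng)
        if pearson_chi2_2x2(a, b, c, d) > CHI2_CRIT_5PCT_1DF:
            rejections += 1
    return rejections / replicates
```

Calling `rejection_rate(200, 0.3, 0.3, 0.05)` gives an empirical type I error rate under the null, and `rejection_rate(200, 0.3, 0.5, 0.05)` gives an empirical power under one alternative. Comparing the latter with `error_rate=0.0` shows how much power the uncorrected test loses under this assumed error model.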
