Frontiers in Genetics

Rare Variants Imputation in Admixed Populations: Comparison Across Reference Panels and Bioinformatics Tools



Abstract

Background: Imputation has become a standard approach in genome-wide association studies (GWAS) to infer untyped markers in silico. Although the feasibility of imputing common variants is well established, we aimed to assess the imputation of rare and ultra-rare variants in an admixed Caribbean Hispanic (CH) population.

Methods: We evaluated imputation accuracy in CH (N = 1,000), focusing on rare (0.1% ≤ minor allele frequency (MAF) ≤ 1%) and ultra-rare (MAF < 0.1%) variants. We used two reference panels, the Haplotype Reference Consortium (HRC; N = 27,165) and the 1000 Genomes Project (1000G phase 3; N = 2,504), together with multiple phasing algorithms (SHAPEIT, Eagle2) and imputation algorithms (IMPUTE2, MaCH-Admix). To assess imputation quality, we reported: (a) high-quality variant counts according to the imputation tools' internal quality indices (e.g., IMPUTE2 "Info" ≥ 80%); (b) a Wilcoxon signed-rank test comparing imputation quality for genotyped variants that were masked and imputed; (c) Cohen's kappa coefficient to test agreement between imputed and whole-exome sequencing (WES) variants; (d) imputation of the G206A mutation in PSEN1 (ultra-rare in the general population and more frequent in CH), followed by confirmation genotyping. We also tested ancestry proportions (European, African, and Native American) against WES-imputation mismatches using Poisson regression.

Results: SHAPEIT2 retrieved a higher percentage of imputed high-quality variants than Eagle2 (rare: 51.02% vs. 48.60%; ultra-rare: 0.66% vs. 0.65%; Wilcoxon p-value < 0.001). SHAPEIT-IMPUTE2 employing HRC outperformed 1000G (64.50% vs. 59.17% and 1.69% vs. 0.75% for high-quality rare and ultra-rare variants, respectively; Wilcoxon p-value < 0.001). SHAPEIT-IMPUTE2 outperformed MaCH-Admix. Compared to 1000G, HRC imputation retrieved a higher number of high-quality rare and ultra-rare variants, despite showing lower agreement between imputed and WES variants (e.g., rare: 98.86% for HRC vs. 99.02% for 1000G). High kappa (K = 0.99) was observed for both reference panels. Twelve G206A mutation carriers were imputed, and all were validated by confirmation genotyping. African ancestry was associated with higher imputation errors for uncommon and rare variants (p-value < 1e-05).

Conclusion: Reference panels with larger numbers of haplotypes can improve imputation quality for rare and ultra-rare variants in admixed populations such as CH. Ethnic composition is an important predictor of imputation accuracy, with higher African ancestry associated with poorer imputation accuracy.
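
As an illustration of two quantities named in the Methods, the MAF bins and Cohen's kappa between imputed and WES genotype calls, the following is a minimal Python sketch. It is not the authors' pipeline: the function names `maf_class` and `cohen_kappa`, the 0/1/2 genotype coding, and the toy genotype vectors are assumptions made only for this example.

```python
from collections import Counter


def maf_class(maf):
    """Bin a minor allele frequency (as a fraction) using the abstract's thresholds."""
    if maf < 0.001:          # MAF < 0.1%
        return "ultra-rare"
    if maf <= 0.01:          # 0.1% <= MAF <= 1%
        return "rare"
    return "other"           # everything above 1% (not subdivided here)


def cohen_kappa(calls_a, calls_b):
    """Cohen's kappa for two equal-length sequences of categorical calls."""
    if len(calls_a) != len(calls_b) or len(calls_a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(calls_a)
    # Observed agreement: fraction of positions where both calls match.
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    # Chance agreement: product of marginal label frequencies, summed over labels.
    count_a, count_b = Counter(calls_a), Counter(calls_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        # Kappa is undefined when both sources use one identical label;
        # returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical genotype calls (0/1/2 = copies of the minor allele), toy data only.
imputed = [0, 0, 1, 0, 2, 0, 0, 1, 0, 0]
wes = [0, 0, 1, 0, 2, 0, 1, 1, 0, 0]
kappa = cohen_kappa(imputed, wes)
print(maf_class(0.0005), maf_class(0.005), round(kappa, 3))
```

Kappa corrects raw concordance for agreement expected by chance, which matters for rare variants because most individuals are homozygous for the major allele and raw concordance is therefore high by default.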
