
Learning in Glaucoma Genetic Risk Assessment


Abstract

Genome-wide association (GWA) studies are powerful tools for identifying genes involved in common human diseases, and they are becoming increasingly important in genetic epidemiology research. However, the statistical approaches behind GWA studies cannot account for possible interactions among genetic markers, and true disease variants may be lost in statistical noise because of the stringent significance threshold. A typical GWA study reports a few highly suspected signals, e.g. single-nucleotide polymorphisms (SNPs), which usually account for only a tiny portion of the overall genetic risk for the disease of interest. This study proposes combining a computational learning approach with parametric statistical methods and a filtering mechanism to build a genetic risk assessment model for glaucoma. Our data set was obtained from the Singapore Malay Eye Study (SiMES) and genotyped on Illumina 610quad arrays. We constructed a case-control data set with 233 glaucoma cases and 458 healthy controls. A standard case-control association test was conducted on the post-QC data set, which contained more than 500k SNPs. A genetic profile was constructed from the genotype information of 412 SNPs that passed a relaxed p-value threshold of 1×10⁻³, and this profile forms the feature space for learning. Among the five learning algorithms we evaluated, Support Vector Machines with a radial kernel (SVM-radial) achieved the best result, with an area under the ROC curve of 99.4% and an accuracy of 95.9%. The result illustrates that a learning approach to post-GWAS data analysis is able to accurately assess genetic risk for glaucoma. The approach is more robust and comprehensive than matching individual SNPs. We will further validate our results on several other data sets from subsequent population studies conducted in Singapore.
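The filtering step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which association test was used, so the allelic 1-df chi-square test below is an assumption, as are the function names, the 0/1/2 minor-allele coding, and the toy SNP identifiers; only the 1×10⁻³ threshold comes from the text. The SNPs returned by the filter would then serve as the feature space for a classifier such as an SVM with a radial kernel.

```python
import math

def allelic_chi2_pvalue(case_genotypes, control_genotypes):
    """P-value of an allelic 1-df chi-square test for a single SNP.

    Genotypes are assumed to be minor-allele counts (0, 1 or 2) per sample.
    """
    a = sum(case_genotypes)                  # minor alleles in cases
    b = 2 * len(case_genotypes) - a          # major alleles in cases
    c = sum(control_genotypes)               # minor alleles in controls
    d = 2 * len(control_genotypes) - c       # major alleles in controls
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0  # monomorphic SNP or empty group: no evidence of association
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))

def filter_snps(cases_by_snp, controls_by_snp, threshold=1e-3):
    """Return the sorted SNP ids whose p-value falls below the threshold."""
    return sorted(
        snp for snp in cases_by_snp
        if allelic_chi2_pvalue(cases_by_snp[snp], controls_by_snp[snp]) < threshold
    )

# Toy data with hypothetical SNP ids (not from the study).
toy_cases = {"snp_a": [2] * 20, "snp_b": [0, 1, 2, 1] * 5}
toy_controls = {"snp_a": [0] * 20, "snp_b": [0, 1, 2, 1] * 5}
selected = filter_snps(toy_cases, toy_controls)
print(selected)
```

In this toy example `snp_a` differs completely between cases and controls and passes the filter, while `snp_b` has identical allele counts in both groups and is dropped.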
