Journal: ACM transactions on intelligent systems

Estimating a Ranked List of Human Genetic Diseases by Associating Phenotype-Gene with Gene-Disease Bipartite Graphs

Abstract

With vast amounts of medical knowledge available on the Internet, it is becoming increasingly practical to support doctors in clinical diagnostics by suggesting plausible diseases predicted with data and text mining technologies. Recently, Genome-Wide Association Studies (GWAS) have proved useful as a method for exploring phenotypic associations with diseases. However, genetic diseases are difficult to diagnose because of their low prevalence, large number, and broad diversity of symptoms, so patients with genetic diseases are often misdiagnosed or experience long diagnostic delays. In this article, we propose a method for ranking genetic diseases for a given set of clinical phenotypes. We associate a phenotype-gene bipartite graph (PGBG) with a gene-disease bipartite graph (GDBG) to produce a phenotype-disease bipartite graph (PDBG), and we estimate the candidate weights of diseases. In our approach, all paths from a phenotype to a disease through causative genes are explored, a weight is assigned based on path frequency, and the phenotype is linked to the disease in the new PDBG. We apply the Bidirectionally induced Importance Weight (BIW) prediction method to the PDBG to approximate the weights of the edges between diseases and phenotypes by considering link information from both sides of the bipartite graph. The performance of our system is compared with that of other known related systems using Normalized Discounted Cumulative Gain (NDCG), Mean Average Precision (MAP), and Kendall's tau metrics. Further experiments are conducted with the well-known TF.IDF, BM25, and Jensen-Shannon divergence as baselines. The results show that our proposed method outperforms the known related tool Phenomizer in terms of NDCG@10, NDCG@20, MAP@10, and MAP@20; however, it performs worse than Phenomizer in terms of the Kendall's tau-b metric at the top-10 ranks. Our proposed method also performs better overall than the baseline methods.
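
The graph composition described in the abstract can be illustrated with a minimal Python sketch. It counts phenotype-to-gene-to-disease paths to build PDBG edge weights, ranks diseases for a query set of phenotypes, and scores a ranking with NDCG@k. The abstract does not give the BIW formula, so the ranking here simply sums raw path counts as a stand-in; this is an assumption, not the authors' method. The function names (`compose_pdbg`, `rank_diseases`, `ndcg_at_k`) and the toy identifiers (`P1`, `G1`, `D1`, ...) are hypothetical.

```python
from collections import defaultdict
from math import log2


def compose_pdbg(pg_edges, gd_edges):
    """Count phenotype -> gene -> disease paths for each phenotype-disease pair.

    pg_edges: iterable of (phenotype, gene) pairs (the PGBG).
    gd_edges: iterable of (gene, disease) pairs (the GDBG).
    Returns a dict mapping (phenotype, disease) to its path count.
    """
    diseases_by_gene = defaultdict(set)
    for gene, disease in gd_edges:
        diseases_by_gene[gene].add(disease)

    path_counts = defaultdict(int)
    for phenotype, gene in set(pg_edges):  # set() ignores duplicate edges
        for disease in diseases_by_gene.get(gene, ()):
            path_counts[(phenotype, disease)] += 1
    return dict(path_counts)


def rank_diseases(path_counts, query_phenotypes):
    """Rank diseases by summed path counts over the query phenotypes.

    Assumption: raw path frequency is used as the score; the BIW
    weighting named in the abstract is NOT implemented here.
    """
    query = set(query_phenotypes)
    scores = defaultdict(int)
    for (phenotype, disease), count in path_counts.items():
        if phenotype in query:
            scores[disease] += count
    # Highest score first; ties broken by disease name for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def ndcg_at_k(ranked_diseases, relevance, k):
    """NDCG@k with gain rel_i / log2(i + 1) for 1-based rank i."""
    gains = [relevance.get(d, 0) for d in ranked_diseases[:k]]
    dcg = sum(g / log2(i + 2) for i, g in enumerate(gains))
    ideal = sorted(relevance.values(), reverse=True)[:k]
    idcg = sum(g / log2(i + 2) for i, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    pg = [("P1", "G1"), ("P1", "G2"), ("P2", "G2"), ("P2", "G3")]
    gd = [("G1", "D1"), ("G2", "D1"), ("G2", "D2"), ("G3", "D3")]
    pdbg = compose_pdbg(pg, gd)
    ranking = rank_diseases(pdbg, ["P1", "P2"])
    print(ranking)
    print(ndcg_at_k([d for d, _ in ranking], {"D1": 1}, k=10))
```

Storing the PDBG as a dictionary keyed by (phenotype, disease) keeps the path-frequency weights easy to inspect; a sparse matrix product of the two biadjacency matrices would compute the same counts at larger scale.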
