Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics

Deep Collaborative Filtering for Prediction of Disease Genes


Abstract

Accurate prioritization of potential disease genes is a fundamental challenge in biomedical research, and various algorithms have been developed to address it. Inductive Matrix Completion (IMC) is one of the most reliable models, owing to its well-established framework and its superior performance in predicting gene-disease associations. However, IMC does not hierarchically extract deep features, which might limit the quality of recovery. To address this, our Deep Collaborative Filtering (DCF) model applies a deep learning architecture to the side information of genes; such an architecture obtains high-level representations and handles the noise and outliers present in large-scale biological datasets. Further, because negative examples are lacking, we also apply a Positive-Unlabeled (PU) learning formulation to low-rank matrix completion. Our approach achieves substantially improved performance over other state-of-the-art methods on diseases from the Online Mendelian Inheritance in Man (OMIM) database. It is 10 percent more efficient than standard IMC in detecting a true association, and it significantly outperforms other alternatives in terms of the precision-recall metric at the top-k predictions. Moreover, we also evaluate the method on diseases with no previously known gene associations and on newly reported OMIM associations. The experimental results show that DCF remains satisfactory for ranking genes for novel disease phenotypes as well as for mining unexplored relationships. The source code and the data are available at https://github.com/xzenglab/DCF.
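To make the two ingredients named in the abstract concrete, here is a minimal sketch of a bilinear IMC score and a PU-weighted squared loss. It is not the authors' implementation (see their repository for that). It assumes the common IMC form in which the score for gene i and disease j is x_i^T W H^T y_j, and a common PU formulation in which unlabeled entries are treated as zeros with a reduced weight. The function names, the weight `alpha`, and the toy data are all hypothetical, and the deep feature extraction step of DCF is omitted: the gene features `X` simply stand in for whatever representation that step would produce.

```python
def imc_score(x, W, H, y):
    """Bilinear IMC score x^T W H^T y for one gene-disease pair.

    x: gene feature vector (length d_g); W: d_g x k factor matrix.
    y: disease feature vector (length d_d); H: d_d x k factor matrix.
    """
    k = len(W[0])
    # Project both feature vectors into the shared rank-k latent space.
    u = [sum(x[a] * W[a][r] for a in range(len(x))) for r in range(k)]
    v = [sum(y[b] * H[b][r] for b in range(len(y))) for r in range(k)]
    return sum(ur * vr for ur, vr in zip(u, v))


def pu_loss(P, X, Y, W, H, alpha=0.2):
    """PU-weighted squared loss over the gene-disease matrix P.

    Known associations (P[i][j] == 1) get weight 1; unlabeled entries
    are treated as zeros but down-weighted by alpha (an assumed value).
    """
    total = 0.0
    for i, row in enumerate(P):
        for j, p in enumerate(row):
            weight = 1.0 if p == 1 else alpha
            total += weight * (p - imc_score(X[i], W, H, Y[j])) ** 2
    return total


def top_k_genes(P, X, Y, W, H, disease, k):
    """Rank genes with no known link to `disease` by score; return top k."""
    scored = [(imc_score(X[i], W, H, Y[disease]), i)
              for i in range(len(P)) if P[i][disease] == 0]
    scored.sort(key=lambda t: -t[0])
    return [i for _, i in scored[:k]]


# Toy data (hypothetical): 3 genes, 2 diseases, rank-1 factors.
X = [[1, 0], [0, 1], [1, 1]]    # gene side information
Y = [[1, 0], [0, 1]]            # disease side information
W = [[1], [0]]                  # gene factor matrix (2 x 1)
H = [[1], [0]]                  # disease factor matrix (2 x 1)
P = [[1, 0], [0, 0], [0, 0]]    # one known association: gene 0, disease 0

print(pu_loss(P, X, Y, W, H, alpha=0.5))
print(top_k_genes(P, X, Y, W, H, disease=0, k=1))
```

In this toy setup, gene 2 shares a feature with gene 0, so it scores highest among the unlabeled genes for disease 0. The reduced weight on unlabeled entries keeps the loss from penalizing that prediction as heavily as it would penalize a missed known association.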
