Journal: Computational Biology and Bioinformatics, IEEE/ACM Transactions on

Integrative Data Analysis of Multi-Platform Cancer Data with a Multimodal Deep Learning Approach


Abstract

Identification of cancer subtypes plays an important role in revealing useful insights into disease pathogenesis and advancing personalized therapy. The recent development of high-throughput sequencing technologies has enabled the rapid collection of multi-platform genomic data (e.g., gene expression, miRNA expression, and DNA methylation) for the same set of tumor samples. Although numerous integrative clustering approaches have been developed to analyze cancer data, few of them are particularly designed to exploit both deep intrinsic statistical properties of each input modality and complex cross-modality correlations among multi-platform input data. In this paper, we propose a new machine learning model, called multimodal deep belief network (DBN), to cluster cancer patients from multi-platform observation data. In our integrative clustering framework, relationships among inherent features of each single modality are first encoded into multiple layers of hidden variables, and then a joint latent model is employed to fuse common features derived from multiple input modalities. A practical learning algorithm, called contrastive divergence (CD), is applied to infer the parameters of our multimodal DBN model in an unsupervised manner. Tests on two available cancer datasets show that our integrative data analysis approach can effectively extract a unified representation of latent features to capture both intra- and cross-modality correlations, and identify meaningful disease subtypes from multi-platform cancer data. In addition, our approach can identify key genes and miRNAs that may play distinct roles in the pathogenesis of different cancer subtypes. Among those key miRNAs, we found that the expression level of miR-29a is highly correlated with survival time in ovarian cancer patients. 
These results indicate that our multimodal DBN-based data analysis approach may have practical applications in cancer pathogenesis studies and provide useful guidelines for personalized cancer therapy.
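The pipeline the abstract describes (per-modality hidden layers, a joint latent layer, and training by contrastive divergence) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes binary inputs, uses a single hidden layer per modality instead of a deep stack, and applies one-step contrastive divergence (CD-1) with full-batch updates. The names `RBM`, `train`, and `fit_multimodal`, as well as all layer sizes and learning rates, are hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class RBM:
    """Binary restricted Boltzmann machine trained with CD-1 (illustrative only)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible biases
        self.c = np.zeros(n_hidden)   # hidden biases
        self.rng = rng

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.c)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b)

    def cd1_step(self, v0, lr=0.05):
        # Positive phase: hidden activations driven by the data.
        h0 = self.hidden_probs(v0)
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        # Negative phase: one Gibbs step gives a reconstruction.
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        # CD-1 update: data statistics minus reconstruction statistics.
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b += lr * (v0 - v1).mean(axis=0)
        self.c += lr * (h0 - h1).mean(axis=0)
        return float(np.mean((v0 - v1) ** 2))  # reconstruction error


def train(rbm, data, epochs=50):
    err = None
    for _ in range(epochs):
        err = rbm.cd1_step(data)
    return err


def fit_multimodal(modalities, n_hidden=8, n_joint=4, seed=0):
    """Train one RBM per modality, then a joint RBM on the concatenated features."""
    rng = np.random.default_rng(seed)
    feats = []
    for x in modalities:
        rbm = RBM(x.shape[1], n_hidden, rng)
        train(rbm, x)
        feats.append(rbm.hidden_probs(x))
    # Assumption: hidden probabilities in [0, 1] are fed to the joint layer as inputs.
    joint_in = np.concatenate(feats, axis=1)
    joint = RBM(joint_in.shape[1], n_joint, rng)
    train(joint, joint_in)
    return joint.hidden_probs(joint_in)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Toy stand-ins for two binarized modalities measured on the same 20 samples.
    expr = (rng.random((20, 12)) < 0.5).astype(float)
    meth = (rng.random((20, 6)) < 0.5).astype(float)
    z = fit_multimodal([expr, meth])
    print(z.shape)
```

The rows of the returned matrix are a unified representation per sample; a clustering step of the reader's choice could then be run on those rows to group patients, which is left out here because the abstract does not specify it.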
