BMC Medical Genomics

An algorithm for classifying tumors based on genomic aberrations and selecting representative tumor models


Abstract

Background Cancer is a heterogeneous disease caused by genomic aberrations and characterized by significant variability in clinical outcomes and response to therapies. Several subtypes of common cancers have been identified based on alterations of individual cancer genes, such as HER2, EGFR, and others. However, cancer is a complex disease driven by the interaction of multiple genes, so the copy number status of individual genes is not sufficient to define cancer subtypes and predict responses to treatments. A classification based on genome-wide copy number patterns would be better suited for this purpose.
Method To develop a more comprehensive cancer taxonomy based on genome-wide patterns of copy number abnormalities, we designed an unsupervised classification algorithm that identifies genomic subgroups of tumors. This algorithm is based on a modified genomic Non-negative Matrix Factorization (gNMF) algorithm and includes several additional components, namely a pilot hierarchical clustering procedure to determine the number of clusters, a multiple random initiation scheme, a new stop criterion for the core gNMF, as well as a 10-fold cross-validation stability test for quality assessment.
Result We applied our algorithm to identify genomic subgroups of three major cancer types: non-small cell lung carcinoma (NSCLC), colorectal cancer (CRC), and malignant melanoma. High-density SNP array datasets for patient tumors and established cell lines were used to define genomic subclasses of the diseases and identify cell lines representative of each genomic subtype. The algorithm was compared with several traditional clustering methods and showed improved performance. To validate our genomic taxonomy of NSCLC, we correlated the genomic classification with disease outcomes. Overall survival time and time to recurrence were shown to differ significantly between the genomic subtypes.
Conclusions We developed an algorithm for cancer classification based on genome-wide patterns of copy number aberrations and demonstrated its superiority to existing clustering methods. The algorithm was applied to define genomic subgroups of three cancer types and identify cell lines representative of these subgroups. Our data enabled the assembly of representative cell line panels for testing drug candidates.
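
The abstract names the components of the method but does not give its details. As a rough illustration only, the sketch below shows a generic non-negative matrix factorization with several random initializations, keeping the run with the lowest reconstruction error and assigning each sample to the factor with the largest coefficient. It is not the authors' modified gNMF: the update rule is the standard Lee-Seung multiplicative update for the Frobenius norm, the stop rule is a simple relative-change tolerance standing in for the paper's new stop criterion, and all function names, parameters, and defaults are hypothetical. It assumes the copy number matrix has already been made non-negative (rows are genomic segments, columns are samples). The pilot hierarchical clustering and the 10-fold stability test are not shown.

```python
import numpy as np

def nmf_multiplicative(V, k, rng, max_iter=500, tol=1e-5, eps=1e-9):
    """Factor a non-negative matrix V (segments x samples) as W @ H.

    Uses standard Lee-Seung multiplicative updates (an assumption,
    not the paper's modified gNMF).
    """
    n_segments, n_samples = V.shape
    W = rng.random((n_segments, k)) + eps
    H = rng.random((k, n_samples)) + eps
    prev_err = np.inf
    err = np.inf
    for _ in range(max_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        err = np.linalg.norm(V - W @ H)
        # Placeholder stop rule: relative change in reconstruction error.
        # The paper's own stop criterion is not described in the abstract.
        if abs(prev_err - err) <= tol * max(err, eps):
            break
        prev_err = err
    return W, H, err

def best_of_random_starts(V, k, n_starts=10, seed=0):
    """Run NMF from several random initializations and keep the best fit."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_starts):
        W, H, err = nmf_multiplicative(V, k, rng)
        if best is None or err < best[2]:
            best = (W, H, err)
    return best

def assign_clusters(H):
    """Assign each sample (column of H) to its dominant factor."""
    return np.argmax(H, axis=0)

if __name__ == "__main__":
    # Synthetic stand-in for a copy number matrix: 20 segments, 12 samples.
    demo_rng = np.random.default_rng(42)
    V_demo = demo_rng.random((20, 12))
    W_demo, H_demo, err_demo = best_of_random_starts(V_demo, k=3)
    print("cluster labels:", assign_clusters(H_demo))
    print("reconstruction error:", err_demo)
```

Because multiplicative updates only reach a local optimum, different starting points can give different factorizations, which is the usual motivation for a multiple random initiation scheme. In this sketch the number of clusters `k` is supplied by the caller, whereas the abstract describes deriving it from a pilot hierarchical clustering step.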
