Constructing gene expression based prognostic models to predict recurrence and lymph node metastasis in colon cancer.

Abstract

The main goal of this study is to identify molecular signatures that predict lymph node metastasis and recurrence in colon cancer patients. Recent advances in microarray technology have facilitated the building of accurate molecular classifiers and an in-depth understanding of disease mechanisms.

Lymph node metastasis cannot be accurately estimated by morphological assessment, and molecular markers have the potential to improve prognostic accuracy. The first part of our study presents a novel technique for identifying molecular markers that predict disease stage from microarray gene expression data. In the first step, random forests were used for variable selection and a 14-gene signature was identified. In the second step, genes without differential expression between lymph node negative and lymph node positive tumors were removed from the 14-gene signature, yielding a 9-gene signature. The lymph node status prediction accuracy of the 9-gene signature on an independent colon cancer dataset (n=17) was 82.3%. The area under the curve (AUC) obtained from time-dependent ROC curves using the 9-gene signature was 0.85 for relapse-free survival and 0.86 for overall survival. The 9-gene signature significantly stratified patients into low-risk and high-risk groups (log-rank tests, p<0.05, n=73) with distinct relapse-free survival and overall survival. These results suggest that the 9-gene signature could be used to identify lymph node metastases in patients. We further studied the 9-gene signature using correlation analysis on comparative genomic hybridization (CGH) and RNA expression datasets. The gene ITGB1 in the 9-gene signature exhibited a strong correlation between DNA copy number and gene expression. Furthermore, a genome-wide correlation analysis of the CGH and RNA data identified three or more consecutive genes with significant correlation between DNA copy number and RNA expression.
These results may help identify regulators of gene expression.

The second part of the study focused on identifying molecular signatures for patients at high risk of recurrence who would benefit from adjuvant chemotherapy. The training set (n=36) consisted of patients who remained disease-free for 5 years and patients who experienced recurrence within 5 years. The remaining patients formed the testing set (n=37). A combinatorial scheme was developed to identify gene signatures predicting colon cancer recurrence. In the first step, preprocessing discarded genes that were not differentially expressed, and missing values were imputed using the k-nearest neighbors algorithm with k=30 and k=20. Variable selection using the random forests algorithm was then applied to obtain gene subsets. In the second step, the InfoGain feature selection technique was used to drop lower-ranked genes from the gene subsets based on their association with disease outcome. This technique identified a 3-gene signature and a 5-gene signature, depending on the missing value replacement method used. Both recurrence gene signatures stratified patients into low-risk and high-risk groups (log-rank tests, p<0.05, n=73) with distinct relapse-free survival and overall survival. A recurrence prediction model built with the LWL (locally weighted learning) classifier on the 3-gene signature reached an accuracy of 91.7% on the training set (n=36). Another recurrence prediction model, built with the random tree classifier on the 5-gene signature, reached an accuracy of 83.3% on the training set (n=36). The prospective predictions obtained on the testing set using these models will be verified when follow-up information becomes available. The recurrence prediction accuracies of these gene signatures on independent colon cancer datasets ranged from 72.4% to 88.9%.
These prognostic models may help clinicians select more appropriate treatments for patients at high risk of recurrence. Across multiple datasets, the 3-gene signature achieved higher prediction accuracy than the 5-gene signature. The identified lymph node and recurrence gene signatures were also validated on rectal cancer data, where time-dependent ROC and Kaplan-Meier analyses produced significant results. These results support the conclusion that the developed prognostic models could be used to identify patients at high risk of recurrence and to estimate survival times in rectal cancer patients.
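
The preprocessing and ranking steps described in the abstract (k-nearest-neighbor imputation of missing values, then an information-gain ranking of candidate genes) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not name the software used, so the function names (`knn_impute`, `info_gain`, `rank_genes`), the averaged Euclidean distance, and the median split used to binarize expression values are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_impute(rows, k):
    """Fill None entries with the mean of that column over the k nearest rows.

    Assumption: distance is Euclidean over the columns observed in both rows,
    averaged over the number of shared columns.
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda c: c[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def info_gain(values, labels):
    """Information gain of one gene, binarized at its upper median.

    Assumption: a median split stands in for whatever discretization the
    original analysis applied to continuous expression values.
    """
    cut = sorted(values)[len(values) // 2]
    low = [lab for v, lab in zip(values, labels) if v < cut]
    high = [lab for v, lab in zip(values, labels) if v >= cut]
    n = len(labels)
    conditional = sum(len(part) / n * entropy(part)
                      for part in (low, high) if part)
    return entropy(labels) - conditional


def rank_genes(expression, labels):
    """Order gene names from highest to lowest information gain."""
    scores = {gene: info_gain(vals, labels)
              for gene, vals in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

In use, `expression` would map each gene name to its expression values across patients and `labels` would hold the recurrence outcome per patient; dropping the tail of the list returned by `rank_genes` mirrors the idea of removing lower-ranked genes from a subset.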
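
The abstract reports that each signature separated patients into low-risk and high-risk groups by log-rank test. As a rough illustration of what that test computes, the sketch below implements the standard two-group log-rank statistic in plain Python. The function name `logrank_test` and its argument layout are hypothetical, and nothing here reproduces the thesis's data or software.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value).

    times_*  : follow-up time for each patient in the group
    events_* : 1 if the event (e.g. relapse) was observed, 0 if censored
    """
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in pooled if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t, per group.
        n_a = sum(1 for u, _, g in pooled if u >= t and g == 0)
        n_b = sum(1 for u, _, g in pooled if u >= t and g == 1)
        # Events occurring exactly at time t, per group.
        d_a = sum(1 for u, e, g in pooled if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in pooled if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

Calling the function with the follow-up times and event indicators of the predicted low-risk and high-risk groups returns a statistic and a p-value; a p-value below 0.05 corresponds to the significance threshold quoted in the abstract.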

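Bibliographic record

  • Author

    Mettu, Ramakanth Reddy

  • Author affiliation

    West Virginia University

  • Degree-granting institution: West Virginia University
  • Subject: Electrical engineering
  • Degree: M.S.
  • Year: 2008
  • Pages: 140 p.
  • Total pages: 140
  • Original format: PDF
  • Language: English (eng)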
