Journal: 《生物学杂志》 (Journal of Biology)

Application of Spline-Transformation Partial Least Squares to the Classification of Liver Cancer Data

         

Abstract

Liver cancer is one of the most common malignant tumors in China. The analysis of tumor gene expression profile data is a current research focus and is of great importance for the early diagnosis and treatment of cancer. High-dimensional, small-sample gene expression profile data exhibit severe collinearity among variables and a nonlinear relationship between the class variable and the predictor variables. To address both problems, a partial least squares (PLS) regression technique based on spline transformation (SPLINE-PLS) was adopted. First, redundant information in the gene expression profile data was removed with a filter method. Next, a cubic B-spline basis transformation was used to reconstruct the nonlinear gene expression data in a linearized form. The reconstructed matrix was then passed to PLS to build a model relating the class variable to the predictor variables. Finally, analysis of a hepatocellular carcinoma (HCC) gene expression data set showed that the classification model reconstructs the data robustly, effectively mitigates the overfitting and collinearity that arise in high-dimensional, small-sample gene expression data, and achieves high fitting and classification accuracy.
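The three steps described in the abstract (filtering, cubic B-spline expansion, PLS) can be sketched roughly as follows. This is a minimal illustration in Python with NumPy, not the authors' implementation. The abstract does not state the filter criterion, the knot placement, or the number of PLS components, so the Welch t-score filter, the equally spaced clamped knots, the three components, the 0.5 decision threshold, the synthetic data, and all function names below are assumptions made for the example.

```python
import numpy as np

def filter_genes(X, y, k):
    """Filter step (assumed criterion): keep the k genes with the largest |Welch t|."""
    a, b = X[y == 1], X[y == 0]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    score = np.abs(a.mean(axis=0) - b.mean(axis=0)) / (se + 1e-12)
    return np.sort(np.argsort(score)[-k:])

def make_knots(col, n_segments=3, degree=3):
    """Clamped knot vector with equally spaced knots over one gene's observed range."""
    lo, hi = float(col.min()), float(col.max())
    if hi <= lo:          # guard against a constant column
        hi = lo + 1.0
    return np.concatenate([[lo] * degree, np.linspace(lo, hi, n_segments + 1), [hi] * degree])

def bspline_basis(x, knots, degree=3):
    """Evaluate all B-spline basis functions at x with the Cox-de Boor recursion."""
    # Clip so that every point falls in some half-open knot interval [t_i, t_{i+1}).
    x = np.clip(np.asarray(x, dtype=float), knots[0], np.nextafter(knots[-1], knots[0]))
    B = np.array([(knots[i] <= x) & (x < knots[i + 1])
                  for i in range(len(knots) - 1)], dtype=float).T
    for d in range(1, degree + 1):
        cols = []
        for i in range(len(knots) - 1 - d):
            left = knots[i + d] - knots[i]
            right = knots[i + d + 1] - knots[i + 1]
            term = np.zeros_like(x)
            if left > 0:
                term += (x - knots[i]) / left * B[:, i]
            if right > 0:
                term += (knots[i + d + 1] - x) / right * B[:, i + 1]
            cols.append(term)
        B = np.array(cols).T
    return B

def spline_expand(X, knot_list, degree=3):
    """Replace each gene column by its block of B-spline basis columns."""
    return np.hstack([bspline_basis(X[:, j], t, degree) for j, t in enumerate(knot_list)])

def pls1_fit(Z, y, n_components):
    """PLS regression for a single response (NIPALS-style deflation)."""
    z_mean, y_mean = Z.mean(axis=0), y.mean()
    E, f = Z - z_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f
        w /= np.linalg.norm(w)      # weight vector
        t = E @ w                   # score vector
        tt = t @ t
        p = E.T @ t / tt            # loading of Z on the score
        c = f @ t / tt              # regression of y on the score
        E = E - np.outer(t, p)      # deflate predictors
        f = f - c * t               # deflate response
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)
    return beta, z_mean, y_mean

def pls1_predict(Z, beta, z_mean, y_mean):
    return (Z - z_mean) @ beta + y_mean

# Synthetic stand-in for expression data: 60 samples, 200 genes, two of which
# drive the class label through a nonlinear (tanh) relationship.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 200))
signal = np.tanh(2 * X[:, 0]) + np.tanh(2 * X[:, 1])
y = (signal + 0.3 * rng.normal(size=60) > 0).astype(int)

keep = filter_genes(X, y, k=10)                       # step 1: filter
knot_list = [make_knots(X[:, j]) for j in keep]
Z = spline_expand(X[:, keep], knot_list)              # step 2: cubic B-spline expansion
beta, z_mean, y_mean = pls1_fit(Z, y.astype(float), n_components=3)  # step 3: PLS
pred = (pls1_predict(Z, beta, z_mean, y_mean) > 0.5).astype(int)
train_acc = (pred == y).mean()
print(f"training accuracy: {train_acc:.2f}")
```

Each gene's block of basis columns sums to one, so the expanded matrix is collinear by construction; PLS tolerates this because it regresses on a few orthogonal score vectors rather than inverting the full covariance matrix. The accuracy printed here is measured on the training data, with genes selected on the same samples, so it is optimistic. An honest estimate would repeat the filtering, knot selection, and PLS fit inside each fold of a cross-validation.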
