Journal of mass spectrometry: JMS

Feature selection for OPLS discriminant analysis of cancer tissue lipidomics data


Abstract

Mass spectrometry-based molecular profiling can improve the differentiation between normal and cancer tissues and the detection of neoplastic transformation, which is of great importance for diagnosing a pathology, predicting its progression, and developing a treatment strategy. The aim of the present study is to evaluate tissue classification approaches based on various data sets derived from the molecular profile of organic solvent extracts of a tissue. Several feature sets are considered for orthogonal projections to latent structures discriminant analysis (OPLS-DA): all mass spectrometric peaks above a 300-count threshold, a subset of peaks selected by ranking with a support vector machine algorithm, peaks selected by a random forest algorithm, peaks whose intensities differ with statistical significance according to the Mann-Whitney U test, peaks identified as lipids, and peaks that are both identified and significantly different. The best predictive potential is obtained for the OPLS-DA model built on nonpolar glycerolipids (Q² = 0.64, area under curve [AUC] = 0.95); the second best is the OPLS-DA model with lipid peaks selected by the random forest algorithm (Q² = 0.58, AUC = 0.87). Moreover, models based on particular molecular classes are preferable from a biological point of view, as they can lead to new explanatory mechanisms of pathophysiology and enable pathway analysis. Phosphatidylethanolamines are another promising feature set for OPLS-DA modeling (Q² = 0.48, AUC = 0.86).
