Journal: BMC Veterinary Research

A high performance profile-biomarker diagnosis for mass spectral profiles


Abstract

Background: Although mass spectrometry-based proteomics shows considerable promise for diagnosing complex diseases, it remains a research field rather than a clinical routine because of limitations in diagnostic accuracy and data reproducibility. Compared with the effort devoted to improving data reproducibility, relatively little work has addressed high-performance classification of proteomic patterns.

Methods: In this study, we present a novel machine learning approach to achieve clinical-level disease diagnosis from mass spectral data. Following our local and global feature selection framework, we propose multi-resolution independent component analysis, a novel feature selection algorithm that tackles the high dimensionality of mass spectra. We also develop high-performance classifiers by embedding multi-resolution independent component analysis in linear discriminant analysis and support vector machines.

Results: Our multi-resolution independent component based support vector machines not only achieve clinical-level classification accuracy but also overcome the weakness of traditional peak-selection-based biomarker discovery. In addition to a rigorous theoretical analysis, we demonstrate our method's superiority by comparing it with nine state-of-the-art classification and regression algorithms on six heterogeneous mass spectral profiles.

Conclusions: Our work not only suggests an alternative direction, from machine learning, for moving mass spectral proteomic technologies into clinical routine by treating an input profile as a 'profile-biomarker', but also has a positive impact on large-scale 'omics' data mining. Related source code and data sets can be found at: https://sites.google.com/site/heyaumbioinformatics/home/proteomics
