Evaluating feature-selection stability in next-generation proteomics

Wilson Wen Bin Goh; Limsoon Wong

Abstract

Identifying reproducible yet relevant features is a major challenge in biological research. This problem is well documented for genomics data. Using a proposed set of three reliability benchmarks, we find that it also exists in proteomics for commonly used feature-selection methods, e.g. the t-test and recursive feature elimination. Moreover, because of high test variability, selecting the top proteins based on p-value ranks does not improve reproducibility, even when selection is restricted to high-abundance proteins. Statistical testing based on networks is believed to be more robust, but this does not always hold true: the commonly used hypergeometric enrichment, which tests for enrichment of protein subnets, performs abysmally because it depends on unstable protein pre-selection steps. We demonstrate here for the first time the utility on proteomics data of a novel suite of network-based algorithms called rank-based network algorithms (RBNAs). These were originally introduced and tested extensively on genomics data. We show here that they are highly stable and reproducible, and that they select relevant features when applied to proteomics data. It is also evident from these results that statistical feature testing on protein expression data should be carried out with due caution. Careless use of networks does not resolve poor-performance issues and can even mislead. We recommend augmenting statistical feature-selection methods with concurrent analysis of stability and reproducibility to improve the quality of the selected features prior to experimental validation.
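
The abstract does not spell out the three reliability benchmarks, so the following is only a minimal sketch of one way a stability check on t-test feature selection could look, not the authors' procedure. It assumes two groups of samples (rows) over the same proteins (columns), ranks proteins by the absolute Welch t-statistic as a stand-in for p-value rank, and reports the mean pairwise Jaccard overlap of the top-k sets across random subsamples. All function names, the toy data generator and the parameter defaults are hypothetical.

```python
import random
import statistics
from itertools import combinations


def welch_t(a, b):
    """Welch's t-statistic for two samples; |t| is used here only for ranking."""
    va, vb = statistics.variance(a), statistics.variance(b)
    se = (va / len(a) + vb / len(b)) ** 0.5
    return 0.0 if se == 0 else (statistics.mean(a) - statistics.mean(b)) / se


def top_k_features(group_a, group_b, k):
    """Return the set of column indices with the k largest |t| values."""
    n_features = len(group_a[0])
    scores = []
    for j in range(n_features):
        col_a = [row[j] for row in group_a]
        col_b = [row[j] for row in group_b]
        scores.append((abs(welch_t(col_a, col_b)), j))
    scores.sort(reverse=True)
    return {j for _, j in scores[:k]}


def jaccard(s, t):
    """Jaccard overlap of two sets (1.0 when both are empty)."""
    union = s | t
    return len(s & t) / len(union) if union else 1.0


def selection_stability(group_a, group_b, k, n_resamples=20, subsample=0.8, seed=0):
    """Mean pairwise Jaccard overlap of top-k sets over random subsamples."""
    rng = random.Random(seed)
    size_a = max(2, int(subsample * len(group_a)))
    size_b = max(2, int(subsample * len(group_b)))
    selections = [
        top_k_features(rng.sample(group_a, size_a), rng.sample(group_b, size_b), k)
        for _ in range(n_resamples)
    ]
    pairs = list(combinations(selections, 2))
    return sum(jaccard(s, t) for s, t in pairs) / len(pairs)


def make_toy_data(n_samples, n_features, n_shifted, shift, seed):
    """Synthetic data (assumption): the first n_shifted columns differ by `shift`."""
    rng = random.Random(seed)
    a = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
    b = [
        [rng.gauss(shift if j < n_shifted else 0, 1) for j in range(n_features)]
        for _ in range(n_samples)
    ]
    return a, b


if __name__ == "__main__":
    # Compare a strong and a weak group difference on the same toy layout.
    for shift in (10.0, 0.5):
        a, b = make_toy_data(10, 50, 5, shift, seed=1)
        print(shift, selection_stability(a, b, k=5))
```

A score near 1 means the same proteins are selected regardless of which samples are drawn; a score near 0 means the selected set changes with the subsample, which is the kind of instability the abstract describes for p-value-ranked selection.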


Bibliographic details

Source
Journal of Bioinformatics and Computational Biology | 2016, Issue 5 | 23 pages
Authors
Wilson Wen Bin Goh; Limsoon Wong
Affiliation

School of Pharmaceutical Science and Technology, Tianjin University, 92 Weijin Road, Tianjin 300072, China

Indexing information
Original format: PDF
Language: English
Chinese Library Classification: Cell biology
Keywords
Proteomics; networks; biostatistics; translational research

