Source: BMC Bioinformatics (U.S. National Institutes of Health literature)

Greedy feature selection for glycan chromatography data with the generalized Dirichlet distribution

Abstract

Background

Glycoproteins are involved in a diverse range of biochemical and biological processes. Changes in protein glycosylation are believed to occur in many diseases, particularly during cancer initiation and progression. The identification of biomarkers for human disease states is becoming increasingly important, as early detection is key to improving survival and recovery rates. To this end, the serum glycome has been proposed as a potential source of biomarkers for different types of cancers.

High-throughput hydrophilic interaction liquid chromatography (HILIC) technology for glycan analysis allows for the detailed quantification of the glycan content in human serum. However, the experimental data from this analysis is compositional by nature. Compositional data are subject to a constant-sum constraint, which restricts the sample space to a simplex. Statistical analysis of glycan chromatography datasets should account for their unusual mathematical properties.

As the volume of glycan HILIC data being produced increases, there is a considerable need for a framework to support appropriate statistical analysis. Proposed here is a methodology for feature selection in compositional data. The principal objective is to provide a template for the analysis of glycan chromatography data that may be used to identify potential glycan biomarkers.