Mining deeper into the proteome: Computational strategies for improving depth and breadth of coverage in high-throughput protein identification studies.

Abstract

The proteomics field is driven by the need to develop increasingly high-throughput methods for the identification and characterization of proteins. The overall goal of this research is to improve the success rate of modern high-throughput proteomics studies. The focus is on developing computational strategies for increasing the number of identifications as well as improving the ability to distinguish new forms of proteins and peptides. Several studies are presented, addressing different points in the proteomics analysis pipeline. At the most fundamental data analysis level, methods are presented for using modern machine learning algorithms to better distinguish correct from incorrect peptide identifications. These techniques have the potential to minimize the need for manual curation of results, providing a significant increase in throughput in addition to increased identification confidence.

Non-standard types of mass spectrometry data are being generated in specific contexts. Specifically, phosphoproteomics often involves the generation of MS3 spectra. These spectra alleviate problems associated with MS2 fragmentation of phosphopeptides, but utilizing the additional information they contain requires novel informatics. Several strategies for accommodating this additional information are presented. A statistical model is developed for translating the information contained in the coupling of consecutive MS2 and MS3 spectra into a more accurate peptide identification probability score. Methods for combining MS2 and MS3 data are also explored.

A newer mass spectrometry methodology useful for phosphoproteomics, termed multistage activation (MSA), has also recently been introduced. A comparative study of this and other methods is presented, aimed at determining an optimal method for generating phosphopeptide identifications, focusing not only on data analysis techniques but also on the mass spectrometry methodologies themselves.

A dataset is presented from a differential study of a human cell line infected with the dengue virus. The study explores the complementarity of different fractionation methods in generating more unique protein identifications. The final chapter discusses a statistical mixture model that utilizes relative quantification information to classify identified peptides into two categories based on their membrane topology. Finally, a comment on utilizing pI information to enrich for phosphopeptides is provided.

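Bibliographic record

  • Author

    Ulintz, Peter J.

  • Author affiliation

    University of Michigan

  • Degree-granting institution: University of Michigan
  • Subject: Biology Molecular; Biology Biostatistics; Biology Bioinformatics
  • Degree: Ph.D.
  • Year: 2008
  • Pages: 175 p.
  • Total pages: 175
  • Original format: PDF
  • Text language: eng
  • Chinese Library Classification: Molecular genetics; Biomathematical methods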