Scalable Data Analysis in Proteomics and Metabolomics Using BioContainers and Workflows Engines

Journal: Proteomics


Abstract

Recent improvements in mass spectrometry instruments and new analytical methods are increasing the intersection between proteomics and big data science. In addition, bioinformatics analysis is becoming increasingly complex, involving multiple algorithms and tools. A wide variety of methods and software tools have been developed for computational proteomics and metabolomics in recent years, and this trend is likely to continue. However, most computational proteomics and metabolomics tools are designed as single‐tiered software applications in which the analytics tasks cannot be distributed, limiting the scalability and reproducibility of the data analysis. In this paper, the key steps of metabolomics and proteomics data processing, including the main tools and software used to perform the data analysis, are summarized. The combination of software containers with workflow environments for large‐scale metabolomics and proteomics analysis is discussed. Finally, a new approach for reproducible and large‐scale data analysis, based on BioContainers and two of the most popular workflow environments, Galaxy and Nextflow, is introduced to the proteomics and metabolomics communities.
